epub-parser 0.2.2 → 0.2.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 5064ac374ce37d6547455f140004939cc0fb6beb
4
- data.tar.gz: 210f062f461f4ae37f84709953e1f1f029ddba44
3
+ metadata.gz: db9a8abb6d8ece39b51e5432c33caffdbd5efc65
4
+ data.tar.gz: bd9eb66c3af2ad86c1def37514c60db5d687a26d
5
5
  SHA512:
6
- metadata.gz: 726184ddb137eaceed9774eb81bcf3ea36b9896495b247a115dbc89d12f628af2ccc7d0f8abb87baec30e8fa44c0bcd4f40b0175297f5e21922d7f3696c58628
7
- data.tar.gz: 0d5c3e0d8a4afc8410088ef77daeb59d28f37caf1a349ddfd4eeeb693d0364dce61a19e3cc69c6d79032f51fc3b24c8b20fb5303f389aa6e6f975111edf983d9
6
+ metadata.gz: ab94d4c2c5104e966d6f2ed8368491189ca5a215a687bd6fdd2a43424b02935c44c36f8b14f39514858d2ad92c283e6ae7780041339f28f63dc283e1c8f72c30
7
+ data.tar.gz: 4d8c2f4f3a8dc584386cf55a4a6be890856c466b9ae1e7c2d6dfbe4cf31362730b1cbadf89c125910f9ab86ee27ccd7776fa099efefbeee4727698e140629cac
data/.gitignore CHANGED
@@ -7,6 +7,8 @@ vendor/*
7
7
  coverage/*
8
8
  samples/*
9
9
  doc/*
10
+ html/*
10
11
  NOTE
11
12
  test/fixtures/book.epub
12
13
  *~
14
+ lib/epub/parser/cfi.tab.rb
@@ -1,4 +1,4 @@
1
1
  rvm:
2
- - "2.0.0"
3
- - "2.1.0"
4
- - "2.2.0"
2
+ - "2.1.8"
3
+ - "2.2.4"
4
+ - "2.3.0"
data/.yardopts CHANGED
@@ -11,3 +11,4 @@ docs/Navigation.markdown
11
11
  docs/Searcher.markdown
12
12
  docs/UnpackedArchive.markdown
13
13
  docs/AggregateContentsFromWeb.markdown
14
+ examples/aggregate-contents-from-web.rb
@@ -1,6 +1,18 @@
1
1
  CHANGELOG
2
2
  =========
3
3
 
4
+ 0.2.3
5
+ -----
6
+
7
+ * Change the name of physical container adapter for file system: :File -> :UnpackedDirectory
8
+ * Add `EPUB::Publication::Package::Manifest::Item#full_path`
9
+ * Make #href= acceptable String
10
+ * Implement `EPUB::CFI` and `EPUB::Parser::CFI`
11
+ * Remove [nokogumbo][] from dependencies. It ommits `head` and `body` elements
12
+ * Remove Cucumber and Cucumber features
13
+ * Add `EPUB::Publication::Package::Metadata#modified` and `EPUB::Book::Features#modified`
14
+ * Add `EPUB::Book::Features#release_identifier`
15
+
4
16
  0.2.2
5
17
  -----
6
18
 
@@ -2,6 +2,7 @@ EPUB Parser
2
2
  ===========
3
3
  [![Build Status](https://secure.travis-ci.org/KitaitiMakoto/epub-parser.png?branch=master)](http://travis-ci.org/KitaitiMakoto/epub-parser)
4
4
  [![Dependency Status](https://gemnasium.com/KitaitiMakoto/epub-parser.png)](https://gemnasium.com/KitaitiMakoto/epub-parser)
5
+ [![Gem Version](https://badge.fury.io/rb/epub-parser.svg)](http://badge.fury.io/rb/epub-parser)
5
6
 
6
7
  INSTALLATION
7
8
  -------
@@ -95,7 +96,7 @@ DOCUMENTATION
95
96
 
96
97
  Documentation is available in [homepage][].
97
98
 
98
- If you installed EPUB Parser by gem command, you can also generate documentaiton by your own([rubygems-yardoc][] gem is needed):
99
+ If you installed EPUB Parser by gem command, you can also generate documentaiton yourself([rubygems-yardoc][] gem is needed):
99
100
 
100
101
  $ gem install epub-parser
101
102
  $ gem yardoc epub-parser
@@ -150,6 +151,19 @@ If you find other gems, please tell me or request a pull request.
150
151
  RECENT CHANGES
151
152
  --------------
152
153
 
154
+ ### 0.2.3
155
+
156
+ * Change the name of physical container adapter for file system: :File -> :UnpackedDirectory
157
+ * Add `EPUB::Publication::Package::Manifest::Item#full_path`
158
+ * Make #href= acceptable String
159
+ * Implement `EPUB::CFI` and `EPUB::Parser::CFI`
160
+ * Remove [nokogumbo][] from dependencies. It ommits `head` and `body` elements
161
+ * Remove Cucumber and Cucumber features
162
+ * Add `EPUB::Publication::Package::Metadata#modified` and `EPUB::Book::Features#modified`
163
+ * Add `EPUB::Book::Features#release_identifier`
164
+
165
+ [nokogumbo]: https://github.com/rubys/nokogumbo/
166
+
153
167
  ### 0.2.2
154
168
 
155
169
  * [BUGFIX]Item#entry_name returns normalized IRI
@@ -167,16 +181,6 @@ RECENT CHANGES
167
181
 
168
182
  * Make it possible to parse file system directory as an EPUB file. See {file:docs/UnpackedArchive.markdown} for details.
169
183
 
170
- ### 0.1.9
171
-
172
- * Introduce [Nokogumbo][] for XHTML Content Documents
173
- * Stop support for Ruby 1.9
174
- * Remove `EPUB.included` method. Now including `EPUB` module empowers nothing of EPUB features. Include `EPUB::Book::Features` instead.
175
- * Add `EPUB::Searcher::XHTML::Seamless` and make it default searcher
176
- * Add `EPUB::Publication::Package::Manifest#each_nav`
177
-
178
- [nokogumbo]: https://github.com/rubys/nokogumbo/
179
-
180
184
  See {file:CHANGELOG.markdown} for older changelogs and details.
181
185
 
182
186
  TODOS
data/Rakefile CHANGED
@@ -3,22 +3,28 @@ require 'rake/clean'
3
3
  require 'rake/testtask'
4
4
  require 'yard'
5
5
  require 'rdoc/task'
6
- require 'cucumber'
7
- require 'cucumber/rake/task'
8
6
  require 'epub/parser/version'
9
7
  require 'zipruby'
10
8
 
9
+ CFI_TAB = 'lib/epub/parser/cfi.tab.rb'
10
+ CFI_Y = 'lib/epub/parser/cfi.y'
11
+ CLEAN.include(CFI_TAB)
12
+
11
13
  task :default => :test
12
14
  task :test => 'test:default'
13
15
 
16
+ file CFI_TAB do
17
+ sh "racc #{CFI_Y}"
18
+ end
19
+
14
20
  namespace :test do
15
21
  task :default => [:build, :test]
16
22
 
17
23
  desc 'Run all tests'
18
- task :all => [:build, :test, :cucumber]
24
+ task :all => [:build, :test]
19
25
 
20
26
  desc 'Build test fixture EPUB file'
21
- task :build => :clean do
27
+ task :build => [:clean, CFI_TAB] do
22
28
  input_dir = 'test/fixtures/book'
23
29
  sh "epzip #{input_dir}"
24
30
  small_file = File.read("#{input_dir}/OPS/case-sensitive.xhtml")
@@ -32,8 +38,6 @@ namespace :test do
32
38
  task.warning = true
33
39
  task.options = '--no-show-detail-immediately --verbose'
34
40
  end
35
-
36
- Cucumber::Rake::Task.new
37
41
  end
38
42
 
39
43
  task :doc => 'doc:default'
@@ -52,4 +56,5 @@ end
52
56
 
53
57
  namespace :gem do
54
58
  Bundler::GemHelper.install_tasks
59
+ task :build => [:clean, CFI_TAB]
55
60
  end
@@ -21,7 +21,7 @@ EOB
21
21
  $0 = File.basename($PROGRAM_NAME)
22
22
  include EPUB::Book::Features
23
23
  file = ARGV.shift
24
- EPUB::OCF::PhysicalContainer.adapter = :File if File.directory? file
24
+ EPUB::OCF::PhysicalContainer.adapter = :UnpackedDirectory if File.directory? file
25
25
  unless File.readable? file
26
26
  uri = URI.parse(file) rescue nil
27
27
  if uri
@@ -30,7 +30,7 @@ unless file
30
30
  abort
31
31
  end
32
32
 
33
- EPUB::OCF::PhysicalContainer.adapter = :File if File.directory? file
33
+ EPUB::OCF::PhysicalContainer.adapter = :UnpackedDirectory if File.directory? file
34
34
  unless File.readable? file
35
35
  uri = URI.parse(file) rescue nil
36
36
  if uri
@@ -21,7 +21,7 @@ EPUB Parser can treat the URI as EPUB book file path and parse contents from it
21
21
  The trick is to set {EPUB::OCF::PhysicalContainer.adapter container adapter} to {EPUB::OCF::PhysicalContainer::UnpackedURI :UnpackedURI}. It makes it possible to parse EPUB book from the web.
22
22
  Now we can play with EPUB books as always!
23
23
 
24
- As an example, I will show you a script to download all the files of specified EPUB book to local directory(source code is available in repository's examples/aggregate-contents-from-web.rb).
24
+ As an example, I will show you a script to download all the files of specified EPUB book to local directory(source code is available in repository's {file:examples/aggregate-contents-from-web.rb}).
25
25
 
26
26
  {include:file:examples/aggregate-contents-from-web.rb}
27
27
 
@@ -45,7 +45,7 @@ To load EPUB books from directory, you need specify file adapter via {EPUB::OCF:
45
45
 
46
46
  require 'epub/parser'
47
47
 
48
- EPUB::OCF::PhysicalContainer.adapter = :File
48
+ EPUB::OCF::PhysicalContainer.adapter = :UnpackedDirectory
49
49
 
50
50
  And then, directory path as EPUB path:
51
51
 
@@ -67,7 +67,7 @@ If set {EPUB::OCF::PhysicalContainer.adapter}, it is used every time EPUB Parser
67
67
  archived_book = EPUB::Parser.parse('./page-blanche.epub') # => EPUB::Book
68
68
  # From directory
69
69
  File.ftype './page-blanche' # => "directory"
70
- unpacked_book = EPUB::Parser.parse('./page-blanche', container_adapter: :File) # => EPUB::Book
70
+ unpacked_book = EPUB::Parser.parse('./page-blanche', container_adapter: :UnpackedDirectory) # => EPUB::Book
71
71
 
72
72
  Command-line tools
73
73
  ------------------
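
Note on the adapter rename: the hunks above replace `:File` with `:UnpackedDirectory` wherever a directory is parsed as an EPUB. A minimal sketch of the selection logic the command-line tools use, assuming a hypothetical helper `container_adapter_for` that is not part of epub-parser:

```ruby
# Hypothetical helper, not part of epub-parser: pick the container adapter
# name from the path type, mirroring the `if File.directory? file` guard
# in the command-line tools.
def container_adapter_for(path)
  # Directories need the renamed adapter; anything else keeps the default,
  # signalled here by nil.
  File.directory?(path) ? :UnpackedDirectory : nil
end

adapter = container_adapter_for(Dir.pwd)
puts adapter.inspect # the current working directory is a directory
```

The returned symbol is the value the diff assigns to `EPUB::OCF::PhysicalContainer.adapter`. Code that still passes `:File` needs the same rename after upgrading to 0.2.3.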
@@ -14,6 +14,7 @@ Gem::Specification.new do |s|
14
14
  s.required_ruby_version = '> 2'
15
15
 
16
16
  s.files = `git ls-files`.split("\n")
17
+ .push('lib/epub/parser/cfi.tab.rb')
17
18
  .push('test/fixtures/book/OPS/ルートファイル.opf')
18
19
  .push('test/fixtures/book/OPS/日本語.xhtml')
19
20
  .push(Dir['docs/*.md'])
@@ -37,11 +38,11 @@ Gem::Specification.new do |s|
37
38
  s.add_development_dependency 'gem-man'
38
39
  s.add_development_dependency 'ronn'
39
40
  s.add_development_dependency 'epzip'
40
- s.add_development_dependency 'aruba'
41
+ s.add_development_dependency 'racc'
42
+ s.add_development_dependency 'nokogiri-diff'
41
43
 
42
44
  s.add_runtime_dependency 'zipruby'
43
45
  s.add_runtime_dependency 'nokogiri', '~> 1.6'
44
- s.add_runtime_dependency 'nokogumbo'
45
46
  s.add_runtime_dependency 'addressable', '>= 2.3.5'
46
47
  s.add_runtime_dependency 'rchardet', '>= 1.6.1'
47
48
  end
@@ -17,7 +17,7 @@ module EPUB
17
17
  end
18
18
  end
19
19
 
20
- %w[title main_title subtitle short_title collection_title edition_title extended_title description date unique_identifier].each do |met|
20
+ %w[title main_title subtitle short_title collection_title edition_title extended_title description date unique_identifier modified].each do |met|
21
21
  define_method met do
22
22
  metadata.__send__(met)
23
23
  end
@@ -29,6 +29,10 @@ module EPUB
29
29
  end
30
30
  end
31
31
 
32
+ def release_identifier
33
+ "#{unique_identifier}@#{modified}"
34
+ end
35
+
32
36
  def container_adapter
33
37
  @adapter || OCF::PhysicalContainer.adapter
34
38
  end
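
Note on `release_identifier`: the method added above joins the unique identifier and the modification value with `@`. A minimal sketch with a stand-in struct, assuming both values stringify to their text content (`BookStub` and the sample values are hypothetical, not taken from the gem):

```ruby
# Hypothetical stand-in for an object that includes EPUB::Book::Features.
# In the gem, both values are delegated to the package metadata.
BookStub = Struct.new(:unique_identifier, :modified) do
  # Same interpolation as the method added in this diff.
  def release_identifier
    "#{unique_identifier}@#{modified}"
  end
end

book = BookStub.new('urn:uuid:1234', '2016-01-01T00:00:00Z')
puts book.release_identifier # prints "urn:uuid:1234@2016-01-01T00:00:00Z"
```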
@@ -0,0 +1,301 @@
1
+ module EPUB
2
+ module CFI
3
+ SPECIAL_CHARS = '^[](),;=' # "5E", "5B", "5D", "28", "29", "2C", "3B", "3D"
4
+ RE_ESCAPED_SPECIAL_CHARS = Regexp.escape(SPECIAL_CHARS)
5
+
6
+ class << self
7
+ def escape(string)
8
+ string.gsub(/([#{RE_ESCAPED_SPECIAL_CHARS}])/o, '^\1')
9
+ end
10
+
11
+ def unescape(string)
12
+ string.gsub(/\^([#{RE_ESCAPED_SPECIAL_CHARS}])/o, '\1')
13
+ end
14
+ end
15
+
16
+ class Location
17
+ attr_reader :paths
18
+
19
+ def initialize(paths=[])
20
+ @paths = paths
21
+ end
22
+
23
+ def initialize_copy(original)
24
+ @paths = original.paths.collect(&:dup)
25
+ end
26
+
27
+ def type
28
+ @paths.last.type
29
+ end
30
+
31
+ def <=>(other)
32
+ index = 0
33
+ other_paths = other.paths
34
+ cmp = nil
35
+ paths.each do |path|
36
+ other_path = other_paths[index]
37
+ return 1 unless other_path
38
+ cmp = path <=> other_path
39
+ break unless cmp == 0
40
+ index += 1
41
+ end
42
+
43
+ unless cmp == 0
44
+ if cmp == 1 and other_paths[index + 1]
45
+ return nil
46
+ else
47
+ return cmp
48
+ end
49
+ end
50
+
51
+ return nil if paths.last.offset && other_paths[index]
52
+
53
+ return -1 if other_paths[index]
54
+
55
+ 0
56
+ end
57
+
58
+ def to_s
59
+ paths.join('!')
60
+ end
61
+
62
+ def to_fragment
63
+ "epubcfi(#{self})"
64
+ end
65
+
66
+ def join(*other_paths)
67
+ new_paths = paths.dup
68
+ other_paths.each do |path|
69
+ new_paths << path
70
+ end
71
+ self.class.new(new_paths)
72
+ end
73
+ end
74
+
75
+ class Path
76
+ attr_reader :steps, :offset
77
+
78
+ def initialize(steps=[], offset=nil)
79
+ @steps, @offset = steps, offset
80
+ end
81
+
82
+ def initialize_copy(original)
83
+ @steps = original.steps.collect(&:dup)
84
+ @offset = original.offset.dup if original.offset
85
+ end
86
+
87
+ def to_s
88
+ @string_cache ||= (steps.join + offset.to_s).freeze
89
+ end
90
+
91
+ def to_fragment
92
+ @fragment_cache ||= "epubcfi(#{self})".freeze
93
+ end
94
+
95
+ def <=>(other)
96
+ other_steps = other.steps
97
+ index = 0
98
+ steps.each do |step|
99
+ other_step = other_steps[index]
100
+ return 1 unless other_step
101
+ cmp = step <=> other_step
102
+ return cmp unless cmp == 0
103
+ index += 1
104
+ end
105
+
106
+ return -1 if other_steps[index]
107
+
108
+ other_offset = other.offset
109
+ if offset
110
+ if other_offset
111
+ offset <=> other_offset
112
+ else
113
+ 1
114
+ end
115
+ else
116
+ if other_offset
117
+ -1
118
+ else
119
+ 0
120
+ end
121
+ end
122
+ end
123
+
124
+ def each_step_with_instruction
125
+ yield [step, nil]
126
+ local_path.each_step_with_instruction do |s, instruction|
127
+ yield [s, instruction]
128
+ end
129
+ self
130
+ end
131
+ end
132
+
133
+ class Range < ::Range
134
+ attr_accessor :parent, :start, :end
135
+
136
+ # @todo consider the case subpaths are redirected path
137
+ # @todo FIXME: too dirty
138
+ class << self
139
+ def from_parent_and_start_and_end(parent_path, start_subpath, end_subpath)
140
+ start_str = start_subpath.join
141
+ end_str = end_subpath.join
142
+
143
+ first_paths = parent_path.collect(&:dup)
144
+ if start_subpath
145
+ offset_of_first = start_subpath.last.offset
146
+ offset_of_first = offset_of_first.dup if offset_of_first
147
+ last_of_first_paths = first_paths.pop
148
+ first_paths << last_of_first_paths
149
+ last_of_first_paths.steps.concat start_subpath.shift.steps
150
+ first_paths.concat start_subpath
151
+ first_paths.last.instance_variable_set :@offset, offset_of_first
152
+ end
153
+ offset_of_last = end_subpath.last.offset
154
+ offset_of_last = offset_of_last.dup if offset_of_last
155
+ last_paths = parent_path.collect(&:dup)
156
+ last_of_last_paths = last_paths.pop
157
+ last_paths << last_of_last_paths
158
+ last_of_last_paths.steps.concat end_subpath.shift.steps
159
+ last_paths.concat end_subpath
160
+ last_paths.last.instance_variable_set :@offset, offset_of_last
161
+
162
+ first = CFI::Location.new(first_paths)
163
+ last = CFI::Location.new(last_paths)
164
+
165
+ new_range = new(first, last)
166
+
167
+ new_range.parent = Location.new(parent_path)
168
+ new_range.start = start_str
169
+ new_range.end = end_str
170
+
171
+ new_range
172
+ end
173
+ end
174
+
175
+ def to_s
176
+ @string_cache ||= (first.to_fragment + (exclude_end? ? '...' : '..') + last.to_fragment).freeze
177
+ end
178
+
179
+ def to_fragment
180
+ @fragment_cache ||= "epubcfi(#{@parent},#{@start},#{@end})".freeze
181
+ end
182
+ end
183
+
184
+ class Step
185
+ attr_reader :step, :assertion
186
+
187
+ def initialize(step, assertion=nil)
188
+ @step, @assertion = step, assertion
189
+ @string_cache = nil
190
+ end
191
+
192
+ def initialize_copy(original)
193
+ @step = original.step
194
+ @assertion = original.assertion.dup if original.assertion
195
+ end
196
+
197
+ def to_s
198
+ @string_cache ||= "/#{step}#{assertion}".freeze # need escape?
199
+ end
200
+
201
+ def <=>(other)
202
+ step <=> other.step
203
+ end
204
+ end
205
+
206
+ class IDAssertion
207
+ attr_reader :id, :parameters
208
+
209
+ def initialize(id, parameters={})
210
+ @id, @parameters = id, parameters
211
+ @string_cache = nil
212
+ end
213
+
214
+ def to_s
215
+ return @string_cache if @string_cache
216
+ @string_cache = '['
217
+ @string_cache << CFI.escape(id) if id
218
+ parameters.each_pair do |key, values|
219
+ value = values.join(',')
220
+ @string_cache << ";#{CFI.escape(key)}=#{CFI.escape(value)}"
221
+ end
222
+ @string_cache << ']'
223
+ @string_cache.freeze
224
+ end
225
+ end
226
+
227
+ class TextLocationAssertion
228
+ attr_reader :preceded, :followed, :parameters
229
+
230
+ def initialize(preceded=nil, followed=nil, parameters={})
231
+ @preceded, @followed, @parameters = preceded, followed, parameters
232
+ @string_cache = nil
233
+ end
234
+
235
+ def to_s
236
+ return @string_cache if @string_cache
237
+ @string_cache = '['
238
+ @string_cache << CFI.escape(preceded) if preceded
239
+ @string_cache << ',' << CFI.escape(followed) if followed
240
+ parameters.each_pair do |key, values|
241
+ value = values.join(',')
242
+ @string_cache << ";#{CFI.escape(key)}=#{CFI.escape(value)}"
243
+ end
244
+ @string_cache << ']'
245
+ @string_cache.freeze
246
+ end
247
+ end
248
+
249
+ class CharacterOffset
250
+ attr_reader :offset, :assertion
251
+
252
+ def initialize(offset, assertion=nil)
253
+ @offset, @assertion = offset, assertion
254
+ @string_cache = nil
255
+ end
256
+
257
+ def to_s
258
+ @string_cache ||= ":#{offset}#{assertion}".freeze # need escape?
259
+ end
260
+
261
+ def <=>(other)
262
+ offset <=> other.offset
263
+ end
264
+ end
265
+
266
+ class TemporalSpatialOffset
267
+ attr_reader :temporal, :x, :y, :assertion
268
+
269
+ def initialize(temporal=nil, x=nil, y=nil, assertion=nil)
270
+ raise RangeError, "dimension must be in 0..100 but passed #{x}" unless (0.0..100.0).cover?(x) if x
271
+ raise RangeError, "dimension must be in 0..100 but passed #{y}" unless (0.0..100.0).cover?(y) if y
272
+ warn "Assertion is passed to #{__class__} but cannot know how to handle with it: #{assertion}" if assertion
273
+ @temporal, @x, @y, @assertion = temporal, x, y, assertion
274
+ @string_cache
275
+ end
276
+
277
+ def to_s
278
+ return @string_cache if @string_cache
279
+ @string_cache = ''
280
+ @string_cache << "~#{temporal}" if temporal
281
+ @string_cache << "@#{x}:#{y}" if x or y
282
+ @string_cache.freeze
283
+ end
284
+
285
+ # @note should split the class to spatial offset and temporal-spatial offset?
286
+ def <=>(other)
287
+ return -1 if temporal.nil? and other.temporal
288
+ return 1 if temporal and other.temporal.nil?
289
+ cmp = temporal <=> other.temporal
290
+ return cmp unless cmp == 0
291
+ return -1 if y.nil? and other.y
292
+ return 1 if y and other.y.nil?
293
+ cmp = y <=> other.y
294
+ return cmp unless cmp == 0
295
+ return -1 if x.nil? and other.x
296
+ return 1 if x and other.x.nil?
297
+ cmp = x <=> other.x
298
+ end
299
+ end
300
+ end
301
+ end
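
Note on `EPUB::CFI.escape` and `EPUB::CFI.unescape`: both helpers work on the special characters `^[](),;=`, prefixing each with a circumflex or removing that prefix. A self-contained sketch that copies the two methods into a hypothetical module `CFISketch`, so it runs without the gem:

```ruby
# Hypothetical module name; the method bodies are copied from the diff above.
module CFISketch
  SPECIAL_CHARS = '^[](),;='
  RE_ESCAPED_SPECIAL_CHARS = Regexp.escape(SPECIAL_CHARS)

  # Prefix every special character with a circumflex.
  def self.escape(string)
    string.gsub(/([#{RE_ESCAPED_SPECIAL_CHARS}])/o, '^\1')
  end

  # Drop the circumflex in front of an escaped special character.
  def self.unescape(string)
    string.gsub(/\^([#{RE_ESCAPED_SPECIAL_CHARS}])/o, '\1')
  end
end

escaped = CFISketch.escape('id[1]')
puts escaped                     # prints "id^[1^]"
puts CFISketch.unescape(escaped) # prints "id[1]"
```

Unescaping an escaped string returns the original, including for the circumflex itself, which escapes to `^^`.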