epub-parser 0.2.5 → 0.2.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: ccf4841189745cdaa96e82b64a3493e4a3dc2421
- data.tar.gz: e2c83bc88d36a63957b313136372568506377ae3
+ metadata.gz: 7b9ee0e48ab95b16264f66048d2f0d040f63ab2d
+ data.tar.gz: b16bd608fc1c7c54de24a30c13ad880cd9f096de
  SHA512:
- metadata.gz: e635c138b09d454c17fd8f93b59b04403e828c73a4a0be14dc7d133501bc219a6d41a7d1108091b4bc845276ab33e7905d5ba7caed9403904cb6a71058c59dcf
- data.tar.gz: 12fe31458bfcfd89e8dd6fee555473f02370c613893ee681b2a7d9b6d2db70640bdb1403140300057016c4595976d678509ab812de070e25b63b8e6dac17ee70
+ metadata.gz: 6df7c1519d379afe93635e085ae7c3d9548ac9005a7e6cb79066db2310deca63ac8d0089100516a3657e6a16f092c3472897e5b10c227dd60b0c5db5e028f570
+ data.tar.gz: 80f34c61d4043beba1ca680db124744a1c619c126e19b097d2b1ba4ac0cb0439efdb99ebd9c183119f8d5a210bfde7408f234b4bf25678a36b4401cf842f37a6
data/.travis.yml CHANGED
@@ -1,4 +1,4 @@
  rvm:
- - "2.1.8"
- - "2.2.4"
- - "2.3.0"
+ - "2.1.10"
+ - "2.2.5"
+ - "2.3.1"
data/.yardopts CHANGED
@@ -12,3 +12,5 @@ docs/Searcher.markdown
  docs/UnpackedArchive.markdown
  docs/AggregateContentsFromWeb.markdown
  examples/aggregate-contents-from-web.rb
+ examples/exctract-content-using-cfi.rb
+ examples/find-elements-and-cfis.rb
data/CHANGELOG.markdown CHANGED
@@ -1,12 +1,25 @@
  CHANGELOG
  =========
 
+ 0.2.6
+ -----
+
+ * Add `EPUB::Publication::Package::Metadata#package_identifier` as alias of `#release_identifier`, which is defined in EPUB Publication 3.0 spec
+ * [BUG FIX]Metadata#modified returns modified with no refiners
+ * Make second argument for `EPUB::Parser::Publication.new` deprecated
+ * Add META-INF/metadata.xml support defined in [EPUB Multiple-Rendition Publications 1.0][multi-rendition]
+ * Add `EPUB::Book::Features#packages` and `#default_rendition`
+ * [BUG FIX]Don't raise error when using `Zipruby` container adapter
+
+ [multi-rendition]: http://www.idpf.org/epub/renditions/multiple/
+
  0.2.5
  -----
 
- * [BUG FIX]RaiseDon't load Zip/Ruby if unneccessary
+ * [BUG FIX]Don't load Zip/Ruby if unneccessary
  * Raise error when PhysicalContainer::ArchiveZip fails find entry
  * Remove unused files in schemas directory
+ * Add `EPUB::CFI::PhysicalContainer.find_adapter`
 
  0.2.4
  -----
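
Note on the `find_adapter` entry above: its implementation is not part of this diff, but the one-line body it replaces in `container_adapter=` (see the `Features` hunk further down) suggests the lookup rule. The following is only a sketch under that assumption; the module and the empty adapter classes are hypothetical stand-ins, not the gem's code.

```ruby
# Hypothetical namespace mirroring the shape of OCF::PhysicalContainer.
module PhysicalContainer
  class ArchiveZip; end
  class UnpackedDirectory; end

  # Accept either an adapter class or the name of a constant defined
  # under this module (assumed behaviour, taken from the removed line
  # `adapter.instance_of?(Class) ? adapter : const_get(adapter)`).
  def self.find_adapter(adapter)
    adapter.instance_of?(Class) ? adapter : const_get(adapter)
  end
end

by_name  = PhysicalContainer.find_adapter(:ArchiveZip)
by_class = PhysicalContainer.find_adapter(PhysicalContainer::UnpackedDirectory)
puts by_name
puts by_class
```

Centralising the lookup in one method means callers such as `container_adapter=` no longer need to repeat the class-or-name check.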
data/README.markdown CHANGED
@@ -135,7 +135,7 @@ REQUIREMENTS
  ------------
  * Ruby 2.1.0 or later
  * `patch` command to install Nokogiri
- * C compiler to compile Zip/Ruby and Nokogiri
+ * C compiler to compile Nokogiri
 
  Related Gems
  ------------
@@ -151,9 +151,21 @@ If you find other gems, please tell me or request a pull request.
  RECENT CHANGES
  --------------
 
+ ### 0.2.6
+
+ * Add `EPUB::Publication::Package::Metadata#package_identifier` as alias of `#release_identifier`
+ * [BUG FIX]Metadata#modified returns modified with no refiners
+ * Make second argument for `EPUB::Parser::Publication.new` deprecated
+ * Add META-INF/metadata.xml support defined in [EPUB Multiple-Rendition Publications 1.0][multi-rendition]
+ * Add `EPUB::Book::Features#packages` and `#default_rendition`
+ * [BUG FIX]Don't raise error when using `Zipruby` container adapter
+
+ [multi-rendition]: http://www.idpf.org/epub/renditions/multiple/
+
  ### 0.2.5
 
- * [BUG FIX]RaiseDon't load Zip/Ruby if unneccessary
+ * [BUG FIX]Don't load Zip/Ruby if unneccessary
+ * Add `EPUB::CFI::PhysicalContainer.find_adapter`
 
  ### 0.2.4
 
@@ -163,38 +175,11 @@ RECENT CHANGES
  * Change attribute name: `EPUB::CFI::Step#step` -> `EPUB::CFI::Step#value`, `EPUB::CFI::CharacterOffset#offset` -> `EPUB::CFI::CharacterOffset#value`
  * Show modified on `epubinfo` command
 
- ### 0.2.3
-
- * Change the name of physical container adapter for file system: :File -> :UnpackedDirectory
- * Add `EPUB::Publication::Package::Manifest::Item#full_path`
- * Make #href= acceptable String
- * Implement `EPUB::CFI` and `EPUB::Parser::CFI`
- * Remove [nokogumbo][] from dependencies. It ommits `head` and `body` elements
- * Remove Cucumber and Cucumber features
- * Add `EPUB::Publication::Package::Metadata#modified` and `EPUB::Book::Features#modified`
- * Add `EPUB::Book::Features#release_identifier`
-
- [nokogumbo]: https://github.com/rubys/nokogumbo/
-
- ### 0.2.2
-
- * [BUGFIX]Item#entry_name returns normalized IRI
-
- ### 0.2.1
-
- * Remove deprecated `EPUB::Constants::MediaType::UnsupportedError`. Use `UnsupportedMediatType` instead.
- * Make it possible to use [archive-zip][] gem to extract contents from EPUB package
- * Add warning about default physical container adapter change
- * Make it possible to extract contents from the web via `EPUB::OCF::PhysicalContainer::UnpackedURI` See {file:ExtractContentsFromWeb.markdown} for details.
-
- [archive-zip]: https://github.com/javanthropus/archive-zip
-
  See {file:CHANGELOG.markdown} for older changelogs and details.
 
  TODOS
  -----
  * EPUB 3.0.1
- * Multiple rootfiles
  * Help features for `epub-open` tool
  * Vocabulary Association Mechanisms
  * Implementing navigation document and so on
@@ -215,6 +200,7 @@ DONE
  * Vocabulary Association Mechanisms(only for itemref)
  * Archive library abstraction
  * Extracting and organizing common behavior from some classes to modules
+ * Multiple rootfiles
 
  LICENSE
  -------
data/Rakefile CHANGED
@@ -1,6 +1,6 @@
- require 'bundler/gem_helper'
  require 'rake/clean'
  require 'rake/testtask'
+ require 'rubygems/tasks'
  require 'yard'
  require 'rdoc/task'
  require 'epub/parser/version'
@@ -54,7 +54,42 @@ namespace :doc do
    end
  end
 
- namespace :gem do
-   Bundler::GemHelper.install_tasks
-   task :build => [:clean, CFI_TAB]
+ Gem::Tasks.new do |tasks|
+   tasks.console.command = 'pry'
+ end
+ task :build => [:clean, CFI_TAB]
+
+ class ForwardableDefDelegatorsHandler < YARD::Handlers::Ruby::Base
+   handles method_call(:def_delegators)
+   namespace_only
+
+   def process
+     params = validated_attribute_names(statement.parameters(false))
+     accessor = params.shift
+     params.each do |param|
+       object = YARD::CodeObjects::MethodObject.new(namespace, param)
+       object.docstring = "Forwarded to +#{accessor}+"
+     end
+   end
+
+   protected
+
+   # Strips out any non-essential arguments from the attr statement.
+   #
+   # @param [Array<Parser::Ruby::AstNode>] params a list of the parameters
+   # in the attr call.
+   # @return [Array<String>] the validated attribute names
+   # @raise [Parser::UndocumentableError] if the arguments are not valid.
+   def validated_attribute_names(params)
+     params.map do |obj|
+       case obj.type
+       when :symbol_literal
+         obj.jump(:ident, :op, :kw, :const).source
+       when :string_literal
+         obj.jump(:string_content).source
+       else
+         raise YARD::Parser::UndocumentableError, obj.source
+       end
+     end
+   end
  end
@@ -15,7 +15,7 @@ Methods for {EPUB::Publication::Package}
 
  It is `true` when `package@prefix` attribute has `rendition` property.
 
- parser = EPUB::Parser::Publication.new(<<OPF, 'dummy/rootfile.opf')
+ parser = EPUB::Parser::Publication.new(<<OPF)
  <package version="3.0"
  unique-identifier="pub-id"
  xmlns="http://www.idpf.org/2007/opf"
data/docs/Item.markdown CHANGED
@@ -4,7 +4,7 @@ Overview
  ========
 
  When manipulating resources (XHTML, images, audio...) in EPUB, {EPUB::Publication::Package::Manifest::Item} object will be used.
- And objects which {EPUB#each_page_on_spine} yields are also instances of this class.
+ And objects which {EPUB::Book::Features#each_page_on_spine} yields are also instances of this class.
 
  Here's the tutorial of this class.
 
data/epub-parser.gemspec CHANGED
@@ -27,6 +27,7 @@ Gem::Specification.new do |s|
    s.has_rdoc = 'yard'
 
    s.add_development_dependency 'rake'
+   s.add_development_dependency 'rubygems-tasks'
    s.add_development_dependency 'zipruby'
    s.add_development_dependency 'pry'
    s.add_development_dependency 'pry-doc'
@@ -40,6 +41,7 @@ Gem::Specification.new do |s|
    s.add_development_dependency 'epzip'
    s.add_development_dependency 'racc'
    s.add_development_dependency 'nokogiri-diff'
+   s.add_development_dependency 'pretty_backtrace'
 
    s.add_runtime_dependency 'archive-zip'
    s.add_runtime_dependency 'nokogiri', '~> 1.6'
@@ -0,0 +1,111 @@
+ # coding: utf-8
+ # Preparation
+ #
+ # % cd examples
+ # % wget -O accessible-epub3.epub 'https://drive.google.com/uc?export=download&id=0B9g8D2Y-6aPLRmFKRTNIam93RTQ'
+ #
+ # Execution
+ #
+ # % ruby exctract-content-using-cfi.rb accessible-epub3.epub '/6/10!/4/2/4'
+ # <p>Accessibility is a difficult concept to define. There’s no single magic bullet
+ # solution that will make all content accessible to all people. Perhaps that’s a
+ # strange way to preface a book on accessible practices, but it’s also a reality you
+ # need to be aware of. Accessible practices change, technologies evolve to solve
+ # stubborn problems, and the world becomes a more accessible place all the time.</p>
+ #
+ # % ruby exctract-content-using-cfi.rb accessible-epub3.epub '/6/10!/4/2,/4,/8'
+ # <p>Accessibility is a difficult concept to define. There’s no single magic bullet
+ # solution that will make all content accessible to all people. Perhaps that’s a
+ # strange way to preface a book on accessible practices, but it’s also a reality you
+ # need to be aware of. Accessible practices change, technologies evolve to solve
+ # stubborn problems, and the world becomes a more accessible place all the time.</p>
+ # <p xmlns="http://www.w3.org/1999/xhtml">But although there are best practices that everyone should be following, and that
+ # will be detailed as we go along, this guide should neither be read as an instrument
+ # for accessibility compliance nor as a replacement for existing guidelines.</p>
+ # <p></p>
+ #
+ # Yes, output above shows a bug!
+ #
+ # % ruby exctract-content-using-cfi.rb accessible-epub3.epub '/6/10!/4/2/4,:0,:47'
+ # Accessibility is a difficult concept to define.
+
+ require 'epub/parser'
+ require 'epub/parser/cfi'
+ require 'nokogiri' # Do gem install nokogiri
+ require 'nokogiri/xml/range' # Do gem install nokogiri-xml-range
+
+ def main(argv)
+   epub_path = argv.shift
+   cfi_string = argv.shift
+   if epub_path.nil? or cfi_string.nil?
+     $stderr.puts "USAGE: ruby #{$0} EPUB CFI"
+     abort
+   end
+
+   epub = EPUB::Parser.parse(epub_path)
+   cfi = EPUB::CFI(cfi_string)
+
+   content = extract_content(epub, cfi)
+   case content
+   when Nokogiri::XML::Element
+     puts content
+   when Nokogiri::XML::Range
+     puts content.clone_contents
+   end
+ end
+
+ def extract_content(epub, cfi)
+   if cfi.kind_of? EPUB::CFI::Location
+     node = get_element(cfi, epub)
+     offset = cfi.paths.last.offset
+     offset = offset.value if offset
+     # Maybe offset may not be used
+     return node
+   end
+
+   start_node = get_element(cfi.first, epub)
+   # Need more consideration
+   start_node = start_node.children.first if start_node.element?
+
+   end_node = get_element(cfi.last, epub)
+   # Need more consideration
+   end_node = end_node.children.last if end_node.element?
+
+   start_offset = cfi.first.paths.last.offset
+   start_offset = start_offset ? start_offset.value : 0
+   end_offset = cfi.last.paths.last.offset
+   end_offset = end_offset ? end_offset.value : 0
+
+   range = Nokogiri::XML::Range.new(start_node, start_offset, end_node, end_offset)
+
+   return range
+ end
+
+ def get_element(cfi, epub)
+   path_in_package = cfi.paths.first
+   step_to_itemref = path_in_package.steps[1]
+   itemref = epub.spine.itemrefs[step_to_itemref.step / 2 - 1]
+
+   doc = itemref.item.content_document.nokogiri
+   path_in_doc = cfi.paths[1]
+   current_node = doc.root
+   path_in_doc.steps.each do |step|
+     if step.element?
+       current_node = current_node.element_children[step.value / 2 - 1]
+     else
+       element_index = (step.value - 1) / 2 - 1
+       if element_index == -1
+         current_node = current_node.children.first
+       else
+         prev = current_node.element_children[element_index]
+         break unless prev
+         current_node = prev.next_sibling
+         break unless current_node
+       end
+     end
+   end
+
+   current_node
+ end
+
+ main(ARGV)
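
Note on the step arithmetic in `get_element` above: an even CFI step `n` selects the element child at index `n / 2 - 1`. The sketch below illustrates only that even-step case on a toy tree. `Node` and `resolve` are hypothetical names introduced for illustration; the sketch does not use Nokogiri or the gem's `EPUB::CFI` classes.

```ruby
# Hypothetical stand-in for a parsed XML element; children may be
# Node instances (elements) or plain strings (text).
Node = Struct.new(:name, :children) do
  def element_children
    children.select { |child| child.is_a?(Node) }
  end
end

# Resolve a list of even CFI steps against a tree.
# Step n selects the (n / 2)-th element child, i.e. index n / 2 - 1.
def resolve(root, steps)
  steps.reduce(root) do |node, step|
    unless step.even? && step >= 2
      raise ArgumentError, "only even element steps are handled here: #{step}"
    end
    child = node.element_children[step / 2 - 1]
    raise IndexError, "no element child for step #{step}" unless child
    child
  end
end

section = Node.new("section", [
  Node.new("h1", ["Title"]),
  Node.new("p", ["First"]),
  Node.new("p", ["Second"])
])
body = Node.new("body", ["\n", section])
html = Node.new("html", [Node.new("head", []), body])

target = resolve(html, [4, 2, 4])
puts target.name            # prints: p
puts target.children.first  # prints: First
```

Text between elements does not shift the element indices, which is why the whitespace string inside `body` is skipped when step 2 is resolved.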
@@ -0,0 +1,54 @@
+ require 'English'
+ require 'epub/parser'
+ require 'epub/parser/cfi'
+ require 'nokogiri'
+
+ def usage
+   <<EOS
+
+ USAGE:
+ ruby #{$PROGRAM_NAME} ELEMENT EPUB
+
+ EOS
+ end
+
+ def main(argv)
+   elem_name = argv.shift
+   epub_path = argv.shift
+   if elem_name.nil? or epub_path.nil?
+     abort usage
+   end
+
+   spine_step = EPUB::CFI::Step.new(6)
+
+   epub = EPUB::Parser.parse(epub_path)
+   epub.package.spine.each_itemref.with_index do |itemref, i|
+     itemref_step = {
+       :step => (i + 1) * 2,
+       :id => itemref.id
+     }
+     assertion = itemref.id ? EPUB::CFI::IDAssertion.new(itemref.id) : nil
+     itemref_step = EPUB::CFI::Step.new((i + 1) * 2, assertion)
+     path_to_itemref = EPUB::CFI::Path.new([spine_step, itemref_step])
+     itemref.item.content_document.nokogiri.search(elem_name).each do |elem|
+       path = find_path(elem)
+       location = EPUB::CFI::Location.new([path_to_itemref, path])
+       puts
+       puts location
+       puts elem
+     end
+   end
+ end
+
+ def find_path(elem)
+   steps = []
+   until elem.parent.document?
+     index = elem.parent.element_children.index(elem)
+     assertion = elem["id"] ? EPUB::CFI::IDAssertion.new(elem["id"]) : nil
+     steps.unshift EPUB::CFI::Step.new((index + 1) * 2, assertion)
+     elem = elem.parent
+   end
+   EPUB::CFI::Path.new(steps)
+ end
+
+ main ARGV
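
Note on `find_path` above: it is the inverse of step resolution, walking from an element up to the root and prepending `(index + 1) * 2` at each level. A minimal sketch of that walk follows, using a hypothetical `Elem` class with parent links in place of Nokogiri nodes and plain integers in place of `EPUB::CFI::Step` objects (ID assertions are left out).

```ruby
# Hypothetical minimal element type with parent links; text nodes are
# omitted, so every child counts as an element child.
class Elem
  attr_reader :name, :parent, :children

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []
    parent.children << self if parent
  end
end

# Walk from elem up to the root, prepending one even step per level.
# The root itself gets no step, matching the loop in find_path that
# stops when the parent is the document.
def cfi_steps(elem)
  steps = []
  while elem.parent
    index = elem.parent.children.index(elem)
    steps.unshift((index + 1) * 2)
    elem = elem.parent
  end
  steps
end

html = Elem.new("html")
Elem.new("head", html)
body = Elem.new("body", html)
Elem.new("h1", body)
para = Elem.new("p", body)

puts "/" + cfi_steps(para).join("/")  # prints: /4/4
```

Using `unshift` keeps the steps in root-to-leaf order even though the walk itself goes leaf-to-root.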
@@ -1,45 +1,57 @@
+ require 'forwardable'
+
  module EPUB
    class Book
      module Features
-       modules = [:ocf, :package]
-       attr_reader *modules
+       extend Forwardable
+       attr_reader :ocf
+       attr_writer :package
        attr_accessor :epub_file
-       modules.each do |mod|
-         define_method "#{mod}=" do |obj|
-           instance_variable_set "@#{mod}", obj
-           obj.book = self
-         end
+
+       # When writing, sets +ocf.book+ to self.
+       # @param [OCF]
+       def ocf=(mod)
+         @ocf = mod
+         mod.book = self
+         mod
        end
 
-       Publication::Package::CONTENT_MODELS.each do |model|
-         define_method model do
-           package.__send__(model)
-         end
+       # @return [Array<OCF::Container::Rootfile>]
+       def rootfiles
+         ocf.container.rootfiles
        end
 
-       %w[title main_title subtitle short_title collection_title edition_title extended_title description date unique_identifier modified].each do |met|
-         define_method met do
-           metadata.__send__(met)
-         end
+       # @return [Array<Publication::Package>]
+       def packages
+         rootfiles.map(&:package)
        end
+       alias renditions packages
 
-       %w[nav].each do |met|
-         define_method met do
-           manifest.__send__(met)
-         end
+       # Syntax sugar.
+       # Returns package set by +package=+.
+       # Returns default rendition if any package has not been set ever.
+       # @return [Publication::Package]
+       def package
+         @package || default_rendition
        end
 
-       def release_identifier
-         "#{unique_identifier}@#{modified}"
+       # First +package+ in +packages+
+       # @return [Package|nil]
+       def default_rendition
+         packages.first
        end
 
+       # @!parse def_delegators :package, :metadata, :manifest, :spine, :guide, :bindings
+       def_delegators :package, *Publication::Package::CONTENT_MODELS
+       def_delegators :metadata, :title, :main_title, :subtitle, :short_title, :collection_title, :edition_title, :extended_title, :description, :date, :unique_identifier, :modified, :release_identifier, :package_identifier
+       def_delegators :manifest, :nav, :cover_image
+
        def container_adapter
          @adapter || OCF::PhysicalContainer.adapter
        end
 
        def container_adapter=(adapter)
-         @adapter = adapter.instance_of?(Class) ? adapter : OCF::PhysicalContainer.const_get(adapter)
-         adapter
+         @adapter = OCF::PhysicalContainer.find_adapter(adapter)
        end
 
        # @overload each_page_on_spine(&blk)
@@ -84,15 +96,10 @@ module EPUB
        end
 
        # Syntax sugar
+       # @return String
        def rootfile_path
          ocf.container.rootfile.full_path.to_s
        end
-
-       # Syntax sugar
-       def cover_image
-         manifest.cover_image
-       end
-
      end
    end
  end
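
Note on the `Features` change above: the hand-written `define_method` loops are replaced by `Forwardable#def_delegators`, and `package` now falls back to the first rendition when none has been set. A minimal, self-contained sketch of that pattern follows. `Metadata`, `Package` and `Book` here are hypothetical stand-ins rather than the gem's classes, and the sample identifier and timestamp are made up for illustration.

```ruby
require 'forwardable'

# Hypothetical stand-ins for the gem's metadata and package objects.
Metadata = Struct.new(:title, :unique_identifier, :modified) do
  # Same format as the release_identifier body removed in the hunk above.
  def release_identifier
    "#{unique_identifier}@#{modified}"
  end
  alias_method :package_identifier, :release_identifier
end
Package = Struct.new(:metadata)

class Book
  extend Forwardable
  attr_writer :package

  def initialize(packages)
    @packages = packages
  end

  # An explicitly set package wins; otherwise use the first rendition.
  def package
    @package || @packages.first
  end

  # book.metadata forwards to package; book.title forwards to metadata.
  def_delegators :package, :metadata
  def_delegators :metadata, :title, :release_identifier, :package_identifier
end

default = Package.new(Metadata.new("Default", "urn:example:1", "2016-01-01T00:00:00Z"))
other   = Package.new(Metadata.new("Other", "urn:example:2", "2016-02-02T00:00:00Z"))
book = Book.new([default, other])

puts book.title               # prints: Default
puts book.package_identifier  # prints: urn:example:1@2016-01-01T00:00:00Z
book.package = other
puts book.title               # prints: Other
```

Because the delegators call the `package` method rather than reading `@package`, the fallback to the first rendition applies to every forwarded method as well.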