epub-parser 0.2.5 → 0.2.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: ccf4841189745cdaa96e82b64a3493e4a3dc2421
- data.tar.gz: e2c83bc88d36a63957b313136372568506377ae3
+ metadata.gz: 7b9ee0e48ab95b16264f66048d2f0d040f63ab2d
+ data.tar.gz: b16bd608fc1c7c54de24a30c13ad880cd9f096de
  SHA512:
- metadata.gz: e635c138b09d454c17fd8f93b59b04403e828c73a4a0be14dc7d133501bc219a6d41a7d1108091b4bc845276ab33e7905d5ba7caed9403904cb6a71058c59dcf
- data.tar.gz: 12fe31458bfcfd89e8dd6fee555473f02370c613893ee681b2a7d9b6d2db70640bdb1403140300057016c4595976d678509ab812de070e25b63b8e6dac17ee70
+ metadata.gz: 6df7c1519d379afe93635e085ae7c3d9548ac9005a7e6cb79066db2310deca63ac8d0089100516a3657e6a16f092c3472897e5b10c227dd60b0c5db5e028f570
+ data.tar.gz: 80f34c61d4043beba1ca680db124744a1c619c126e19b097d2b1ba4ac0cb0439efdb99ebd9c183119f8d5a210bfde7408f234b4bf25678a36b4401cf842f37a6
data/.travis.yml CHANGED
@@ -1,4 +1,4 @@
  rvm:
- - "2.1.8"
- - "2.2.4"
- - "2.3.0"
+ - "2.1.10"
+ - "2.2.5"
+ - "2.3.1"
data/.yardopts CHANGED
@@ -12,3 +12,5 @@ docs/Searcher.markdown
  docs/UnpackedArchive.markdown
  docs/AggregateContentsFromWeb.markdown
  examples/aggregate-contents-from-web.rb
+ examples/exctract-content-using-cfi.rb
+ examples/find-elements-and-cfis.rb
data/CHANGELOG.markdown CHANGED
@@ -1,12 +1,25 @@
  CHANGELOG
  =========

+ 0.2.6
+ -----
+
+ * Add `EPUB::Publication::Package::Metadata#package_identifier` as alias of `#release_identifier`, which is defined in EPUB Publication 3.0 spec
+ * [BUG FIX]Metadata#modified returns modified with no refiners
+ * Make second argument for `EPUB::Parser::Publication.new` deprecated
+ * Add META-INF/metadata.xml support defined in [EPUB Multiple-Rendition Publications 1.0][multi-rendition]
+ * Add `EPUB::Book::Features#packages` and `#default_rendition`
+ * [BUG FIX]Don't raise error when using `Zipruby` container adapter
+
+ [multi-rendition]: http://www.idpf.org/epub/renditions/multiple/
+
  0.2.5
  -----

- * [BUG FIX]RaiseDon't load Zip/Ruby if unneccessary
+ * [BUG FIX]Don't load Zip/Ruby if unneccessary
  * Raise error when PhysicalContainer::ArchiveZip fails find entry
  * Remove unused files in schemas directory
+ * Add `EPUB::CFI::PhysicalContainer.find_adapter`

  0.2.4
  -----
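As a reading aid for the 0.2.6 entry: the `release_identifier` body removed from `Features` in this diff builds the string `"#{unique_identifier}@#{modified}"`, and the changelog describes `#package_identifier` as an alias of it. The sketch below shows that relationship in isolation. `SketchMetadata` and the sample values are hypothetical stand-ins, not the gem's `Metadata` class, and it assumes `Metadata#release_identifier` uses the same format as the removed method.

```ruby
# Hypothetical stand-in for EPUB::Publication::Package::Metadata.
# Only the two identifier methods are modelled here.
SketchMetadata = Struct.new(:unique_identifier, :modified) do
  # Format copied from the Features#release_identifier body removed in this diff.
  def release_identifier
    "#{unique_identifier}@#{modified}"
  end

  # Second name for the same value, as the 0.2.6 changelog entry describes.
  alias_method :package_identifier, :release_identifier
end

# Sample values are made up for illustration.
metadata = SketchMetadata.new('urn:uuid:1234', '2016-05-01T00:00:00Z')
puts metadata.package_identifier
```

Because the alias points at the same method body, both names always return the same string.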
data/README.markdown CHANGED
@@ -135,7 +135,7 @@ REQUIREMENTS
  ------------
  * Ruby 2.1.0 or later
  * `patch` command to install Nokogiri
- * C compiler to compile Zip/Ruby and Nokogiri
+ * C compiler to compile Nokogiri

  Related Gems
  ------------
@@ -151,9 +151,21 @@ If you find other gems, please tell me or request a pull request.
  RECENT CHANGES
  --------------

+ ### 0.2.6
+
+ * Add `EPUB::Publication::Package::Metadata#package_identifier` as alias of `#release_identifier`
+ * [BUG FIX]Metadata#modified returns modified with no refiners
+ * Make second argument for `EPUB::Parser::Publication.new` deprecated
+ * Add META-INF/metadata.xml support defined in [EPUB Multiple-Rendition Publications 1.0][multi-rendition]
+ * Add `EPUB::Book::Features#packages` and `#default_rendition`
+ * [BUG FIX]Don't raise error when using `Zipruby` container adapter
+
+ [multi-rendition]: http://www.idpf.org/epub/renditions/multiple/
+
  ### 0.2.5

- * [BUG FIX]RaiseDon't load Zip/Ruby if unneccessary
+ * [BUG FIX]Don't load Zip/Ruby if unneccessary
+ * Add `EPUB::CFI::PhysicalContainer.find_adapter`

  ### 0.2.4

@@ -163,38 +175,11 @@ RECENT CHANGES
  * Change attribute name: `EPUB::CFI::Step#step` -> `EPUB::CFI::Step#value`, `EPUB::CFI::CharacterOffset#offset` -> `EPUB::CFI::CharacterOffset#value`
  * Show modified on `epubinfo` command

- ### 0.2.3
-
- * Change the name of physical container adapter for file system: :File -> :UnpackedDirectory
- * Add `EPUB::Publication::Package::Manifest::Item#full_path`
- * Make #href= acceptable String
- * Implement `EPUB::CFI` and `EPUB::Parser::CFI`
- * Remove [nokogumbo][] from dependencies. It ommits `head` and `body` elements
- * Remove Cucumber and Cucumber features
- * Add `EPUB::Publication::Package::Metadata#modified` and `EPUB::Book::Features#modified`
- * Add `EPUB::Book::Features#release_identifier`
-
- [nokogumbo]: https://github.com/rubys/nokogumbo/
-
- ### 0.2.2
-
- * [BUGFIX]Item#entry_name returns normalized IRI
-
- ### 0.2.1
-
- * Remove deprecated `EPUB::Constants::MediaType::UnsupportedError`. Use `UnsupportedMediatType` instead.
- * Make it possible to use [archive-zip][] gem to extract contents from EPUB package
- * Add warning about default physical container adapter change
- * Make it possible to extract contents from the web via `EPUB::OCF::PhysicalContainer::UnpackedURI` See {file:ExtractContentsFromWeb.markdown} for details.
-
- [archive-zip]: https://github.com/javanthropus/archive-zip
-
  See {file:CHANGELOG.markdown} for older changelogs and details.

  TODOS
  -----
  * EPUB 3.0.1
- * Multiple rootfiles
  * Help features for `epub-open` tool
  * Vocabulary Association Mechanisms
  * Implementing navigation document and so on
@@ -215,6 +200,7 @@ DONE
  * Vocabulary Association Mechanisms(only for itemref)
  * Archive library abstraction
  * Extracting and organizing common behavior from some classes to modules
+ * Multiple rootfiles

  LICENSE
  -------
data/Rakefile CHANGED
@@ -1,6 +1,6 @@
- require 'bundler/gem_helper'
  require 'rake/clean'
  require 'rake/testtask'
+ require 'rubygems/tasks'
  require 'yard'
  require 'rdoc/task'
  require 'epub/parser/version'
@@ -54,7 +54,42 @@ namespace :doc do
    end
  end

- namespace :gem do
-   Bundler::GemHelper.install_tasks
-   task :build => [:clean, CFI_TAB]
+ Gem::Tasks.new do |tasks|
+   tasks.console.command = 'pry'
+ end
+ task :build => [:clean, CFI_TAB]
+
+ class ForwardableDefDelegatorsHandler < YARD::Handlers::Ruby::Base
+   handles method_call(:def_delegators)
+   namespace_only
+
+   def process
+     params = validated_attribute_names(statement.parameters(false))
+     accessor = params.shift
+     params.each do |param|
+       object = YARD::CodeObjects::MethodObject.new(namespace, param)
+       object.docstring = "Forwarded to +#{accessor}+"
+     end
+   end
+
+   protected
+
+   # Strips out any non-essential arguments from the attr statement.
+   #
+   # @param [Array<Parser::Ruby::AstNode>] params a list of the parameters
+   # in the attr call.
+   # @return [Array<String>] the validated attribute names
+   # @raise [Parser::UndocumentableError] if the arguments are not valid.
+   def validated_attribute_names(params)
+     params.map do |obj|
+       case obj.type
+       when :symbol_literal
+         obj.jump(:ident, :op, :kw, :const).source
+       when :string_literal
+         obj.jump(:string_content).source
+       else
+         raise YARD::Parser::UndocumentableError, obj.source
+       end
+     end
+   end
  end
@@ -15,7 +15,7 @@ Methods for {EPUB::Publication::Package}

  It is `true` when `package@prefix` attribute has `rendition` property.

- parser = EPUB::Parser::Publication.new(<<OPF, 'dummy/rootfile.opf')
+ parser = EPUB::Parser::Publication.new(<<OPF)
  <package version="3.0"
  unique-identifier="pub-id"
  xmlns="http://www.idpf.org/2007/opf"
data/docs/Item.markdown CHANGED
@@ -4,7 +4,7 @@ Overview
  ========

  When manipulating resources (XHTML, images, audio...) in EPUB, {EPUB::Publication::Package::Manifest::Item} object will be used.
- And objects which {EPUB#each_page_on_spine} yields are also instances of this class.
+ And objects which {EPUB::Book::Features#each_page_on_spine} yields are also instances of this class.

  Here's the tutorial of this class.

data/epub-parser.gemspec CHANGED
@@ -27,6 +27,7 @@ Gem::Specification.new do |s|
    s.has_rdoc = 'yard'

    s.add_development_dependency 'rake'
+   s.add_development_dependency 'rubygems-tasks'
    s.add_development_dependency 'zipruby'
    s.add_development_dependency 'pry'
    s.add_development_dependency 'pry-doc'
@@ -40,6 +41,7 @@ Gem::Specification.new do |s|
    s.add_development_dependency 'epzip'
    s.add_development_dependency 'racc'
    s.add_development_dependency 'nokogiri-diff'
+   s.add_development_dependency 'pretty_backtrace'

    s.add_runtime_dependency 'archive-zip'
    s.add_runtime_dependency 'nokogiri', '~> 1.6'
@@ -0,0 +1,111 @@
+ # coding: utf-8
+ # Preparation
+ #
+ # % cd examples
+ # % wget -O accessible-epub3.epub 'https://drive.google.com/uc?export=download&id=0B9g8D2Y-6aPLRmFKRTNIam93RTQ'
+ #
+ # Execution
+ #
+ # % ruby exctract-content-using-cfi.rb accessible-epub3.epub '/6/10!/4/2/4'
+ # <p>Accessibility is a difficult concept to define. There’s no single magic bullet
+ # solution that will make all content accessible to all people. Perhaps that’s a
+ # strange way to preface a book on accessible practices, but it’s also a reality you
+ # need to be aware of. Accessible practices change, technologies evolve to solve
+ # stubborn problems, and the world becomes a more accessible place all the time.</p>
+ #
+ # % ruby exctract-content-using-cfi.rb accessible-epub3.epub '/6/10!/4/2,/4,/8'
+ # <p>Accessibility is a difficult concept to define. There’s no single magic bullet
+ # solution that will make all content accessible to all people. Perhaps that’s a
+ # strange way to preface a book on accessible practices, but it’s also a reality you
+ # need to be aware of. Accessible practices change, technologies evolve to solve
+ # stubborn problems, and the world becomes a more accessible place all the time.</p>
+ # <p xmlns="http://www.w3.org/1999/xhtml">But although there are best practices that everyone should be following, and that
+ # will be detailed as we go along, this guide should neither be read as an instrument
+ # for accessibility compliance nor as a replacement for existing guidelines.</p>
+ # <p></p>
+ #
+ # Yes, output above shows a bug!
+ #
+ # % ruby exctract-content-using-cfi.rb accessible-epub3.epub '/6/10!/4/2/4,:0,:47'
+ # Accessibility is a difficult concept to define.
+
+ require 'epub/parser'
+ require 'epub/parser/cfi'
+ require 'nokogiri' # Do gem install nokogiri
+ require 'nokogiri/xml/range' # Do gem install nokogiri-xml-range
+
+ def main(argv)
+   epub_path = argv.shift
+   cfi_string = argv.shift
+   if epub_path.nil? or cfi_string.nil?
+     $stderr.puts "USAGE: ruby #{$0} EPUB CFI"
+     abort
+   end
+
+   epub = EPUB::Parser.parse(epub_path)
+   cfi = EPUB::CFI(cfi_string)
+
+   content = extract_content(epub, cfi)
+   case content
+   when Nokogiri::XML::Element
+     puts content
+   when Nokogiri::XML::Range
+     puts content.clone_contents
+   end
+ end
+
+ def extract_content(epub, cfi)
+   if cfi.kind_of? EPUB::CFI::Location
+     node = get_element(cfi, epub)
+     offset = cfi.paths.last.offset
+     offset = offset.value if offset
+     # Maybe offset may not be used
+     return node
+   end
+
+   start_node = get_element(cfi.first, epub)
+   # Need more consideration
+   start_node = start_node.children.first if start_node.element?
+
+   end_node = get_element(cfi.last, epub)
+   # Need more consideration
+   end_node = end_node.children.last if end_node.element?
+
+   start_offset = cfi.first.paths.last.offset
+   start_offset = start_offset ? start_offset.value : 0
+   end_offset = cfi.last.paths.last.offset
+   end_offset = end_offset ? end_offset.value : 0
+
+   range = Nokogiri::XML::Range.new(start_node, start_offset, end_node, end_offset)
+
+   return range
+ end
+
+ def get_element(cfi, epub)
+   path_in_package = cfi.paths.first
+   step_to_itemref = path_in_package.steps[1]
+   itemref = epub.spine.itemrefs[step_to_itemref.value / 2 - 1]
+
+   doc = itemref.item.content_document.nokogiri
+   path_in_doc = cfi.paths[1]
+   current_node = doc.root
+   path_in_doc.steps.each do |step|
+     if step.element?
+       current_node = current_node.element_children[step.value / 2 - 1]
+     else
+       element_index = (step.value - 1) / 2 - 1
+       if element_index == -1
+         current_node = current_node.children.first
+       else
+         prev = current_node.element_children[element_index]
+         break unless prev
+         current_node = prev.next_sibling
+         break unless current_node
+       end
+     end
+   end
+
+   current_node
+ end
+
+ main(ARGV)
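The index arithmetic in `get_element` can be checked in isolation. In EPUB CFI, even step values address child elements and odd values address the text positions between them. The sketch below mirrors the two expressions from the example with plain integers. The helper names `element_index` and `preceding_element_index` are hypothetical and not part of epub-parser.

```ruby
# Even step value: the n-th child element, counted 2, 4, 6, ...
# Returns the zero-based index into element_children.
def element_index(step_value)
  step_value / 2 - 1
end

# Odd step value: a text position between elements.
# Returns the index of the element just before it,
# or -1 when the text comes before the first child element.
def preceding_element_index(step_value)
  (step_value - 1) / 2 - 1
end

[2, 4, 10].each { |v| puts "step #{v} -> element_children[#{element_index(v)}]" }
[1, 3, 5].each  { |v| puts "step #{v} -> after element index #{preceding_element_index(v)}" }
```

For the CFI `/6/10!/4/2/4` used in the header comment, the itemref step `10` therefore selects `itemrefs[4]`, the fifth spine item.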
@@ -0,0 +1,54 @@
+ require 'English'
+ require 'epub/parser'
+ require 'epub/parser/cfi'
+ require 'nokogiri'
+
+ def usage
+   <<EOS
+
+ USAGE:
+ ruby #{$PROGRAM_NAME} ELEMENT EPUB
+
+ EOS
+ end
+
+ def main(argv)
+   elem_name = argv.shift
+   epub_path = argv.shift
+   if elem_name.nil? or epub_path.nil?
+     abort usage
+   end
+
+   spine_step = EPUB::CFI::Step.new(6)
+
+   epub = EPUB::Parser.parse(epub_path)
+   epub.package.spine.each_itemref.with_index do |itemref, i|
+     itemref_step = {
+       :step => (i + 1) * 2,
+       :id => itemref.id
+     }
+     assertion = itemref.id ? EPUB::CFI::IDAssertion.new(itemref.id) : nil
+     itemref_step = EPUB::CFI::Step.new((i + 1) * 2, assertion)
+     path_to_itemref = EPUB::CFI::Path.new([spine_step, itemref_step])
+     itemref.item.content_document.nokogiri.search(elem_name).each do |elem|
+       path = find_path(elem)
+       location = EPUB::CFI::Location.new([path_to_itemref, path])
+       puts
+       puts location
+       puts elem
+     end
+   end
+ end
+
+ def find_path(elem)
+   steps = []
+   until elem.parent.document?
+     index = elem.parent.element_children.index(elem)
+     assertion = elem["id"] ? EPUB::CFI::IDAssertion.new(elem["id"]) : nil
+     steps.unshift EPUB::CFI::Step.new((index + 1) * 2, assertion)
+     elem = elem.parent
+   end
+   EPUB::CFI::Path.new(steps)
+ end
+
+ main ARGV
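`find_path` walks from an element up to the document root and turns each sibling position into an even step value with `(index + 1) * 2`. A minimal sketch of that walk without Nokogiri follows. `SketchNode` and `step_values` are hypothetical names, and the tree contains only elements, so every child counts as an element child.

```ruby
# Hypothetical element-only tree node; the real example walks Nokogiri elements.
class SketchNode
  attr_reader :name, :parent, :children

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []
    parent.children << self if parent
  end
end

# Collects even step values from the node up to, but not including, the root,
# mirroring the `until elem.parent.document?` loop in find_path.
def step_values(node)
  values = []
  while node.parent
    index = node.parent.children.index { |child| child.equal?(node) }
    values.unshift((index + 1) * 2)
    node = node.parent
  end
  values
end

html = SketchNode.new('html')
SketchNode.new('head', html)
body = SketchNode.new('body', html)
SketchNode.new('h1', body)
para = SketchNode.new('p', body)

puts '/' + step_values(para).join('/')
```

The root element itself contributes no step, which matches the way the example stops once the parent is the document.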
@@ -1,45 +1,57 @@
+ require 'forwardable'
+
  module EPUB
    class Book
      module Features
-       modules = [:ocf, :package]
-       attr_reader *modules
+       extend Forwardable
+       attr_reader :ocf
+       attr_writer :package
        attr_accessor :epub_file
-       modules.each do |mod|
-         define_method "#{mod}=" do |obj|
-           instance_variable_set "@#{mod}", obj
-           obj.book = self
-         end
+
+       # When writing, sets +ocf.book+ to self.
+       # @param [OCF]
+       def ocf=(mod)
+         @ocf = mod
+         mod.book = self
+         mod
        end

-       Publication::Package::CONTENT_MODELS.each do |model|
-         define_method model do
-           package.__send__(model)
-         end
+       # @return [Array<OCF::Container::Rootfile>]
+       def rootfiles
+         ocf.container.rootfiles
        end

-       %w[title main_title subtitle short_title collection_title edition_title extended_title description date unique_identifier modified].each do |met|
-         define_method met do
-           metadata.__send__(met)
-         end
+       # @return [Array<Publication::Package>]
+       def packages
+         rootfiles.map(&:package)
        end
+       alias renditions packages

-       %w[nav].each do |met|
-         define_method met do
-           manifest.__send__(met)
-         end
+       # Syntax sugar.
+       # Returns package set by +package=+.
+       # Returns default rendition if any package has not been set ever.
+       # @return [Publication::Package]
+       def package
+         @package || default_rendition
        end

-       def release_identifier
-         "#{unique_identifier}@#{modified}"
+       # First +package+ in +packages+
+       # @return [Package|nil]
+       def default_rendition
+         packages.first
        end

+       # @!parse def_delegators :package, :metadata, :manifest, :spine, :guide, :bindings
+       def_delegators :package, *Publication::Package::CONTENT_MODELS
+       def_delegators :metadata, :title, :main_title, :subtitle, :short_title, :collection_title, :edition_title, :extended_title, :description, :date, :unique_identifier, :modified, :release_identifier, :package_identifier
+       def_delegators :manifest, :nav, :cover_image
+
        def container_adapter
          @adapter || OCF::PhysicalContainer.adapter
        end

        def container_adapter=(adapter)
-         @adapter = adapter.instance_of?(Class) ? adapter : OCF::PhysicalContainer.const_get(adapter)
-         adapter
+         @adapter = OCF::PhysicalContainer.find_adapter(adapter)
        end

        # @overload each_page_on_spine(&blk)
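The line removed from `container_adapter=` accepted either a class or a constant name and resolved the latter with `const_get`. The sketch below reproduces only that removed lookup inside a hypothetical `DemoContainers` namespace. How `OCF::PhysicalContainer.find_adapter` is actually implemented is not shown in this diff, so treat this as an assumption about its general shape rather than the gem's code.

```ruby
# Hypothetical namespace standing in for OCF::PhysicalContainer.
module DemoContainers
  class ArchiveZip; end
  class UnpackedDirectory; end

  # Accepts a class as-is, or resolves a Symbol/String constant name.
  # This mirrors the inline expression removed from container_adapter=.
  def self.find_adapter(adapter)
    adapter.instance_of?(Class) ? adapter : const_get(adapter)
  end
end

puts DemoContainers.find_adapter(:ArchiveZip)
puts DemoContainers.find_adapter(DemoContainers::UnpackedDirectory)
```

Moving the lookup into one class-level method lets every caller share the same resolution rule instead of repeating the ternary.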
@@ -84,15 +96,10 @@ module EPUB
        end

        # Syntax sugar
+       # @return String
        def rootfile_path
          ocf.container.rootfile.full_path.to_s
        end
-
-       # Syntax sugar
-       def cover_image
-         manifest.cover_image
-       end
-
      end
    end
  end
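The `Features` change replaces hand-written `define_method` loops with `def_delegators` from Ruby's standard `Forwardable` module, and makes `package` fall back to the default rendition. A small self-contained sketch of the same pattern follows. The `Demo*` names are hypothetical stand-ins, not the gem's classes.

```ruby
require 'forwardable'

# Hypothetical stand-ins for the gem's metadata and package objects.
DemoMetadata = Struct.new(:title)
DemoPackage  = Struct.new(:metadata)

class DemoBook
  extend Forwardable

  attr_reader :packages
  attr_writer :package

  def initialize(packages)
    @packages = packages
    @package = nil
  end

  # The first package stands in for the default rendition.
  def default_rendition
    packages.first
  end

  # An explicitly assigned package wins; otherwise fall back to the default.
  def package
    @package || default_rendition
  end

  # book.metadata forwards to book.package.metadata,
  # book.title forwards to book.metadata.title.
  def_delegators :package, :metadata
  def_delegators :metadata, :title
end

first  = DemoPackage.new(DemoMetadata.new('Default rendition'))
second = DemoPackage.new(DemoMetadata.new('Alternate rendition'))
book   = DemoBook.new([first, second])

puts book.title
book.package = second
puts book.title
```

Because the delegators call `package` at invocation time, assigning a different package changes what every delegated method returns without redefining anything.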