epub-parser 0.2.6 → 0.2.7
- checksums.yaml +4 -4
- data/.yardopts +1 -0
- data/CHANGELOG.markdown +8 -0
- data/README.markdown +7 -8
- data/docs/Home.markdown +3 -0
- data/docs/MultipleRenditions.markdown +61 -0
- data/lib/epub/book/features.rb +1 -9
- data/lib/epub/metadata.rb +11 -0
- data/lib/epub/parser.rb +5 -1
- data/lib/epub/parser/metadata.rb +77 -44
- data/lib/epub/parser/version.rb +1 -1
- data/test/test_parser_publication.rb +9 -0
- metadata +3 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e92165c76652996a441e9996bb9bdee8bb0e7b04
+  data.tar.gz: c7a62b70d282f9b8343c0b850db21d7f078961d9
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f56160b8148faf112e6ff8941a380223f1ce4d0ad114b2375f8e9793fe4eb0d7d9e18dda540f36c1ce702c90c7791006f5fa65c270498d5d6e5b341862e6aa37
+  data.tar.gz: f139ae34069e6bcfe85d23ae5b4db32cecd99ae90cfc1fbe12ba1b2541f31d93329a6215cf099074b0fc0a3b03996ea6fca2508ad9b3165f27522be5fb8c759e
data/.yardopts
CHANGED
@@ -11,6 +11,7 @@ docs/Navigation.markdown
 docs/Searcher.markdown
 docs/UnpackedArchive.markdown
 docs/AggregateContentsFromWeb.markdown
+docs/MultipleRenditions.markdown
 examples/aggregate-contents-from-web.rb
 examples/exctract-content-using-cfi.rb
 examples/find-elements-and-cfis.rb
data/CHANGELOG.markdown
CHANGED
@@ -1,6 +1,14 @@
 CHANGELOG
 =========
 
+0.2.7
+-----
+
+* Add `EPUB::Metadata#children` to keep all child elements to count them on CFI search
+* Allow classes including `EPUB` to initialize with extra arguments (Thanks, [skukx][]!)
+
+[skukx]: https://github.com/skukx
+
 0.2.6
 -----
 
data/README.markdown
CHANGED
@@ -151,6 +151,13 @@ If you find other gems, please tell me or request a pull request.
 RECENT CHANGES
 --------------
 
+### 0.2.7
+
+* Add `EPUB::Metadata#children`
+* Allow classes including `EPUB` to initialize with extra arguments (Thanks, [skukx][]!)
+
+[skukx]: https://github.com/skukx
+
 ### 0.2.6
 
 * Add `EPUB::Publication::Package::Metadata#package_identifier` as alias of `#release_identifier`
@@ -167,14 +174,6 @@ RECENT CHANGES
 * [BUG FIX]Don't load Zip/Ruby if unnecessary
 * Add `EPUB::CFI::PhysicalContainer.find_adapter`
 
-### 0.2.4
-
-* Bug fix for `EPUB::CFI::Location#<=>`
-* Change default physical container adapter from `EPUB::OCF::PhysicalContainer::ZipRuby` to `EPUB::OCF::PhysicalContainer::ArchiveZip`
-* Add `EPUB::CFI::Step#element?` and `#character_data?`
-* Change attribute name: `EPUB::CFI::Step#step` -> `EPUB::CFI::Step#value`, `EPUB::CFI::CharacterOffset#offset` -> `EPUB::CFI::CharacterOffset#value`
-* Show modified on `epubinfo` command
-
 See {file:CHANGELOG.markdown} for older changelogs and details.
 
 TODOS
data/docs/Home.markdown
CHANGED
@@ -102,6 +102,7 @@ More documentations are avaiable in:
 * {file:docs/Searcher.markdown}
 * {file:docs/UnpackedArchive.markdown}
 * {file:docs/AggregateContentsFromWeb.markdown}
+* {file:docs/MultipleRenditions.markdown}
 
 If you installed EPUB Parser via the gem command, you can also generate the documentation on your own (the [rubygems-yardoc][] gem is needed):
 
@@ -157,11 +158,13 @@ Currently implemented:
 * [EPUB Publications 3.0][]
 * EPUB Navigation Documents of [EPUB Content Documents 3.0][]
 * [EPUB 3 Fixed-Layout Documents][]
+* metadata.xml of [EPUB Multiple-Rendition Publications][]
 
 [EPUB Open Container Format (OCF) 3.0]:http://idpf.org/epub/30/spec/epub30-ocf.html#sec-container-metainf-container.xml
 [EPUB Publications 3.0]:http://idpf.org/epub/30/spec/epub30-publications.html
 [EPUB Content Documents 3.0]:http://www.idpf.org/epub/30/spec/epub30-contentdocs.html
 [EPUB 3 Fixed-Layout Documents]:http://www.idpf.org/epub/fxl/
+[EPUB Multiple-Rendition Publications]: http://www.idpf.org/epub/renditions/multiple/
 
 License
 -------
data/docs/MultipleRenditions.markdown
@@ -0,0 +1,61 @@
+{file:docs/Home.markdown} > **{file:docs/MultipleRenditions.markdown}**
+
+Multiple Renditions
+===================
+
+An EPUB publication (file) may have multiple renditions, that is, multiple ways for a reading system to render its contents. They are expressed as multiple {EPUB::Publication::Package} objects.
+
+Usually, you don't need to care about this.
+
+    epub = EPUB::Parser.parse('path/to/book')
+    epub.package # => #<EPUB::Publication::Package...>
+
+This is enough in most cases.
+
+Getting multiple renditions
+---------------------------
+
+If your book has multiple renditions, you can get them by {EPUB::Book::Features#packages}, aliased as {EPUB::Book::Features#packages #renditions}.
+
+    epub.packages # => [#<EPUB::Publication::Package...>, #<EPUB::Publication::Package...>, ...]
+
+`epub.package` is a shortcut for `epub.packages.first` (called the default rendition).
+
+    epub.package == epub.packages.first # => true
+    epub.default_rendition == epub.packages.first # => true
+
+Metadata of renditions
+----------------------
+
+Sometimes the metadata situation is more complicated.
+
+A publication may have multiple renditions (packages). Each rendition has its own metadata. So a publication has as many metadata objects as renditions...
+
+    epub.packages.all? {|package| package.respond_to? :metadata} # => true
+
+... at least.
+
+In addition, a publication may have metadata that is related not to any rendition but to the EPUB file itself. You can access it by:
+
+    epub.ocf.metadata # => #<EPUB::Metadata...>
+
+This kind of metadata was introduced by the [EPUB Multiple-Rendition Publications][] spec, and most EPUB files don't have it for now. So note that `epub.ocf.metadata` might be `nil`.
+
+Identifiers
+-----------
+
+You can identify any metadata by identifiers called release identifiers. If your book has metadata for both the publication and its packages, the publication's identifier might not be the same as any of the renditions' identifiers. With this difference, publishers or authors can represent the situation "all renditions are changed but the book itself is not." To do so, publishers or authors will change all identifiers of renditions (`epub.packages.collect(&:metadata).collect(&:release_identifier)`) but keep `epub.ocf.metadata.release_identifier`.
+
+Known issues
+------------
+
+Currently, at least on v0.2.6, EPUB Parser provides limited support for the [Multiple-Rendition][EPUB Multiple-Rendition Publications] spec.
+
+See also
+--------
+
+* [EPUB Publications 3.0.1][]
+* [EPUB Multiple-Rendition Publications][]
+
+[EPUB Publications 3.0.1]: http://www.idpf.org/epub/301/spec/epub-publications.html
+[EPUB Multiple-Rendition Publications]: http://www.idpf.org/epub/renditions/multiple/
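The `nil` caveat in the new MultipleRenditions document can be sketched with stand-in objects. The `Struct` definitions and the `identifiers_of` helper below are hypothetical and are not epub-parser classes; only the call shapes `epub.ocf.metadata`, `epub.packages` and `release_identifier` come from the document, and the identifier string is made up.

```ruby
# Hypothetical stand-ins for the objects returned by EPUB::Parser.parse;
# only the method names mirror the MultipleRenditions document.
Metadata = Struct.new(:release_identifier)
Package  = Struct.new(:metadata)
OCF      = Struct.new(:metadata)
Book     = Struct.new(:ocf, :packages)

# Collect the publication-level identifier (possibly nil) and one per rendition.
def identifiers_of(epub)
  publication_metadata = epub.ocf.metadata # nil when the EPUB has no metadata.xml
  {
    publication: publication_metadata && publication_metadata.release_identifier,
    renditions: epub.packages.map {|package| package.metadata.release_identifier}
  }
end

# A single-rendition book without publication-level metadata.
single = Book.new(OCF.new(nil), [Package.new(Metadata.new('urn:example:book@2016-07-30T00:00:00Z'))])
ids = identifiers_of(single)
```

Guarding with `&&` keeps the lookup safe for the common case where `epub.ocf.metadata` is `nil`.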
data/lib/epub/book/features.rb
CHANGED
@@ -5,7 +5,6 @@ module EPUB
     module Features
       extend Forwardable
       attr_reader :ocf
-      attr_writer :package
       attr_accessor :epub_file
 
       # When writing, sets +ocf.book+ to self.
@@ -27,19 +26,12 @@ module EPUB
       end
       alias renditions packages
 
-      # Syntax sugar.
-      # Returns package set by +package=+.
-      # Returns default rendition if any package has not been set ever.
-      # @return [Publication::Package]
-      def package
-        @package || default_rendition
-      end
-
       # First +package+ in +packages+
       # @return [Package|nil]
       def default_rendition
         packages.first
       end
+      alias package default_rendition
 
       # @!parse def_delegators :package, :metadata, :manifest, :spine, :guide, :bindings
       def_delegators :package, *Publication::Package::CONTENT_MODELS
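The effect of this hunk can be sketched in isolation: with the `package=` writer removed and `package` defined through `alias`, `package` always answers with the first entry of `packages`. `MiniBook` below is a hypothetical stand-in, not part of epub-parser.

```ruby
# MiniBook is a hypothetical stand-in that mimics the shape of the changed module.
class MiniBook
  attr_reader :packages

  def initialize(packages)
    @packages = packages
  end

  # First package in packages, or nil when there is none.
  def default_rendition
    packages.first
  end
  alias package default_rendition
end

book = MiniBook.new([:reflowable, :fixed_layout])
book.package                # same object as book.default_rendition
book.respond_to?(:package=) # => false, there is no writer to override the default
```

Note that `alias` binds to the method body that exists at that point, so a later override of `default_rendition` in a subclass would not change what `package` returns.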
data/lib/epub/metadata.rb
CHANGED
@@ -7,6 +7,8 @@ module EPUB
     DC_ELEMS = [:identifiers, :titles, :languages] +
                [:contributors, :coverages, :creators, :dates, :descriptions, :formats, :publishers,
                 :relations, :rights, :sources, :subjects, :types]
+    # Used for CFI
+    attr_reader :children
     attr_accessor :package, :unique_identifier, :metas, :links,
                   *(DC_ELEMS.collect {|elem| "dc_#{elem}"})
     DC_ELEMS.each do |elem|
@@ -18,6 +20,7 @@ module EPUB
       (DC_ELEMS + [:metas, :links]).each do |elem|
         __send__ "#{elem}=", []
       end
+      @children = []
     end
 
     def release_identifier
@@ -174,5 +177,13 @@ module EPUB
         @refines = refinee
       end
     end
+
+    class UnsupportedModel
+      attr_accessor :raw_element
+
+      def initialize(raw_element)
+        @raw_element = raw_element
+      end
+    end
   end
 end
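The new `children` reader is commented "Used for CFI". EPUB CFI addresses child elements by even step numbers (2, 4, 6, ...), so keeping every child in document order, unsupported ones included, lets a position in `children` be turned into a step. A minimal sketch, assuming a plain array in place of `EPUB::Metadata#children`; `cfi_step_for` is a hypothetical helper, not part of epub-parser.

```ruby
# Hypothetical helper: EPUB CFI numbers the n-th child element (1-based) as step 2n.
# Returns nil when the model is not among the children.
def cfi_step_for(children, model)
  index = children.index(model)
  index && (index + 1) * 2
end

# Symbols stand in for parsed metadata models, in document order.
children = [:identifier, :title, :unsupported, :link]
cfi_step_for(children, :link)    # => 8
cfi_step_for(children, :missing) # => nil
```

If unsupported elements were dropped instead of wrapped in `UnsupportedModel`, every later sibling would get a step that is too small.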
data/lib/epub/parser.rb
CHANGED
data/lib/epub/parser/metadata.rb
CHANGED
@@ -5,62 +5,95 @@ module EPUB
         metadata = EPUB::Publication::Package::Metadata.new
         id_map = {}
 
-
-
-
-
-        metadata.titles = extract_model(elem, id_map, './dc:title', :Title)
-        metadata.languages = extract_model(elem, id_map, './dc:language', :DCMES, %w[id])
-        %w[contributor coverage creator date description format publisher relation source subject type].each do |dcmes|
-          metadata.__send__ "#{dcmes}s=", extract_model(elem, id_map, "./dc:#{dcmes}")
-        end
-        metadata.rights = extract_model(elem, id_map, './dc:rights')
-        metadata.metas = extract_refinee(elem, id_map, "./#{default_namespace}:meta", :Meta, %w[property id scheme])
-        metadata.links = extract_refinee(elem, id_map, "./#{default_namespace}:link", :Link, %w[id media-type]) {|link, e|
-          link.href = extract_attribute(e, 'href')
-          link.rel = Set.new(extract_attribute(e, 'rel').split(nil))
-        }
+        default_namespace_uri = EPUB::NAMESPACES[default_namespace]
+        elem.element_children.each do |child|
+          namespace_uri = child.namespace && child.namespace.href
+          elem_name = child.name
 
-
-
-
-
-
+          model =
+            case namespace_uri
+            when EPUB::NAMESPACES['dc']
+              case elem_name
+              when 'identifier'
+                identifier = build_model(child, :Identifier, ['id'])
+                metadata.identifiers << identifier
+                identifier.scheme = extract_attribute(child, 'scheme', 'opf')
+                identifier
+              when 'title'
+                title = build_model(child, :Title)
+                metadata.titles << title
+                title
+              when 'language'
+                language = build_model(child, :DCMES, ['id'])
+                metadata.languages << language
+                language
+              when 'title', 'contributor', 'coverage', 'creator', 'date', 'description', 'format', 'publisher', 'relation', 'source', 'subject', 'rights', 'type'
+                attr = elem_name == 'rights' ? elem_name : elem_name + 's'
+                dcmes = build_model(child)
+                metadata.__send__(attr) << dcmes
+                dcmes
+              else
+                build_unsupported_model(child)
+              end
+            when default_namespace_uri
+              case elem_name
+              when 'meta'
+                meta = build_model(child, :Meta, %w[property id scheme])
+                metadata.metas << meta
+                meta
+              when 'link'
+                link = build_model(child, :Link, %w[id media-type])
+                metadata.links << link
+                link.href = extract_attribute(child, 'href')
+                link.rel = Set.new(extract_attribute(child, 'rel').split(/\s+/))
+                link
+              else
+                build_unsupported_model(child)
+              end
+            else
+              build_unsupported_model(child)
+            end
 
-
-      end
+          metadata.children << model
 
-
-
-
-          attributes.each do |attr|
-            model.__send__ "#{attr.gsub(/-/, '_')}=", extract_attribute(e, attr)
+          if model.kind_of?(EPUB::Metadata::Identifier) &&
+             model.id == unique_identifier_id
+            metadata.unique_identifier = model
           end
-          model.content = e.content unless klass == :Link
 
-
+          if model.respond_to?(:id) && model.id
+            id_map[model.id] = {refinee: model}
+          end
 
-
+          refines = extract_attribute(child, 'refines')
+          if refines && refines.start_with?('#')
+            id = refines[1..-1]
+            id_map[id] ||= {}
+            id_map[id][:refiners] ||= []
+            id_map[id][:refiners] << model
+          end
         end
 
-
-
+        id_map.values.each do |hsh|
+          next unless hsh[:refiners]
+          next unless hsh[:refinee]
+          hsh[:refiners].each {|meta| meta.refines = hsh[:refinee]}
         end
 
-
+        metadata
       end
 
-      def
-
-
-
-
-
-
-
-
-
-
+      def build_model(elem, klass=:DCMES, attributes=%w[id lang dir])
+        model = EPUB::Metadata.const_get(klass).new
+        attributes.each do |attr|
+          model.__send__ "#{attr.gsub('-', '_')}=", extract_attribute(elem, attr)
+        end
+        model.content = elem.content unless klass == :Link
+        model
+      end
+
+      def build_unsupported_model(elem)
+        EPUB::Metadata::UnsupportedModel.new(elem)
       end
     end
   end
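The `refines` handling in the rewritten parser works in two passes: collect refinees and refiners into `id_map` while walking the children, then link them afterwards. A minimal sketch of that idea with stand-in objects; `Model` and `link_refinements` are hypothetical names, and `refines_id` plays the role of the raw `refines` attribute. Unlike the diff, which assigns `id_map[model.id] = {refinee: model}`, this sketch merges into an existing entry so that a refiner listed before its refinee is still linked. That is a choice made for the sketch, not a description of the gem.

```ruby
# Hypothetical stand-in for a metadata model.
Model = Struct.new(:id, :refines_id, :refines)

def link_refinements(models)
  id_map = {}
  # Pass 1: register each model under its own id and under the id it refines.
  models.each do |model|
    (id_map[model.id] ||= {})[:refinee] = model if model.id
    next unless model.refines_id && model.refines_id.start_with?('#')
    target = model.refines_id[1..-1] # strip the leading "#"
    ((id_map[target] ||= {})[:refiners] ||= []) << model
  end
  # Pass 2: connect refiners to their refinee when both sides are known.
  id_map.each_value do |entry|
    next unless entry[:refiners] && entry[:refinee]
    entry[:refiners].each {|refiner| refiner.refines = entry[:refinee]}
  end
  models
end

creator = Model.new('creator01', nil, nil)
role    = Model.new(nil, '#creator01', nil) # appears before its refinee
orphan  = Model.new(nil, '#missing', nil)   # refers to an id that does not exist
link_refinements([role, creator, orphan])
```

Deferring the linking to a second pass means the order of elements in the XML does not matter, and references to unknown ids are silently left unlinked.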
data/lib/epub/parser/version.rb
CHANGED
data/test/test_parser_publication.rb
CHANGED
@@ -1,5 +1,6 @@
 # -*- coding: utf-8 -*-
 require File.expand_path 'helper', File.dirname(__FILE__)
+require 'zipruby'
 
 class TestParserPublication < Test::Unit::TestCase
   def setup
@@ -75,6 +76,14 @@ class TestParserPublication < Test::Unit::TestCase
     assert titles[2] < titles[3]
     assert titles[3] > titles[4]
   end
+
+  def test_children_keeps_order_in_xml
+    expected = @metadata.links.find {|link|
+      link.rel.include?('foaf:homepage') &&
+        link.href.to_s == 'http://example.org/book-info/12389347'
+    }
+    assert_equal expected, @metadata.children[27]
+  end
 end
 
 class TestParseManifest < TestParserPublication
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: epub-parser
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.7
 platform: ruby
 authors:
 - KITAITI Makoto
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2016-
+date: 2016-07-30 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake
@@ -317,6 +317,7 @@ files:
 - docs/FixedLayout.markdown
 - docs/Home.markdown
 - docs/Item.markdown
+- docs/MultipleRenditions.markdown
 - docs/Navigation.markdown
 - docs/Publication.markdown
 - docs/Searcher.markdown