epub-parser 0.1.8 → 0.1.9
- checksums.yaml +4 -4
- data/.travis.yml +1 -1
- data/CHANGELOG.markdown +12 -0
- data/README.markdown +11 -27
- data/Rakefile +1 -14
- data/docs/Home.markdown +2 -2
- data/docs/Searcher.markdown +47 -27
- data/epub-parser.gemspec +3 -6
- data/lib/epub.rb +0 -9
- data/lib/epub/book/features.rb +3 -3
- data/lib/epub/content_document/navigation.rb +15 -4
- data/lib/epub/content_document/xhtml.rb +2 -1
- data/lib/epub/parser/content_document.rb +1 -1
- data/lib/epub/parser/publication.rb +5 -5
- data/lib/epub/parser/version.rb +1 -1
- data/lib/epub/publication/package.rb +1 -1
- data/lib/epub/publication/package/guide.rb +3 -5
- data/lib/epub/publication/package/manifest.rb +18 -5
- data/lib/epub/publication/package/metadata.rb +4 -4
- data/lib/epub/publication/package/spine.rb +1 -1
- data/lib/epub/searcher.rb +10 -0
- data/lib/epub/searcher/publication.rb +6 -4
- data/lib/epub/searcher/result.rb +31 -0
- data/lib/epub/searcher/xhtml.rb +113 -17
- data/test/test_content_document.rb +21 -0
- data/test/test_inspect.rb +1 -1
- data/test/test_publication.rb +55 -2
- data/test/test_searcher.rb +45 -20
- metadata +18 -46
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA1:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 1b57da74df66cba76e58cbb0098d7d5618ec1188
|
|
4
|
+
data.tar.gz: 66d1bf92e61f15da35215d0d46638ec7801e5993
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 2a499a7de09c4b906b84d63e10e104f3dd00d028f8e4700a979343cbcade7edd06f63d3651422070a55139de64461b4fcf5b3edb46de477d8fe1d5e675509a01
|
|
7
|
+
data.tar.gz: 2b223250e08e3e9061042bbcd7b9e36662ac86542a7144e84bac71081f18b7b591fcf0acb03c6cfec14ac24b1b3fa256ede7ff2fedb6c0d0807ec14a7267387a
|
data/.travis.yml
CHANGED
data/CHANGELOG.markdown
CHANGED
|
@@ -1,6 +1,18 @@
|
|
|
1
1
|
CHANGELOG
|
|
2
2
|
=========
|
|
3
3
|
|
|
4
|
+
0.1.9
|
|
5
|
+
-----
|
|
6
|
+
|
|
7
|
+
* Introduce [Nokogumbo][] for XHTML Content Documents
|
|
8
|
+
* Stop support for Ruby 1.9
|
|
9
|
+
* Remove `EPUB.included` method. Now including `EPUB` module empowers nothing of EPUB features. Include `EPUB::Book::Features` instead.
|
|
10
|
+
* Add `EPUB::Searcher::XHTML::Seamless` and make it default searcher
|
|
11
|
+
* Add `EPUB::Publication::Package::Manifest#each_nav`
|
|
12
|
+
* Stop to use enumerabler gem
|
|
13
|
+
|
|
14
|
+
[nokogumbo]: https://github.com/rubys/nokogumbo/
|
|
15
|
+
|
|
4
16
|
0.1.8
|
|
5
17
|
-----
|
|
6
18
|
|
data/README.markdown
CHANGED
|
@@ -92,7 +92,7 @@ See {file:docs/EpubOpen} for more info.
|
|
|
92
92
|
|
|
93
93
|
REQUIREMENTS
|
|
94
94
|
------------
|
|
95
|
-
* Ruby
|
|
95
|
+
* Ruby 2.0.0 or later
|
|
96
96
|
* `patch` command to install Nokogiri
|
|
97
97
|
* C compiler to compile Zip/Ruby and Nokogiri
|
|
98
98
|
|
|
@@ -110,6 +110,16 @@ If you find other gems, please tell me or request a pull request.
|
|
|
110
110
|
RECENT CHANGES
|
|
111
111
|
--------------
|
|
112
112
|
|
|
113
|
+
### 0.1.9
|
|
114
|
+
|
|
115
|
+
* Introduce [Nokogumbo][] for XHTML Content Documents
|
|
116
|
+
* Stop support for Ruby 1.9
|
|
117
|
+
* Remove `EPUB.included` method. Now including `EPUB` module empowers nothing of EPUB features. Include `EPUB::Book::Features` instead.
|
|
118
|
+
* Add `EPUB::Searcher::XHTML::Seamless` and make it default searcher
|
|
119
|
+
* Add `EPUB::Publication::Package::Manifest#each_nav`
|
|
120
|
+
|
|
121
|
+
[nokogumbo]: https://github.com/rubys/nokogumbo/
|
|
122
|
+
|
|
113
123
|
### 0.1.8
|
|
114
124
|
|
|
115
125
|
* Explicity #close each zip member file that has been opened via #fopen(Thanks [xunker][]!)
|
|
@@ -125,32 +135,6 @@ RECENT CHANGES
|
|
|
125
135
|
* [Experimental]Add `EPUB::Searcher` module. See {file:Searcher.markdown} for details
|
|
126
136
|
* Detect and set character encoding in `EPUB::Publication::Package::Item#read`
|
|
127
137
|
|
|
128
|
-
### 0.1.6
|
|
129
|
-
* Remove `EPUB.parse` method
|
|
130
|
-
* Remove `EPUB::Publication::Package::Metadata#to_hash`
|
|
131
|
-
* Add `EPUB::Publication::Package::Metadata::Identifier`
|
|
132
|
-
* Remove `MethodDecorators::Deprecated`
|
|
133
|
-
* Make `EPUB::Parser::OCF::CONTAINER_FILE` and other constants deprecated
|
|
134
|
-
* Make `EPUB::Publication::Package::Metadata::Link#rel` a `Set`
|
|
135
|
-
* Add exception class `EPUB::Constants::MediaType::UnsupportedMediaType`
|
|
136
|
-
* Make `EPUB::Constants::MediaType::UnsupportedError` deprecated
|
|
137
|
-
* Add `EPUB::Publication::Package::Item#find_item_by_relative_iri`
|
|
138
|
-
* Add `EPUB::Publication::Package::Item#cover_image?`
|
|
139
|
-
* Add `EPUB::Book::Features` module and move methods of `EPUB` module to it.(Thanks, [takahashim][]!)
|
|
140
|
-
* Make including `EPUB` deprecated
|
|
141
|
-
* Parse `hidden` attribute of `nav` elements
|
|
142
|
-
* [Experimental]Add `EPUB::ContentDocument::Navigation::Item#traverse`
|
|
143
|
-
|
|
144
|
-
[takahashim]: https://github.com/takahashim
|
|
145
|
-
|
|
146
|
-
### 0.1.5
|
|
147
|
-
* Add `ContentDocument::XHTML#title`
|
|
148
|
-
* Add `Manifest::Item#xhtml?`
|
|
149
|
-
* Add `--words` and `--char` options to `epubinfo` command
|
|
150
|
-
* API change: `OCF::Container::Rootfile#full_path` became Addressable::URI object rather than `String`
|
|
151
|
-
* Add `ContentDocument::XHTML#rexml` and `#nokogiri`
|
|
152
|
-
* Inspect more readably
|
|
153
|
-
|
|
154
138
|
See {file:CHANGELOG.markdown} for older changelogs and details.
|
|
155
139
|
|
|
156
140
|
TODOS
|
data/Rakefile
CHANGED
|
@@ -51,18 +51,5 @@ namespace :doc do
|
|
|
51
51
|
end
|
|
52
52
|
|
|
53
53
|
namespace :gem do
|
|
54
|
-
|
|
55
|
-
task :build do
|
|
56
|
-
Bundler::GemHelper.new.build_gem
|
|
57
|
-
end
|
|
58
|
-
|
|
59
|
-
desc "Build and install epub-parser-#{EPUB::Parser::VERSION}.gem into system gems."
|
|
60
|
-
task :install do
|
|
61
|
-
Bundler::GemHelper.new.install_gem
|
|
62
|
-
end
|
|
63
|
-
|
|
64
|
-
desc "Create tag v#{EPUB::Parser::VERSION} and build and push epub-parser-#{EPUB::Parser::VERSION}.gem to Rubygems"
|
|
65
|
-
task :release => :test do
|
|
66
|
-
Bundler::GemHelper.new.release_gem
|
|
67
|
-
end
|
|
54
|
+
Bundler::GemHelper.install_tasks
|
|
68
55
|
end
|
data/docs/Home.markdown
CHANGED
|
@@ -56,7 +56,7 @@ And {EPUB::Publication::Package::Manifest::Item Item} provides syntax suger {EPU
|
|
|
56
56
|
|
|
57
57
|
For several utilities of Item, see {file:docs/Item.markdown} page.
|
|
58
58
|
|
|
59
|
-
By the way, although `book` above is a {EPUB::Book} object, all features are provided by {EPUB} module. Therefore YourBook class can include the features of {EPUB}:
|
|
59
|
+
By the way, although `book` above is a {EPUB::Book} object, all features are provided by {EPUB::Book::Features} module. Therefore YourBook class can include the features of {EPUB::Book::Features}:
|
|
60
60
|
|
|
61
61
|
require 'epub'
|
|
62
62
|
|
|
@@ -99,7 +99,7 @@ More documentations are avaiable in:
|
|
|
99
99
|
Requirements
|
|
100
100
|
------------
|
|
101
101
|
|
|
102
|
-
* Ruby
|
|
102
|
+
* Ruby 2.0.0 or later
|
|
103
103
|
* C compiler to compile Zip/Ruby and Nokogiri
|
|
104
104
|
|
|
105
105
|
Note
|
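The Home.markdown hunk above moves the "include the features" advice from `EPUB` to `EPUB::Book::Features`. The following is a minimal sketch of that include-and-delegate pattern. `SketchFeatures` and `SketchMetadata` are hypothetical stand-ins, not the gem's classes; only the `define_method` / `__send__` delegation is modelled on the features.rb hunk in this diff.

```ruby
# Hypothetical stand-in for EPUB::Book::Features (assumption, not the gem's code).
module SketchFeatures
  # Each generated reader forwards to the object returned by #metadata.
  %w[title description].each do |met|
    define_method(met) do
      metadata.__send__(met)
    end
  end
end

# Hypothetical metadata object exposing the two readers used above.
SketchMetadata = Struct.new(:title, :description)

# A user-defined book class gains the readers by including the module.
class YourBook
  include SketchFeatures
  attr_accessor :metadata
end

book = YourBook.new
book.metadata = SketchMetadata.new('Sample document', 'A short demo')
puts book.title
```

With this shape, the including class only has to provide `metadata`; nothing is hooked into the top-level namespace module.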
data/docs/Searcher.markdown
CHANGED
|
@@ -10,35 +10,35 @@ Example
|
|
|
10
10
|
|
|
11
11
|
epub = EPUB::Parser.parse('childrens-literature-20130206.epub')
|
|
12
12
|
search_word = 'INTRODUCTORY'
|
|
13
|
-
results = EPUB::Searcher.search(epub
|
|
14
|
-
# => [#<EPUB::Searcher::Result:
|
|
15
|
-
# @end_steps=[#<EPUB::Searcher::Result::Step:
|
|
13
|
+
results = EPUB::Searcher.search(epub, search_word)
|
|
14
|
+
# => [#<EPUB::Searcher::Result:0x007f938ed517a8
|
|
15
|
+
# @end_steps=[#<EPUB::Searcher::Result::Step:0x007f938ed51a50 @index=12, @info={}, @type=:character>],
|
|
16
16
|
# @parent_steps=
|
|
17
|
-
# [#<EPUB::Searcher::Result::Step:
|
|
18
|
-
#
|
|
19
|
-
#
|
|
20
|
-
#
|
|
21
|
-
#
|
|
22
|
-
#
|
|
23
|
-
#
|
|
24
|
-
#
|
|
25
|
-
#
|
|
26
|
-
#
|
|
27
|
-
# @start_steps=[#<EPUB::Searcher::Result::Step:
|
|
28
|
-
# #<EPUB::Searcher::Result:
|
|
29
|
-
# @end_steps=[#<EPUB::Searcher::Result::Step:
|
|
17
|
+
# [#<EPUB::Searcher::Result::Step:0x007f938f1c1e78 @index=2, @info={:name=>"spine", :id=>nil}, @type=:element>,
|
|
18
|
+
# #<EPUB::Searcher::Result::Step:0x007f938f1caa78 @index=1, @info={:id=>nil}, @type=:itemref>,
|
|
19
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed521d0 @index=1, @info={:name=>"body", :id=>nil}, @type=:element>,
|
|
20
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed52158 @index=0, @info={:name=>"nav", :id=>"toc"}, @type=:element>,
|
|
21
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed52108 @index=1, @info={:name=>"ol", :id=>"tocList"}, @type=:element>,
|
|
22
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed52090 @index=0, @info={:name=>"li", :id=>"np-313"}, @type=:element>,
|
|
23
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed52040 @index=1, @info={:name=>"ol", :id=>nil}, @type=:element>,
|
|
24
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed51ff0 @index=1, @info={:name=>"li", :id=>"np-317"}, @type=:element>,
|
|
25
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed51f78 @index=0, @info={:name=>"a", :id=>nil}, @type=:element>,
|
|
26
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed51f28 @index=0, @info={}, @type=:text>],
|
|
27
|
+
# @start_steps=[#<EPUB::Searcher::Result::Step:0x007f938ed51e88 @index=0, @info={}, @type=:character>]>,
|
|
28
|
+
# #<EPUB::Searcher::Result:0x007f938ef8f5d8
|
|
29
|
+
# @end_steps=[#<EPUB::Searcher::Result::Step:0x007f938ef8f808 @index=12, @info={}, @type=:character>],
|
|
30
30
|
# @parent_steps=
|
|
31
|
-
# [#<EPUB::Searcher::Result::Step:
|
|
32
|
-
#
|
|
33
|
-
#
|
|
34
|
-
#
|
|
35
|
-
#
|
|
36
|
-
#
|
|
37
|
-
#
|
|
38
|
-
# @start_steps=[#<EPUB::Searcher::Result::Step:
|
|
31
|
+
# [#<EPUB::Searcher::Result::Step:0x007f938f1c1e78 @index=2, @info={:name=>"spine", :id=>nil}, @type=:element>,
|
|
32
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ed51730 @index=2, @info={:id=>nil}, @type=:itemref>,
|
|
33
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ef8fce0 @index=1, @info={:name=>"body", :id=>nil}, @type=:element>,
|
|
34
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ef8fc90 @index=0, @info={:name=>"section", :id=>"pgepubid00492"}, @type=:element>,
|
|
35
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ef8fc40 @index=3, @info={:name=>"section", :id=>"pgepubid00498"}, @type=:element>,
|
|
36
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ef8fbf0 @index=1, @info={:name=>"h3", :id=>nil}, @type=:element>,
|
|
37
|
+
# #<EPUB::Searcher::Result::Step:0x007f938ef8fb28 @index=0, @info={}, @type=:text>],
|
|
38
|
+
# @start_steps=[#<EPUB::Searcher::Result::Step:0x007f938ef8fa88 @index=0, @info={}, @type=:character>]>]
|
|
39
39
|
puts results.collect(&:to_cfi_s)
|
|
40
|
-
# /6/4!/4/2/4/2/4/4/2/1,:0,:12
|
|
41
|
-
# /6/6!/4/2/8/4/1,:0,:12
|
|
40
|
+
# /6/4!/4/2[toc]/4[tocList]/2[np-313]/4/4[np-317]/2/1,:0,:12
|
|
41
|
+
# /6/6!/4/2[pgepubid00492]/8[pgepubid00498]/4/1,:0,:12
|
|
42
42
|
# => nil
|
|
43
43
|
|
|
44
44
|
Search result
|
|
@@ -46,10 +46,26 @@ Search result
|
|
|
46
46
|
|
|
47
47
|
Search result is an array of {EPUB::Searcher::Result} and it may be converted to an EPUBCFI string by {EPUB::Searcher::Result#to_cfi_s}.
|
|
48
48
|
|
|
49
|
+
Seamless XHTML Searcher
|
|
50
|
+
-----------------------
|
|
51
|
+
|
|
52
|
+
Now default searcher for XHTML is *seamless* searcher, which ignores tags when searching.
|
|
53
|
+
|
|
54
|
+
You can search words 'search word' from XHTML document below:
|
|
55
|
+
|
|
56
|
+
<html>
|
|
57
|
+
<head>
|
|
58
|
+
<title>Sample document</title>
|
|
59
|
+
</head>
|
|
60
|
+
<body>
|
|
61
|
+
<p><em>search</em> word</p>
|
|
62
|
+
</body>
|
|
63
|
+
</html>
|
|
64
|
+
|
|
49
65
|
Restricted XHTML Searcher
|
|
50
66
|
-------------------------
|
|
51
67
|
|
|
52
|
-
|
|
68
|
+
You can also use *restricted* searcher, which means that it can search from only single elements. For instance, it can find 'search word' from XHTML document below:
|
|
53
69
|
|
|
54
70
|
<html>
|
|
55
71
|
<head>
|
|
@@ -72,3 +88,7 @@ But cannot from document below:
|
|
|
72
88
|
</html>
|
|
73
89
|
|
|
74
90
|
because the words 'search' and 'word' are not in the same element.
|
|
91
|
+
|
|
92
|
+
To use restricted searcher, specify `algorithm` option for `search` method:
|
|
93
|
+
|
|
94
|
+
results = EPUB::Searcher.search(epub, search_word, algorithm: :restricted)
|
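The Searcher.markdown text above contrasts the seamless searcher (tags are ignored) with the restricted one (a match must sit inside a single element). The difference can be sketched without any XML library. The nested-array tree and the helper names below are assumptions made for illustration; they are not part of the gem.

```ruby
# Restricted idea: test every text node on its own.
# An element is [name, children]; a String child is a text node.
def restricted_match?(node, word)
  node[1].any? do |child|
    child.is_a?(String) ? child.include?(word) : restricted_match?(child, word)
  end
end

# Seamless idea: join the text of all descendants, ignoring tags, then search.
def flatten_text(node)
  node[1].map { |child| child.is_a?(String) ? child : flatten_text(child) }.join
end

def seamless_match?(node, word)
  flatten_text(node).include?(word)
end

# Hypothetical miniature of <p><em>search</em> word</p>.
doc = ['p', [['em', ['search']], ' word']]

puts restricted_match?(doc, 'search word') # false: the words are in different nodes
puts seamless_match?(doc, 'search word')   # true: the joined text contains the phrase
```

The sketch only answers whether a match exists; the gem additionally records the steps needed to build an EPUB CFI for each hit.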
data/epub-parser.gemspec
CHANGED
|
@@ -11,8 +11,7 @@ Gem::Specification.new do |s|
|
|
|
11
11
|
s.summary = %q{EPUB 3 Parser}
|
|
12
12
|
s.description = %q{Parse EPUB 3 book loosely}
|
|
13
13
|
s.license = 'MIT'
|
|
14
|
-
|
|
15
|
-
# s.rubyforge_project = "epub-parser"
|
|
14
|
+
s.required_ruby_version = '> 2'
|
|
16
15
|
|
|
17
16
|
s.files = `git ls-files`.split("\n")
|
|
18
17
|
.push('test/fixtures/book/OPS/ルートファイル.opf')
|
|
@@ -38,13 +37,11 @@ Gem::Specification.new do |s|
|
|
|
38
37
|
s.add_development_dependency 'gem-man'
|
|
39
38
|
s.add_development_dependency 'ronn'
|
|
40
39
|
s.add_development_dependency 'epzip'
|
|
41
|
-
s.add_development_dependency 'epubcheck'
|
|
42
|
-
s.add_development_dependency 'epub_validator'
|
|
43
40
|
s.add_development_dependency 'aruba'
|
|
44
41
|
|
|
45
|
-
s.add_runtime_dependency 'enumerabler'
|
|
46
42
|
s.add_runtime_dependency 'zipruby'
|
|
47
43
|
s.add_runtime_dependency 'nokogiri', '~> 1.6'
|
|
44
|
+
s.add_runtime_dependency 'nokogumbo'
|
|
48
45
|
s.add_runtime_dependency 'addressable', '>= 2.3.5'
|
|
49
|
-
s.add_runtime_dependency 'rchardet'
|
|
46
|
+
s.add_runtime_dependency 'rchardet', '< 1.6'
|
|
50
47
|
end
|
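The gemspec hunk above adds two version constraints, `'> 2'` for Ruby and `'< 1.6'` for rchardet. As a rough illustration of how such requirement strings compare against plain version strings, here is a sketch using RubyGems' own `Gem::Requirement` and `Gem::Version`. The version numbers tested are arbitrary examples, and the sketch does not model how RubyGems derives the running interpreter's version.

```ruby
require 'rubygems' # normally loaded already; explicit here for clarity

ruby_requirement = Gem::Requirement.new('> 2')
rchardet_requirement = Gem::Requirement.new('< 1.6')

puts ruby_requirement.satisfied_by?(Gem::Version.new('2.1.0'))     # true
puts ruby_requirement.satisfied_by?(Gem::Version.new('2.0.0'))     # false: "2.0.0" compares equal to "2"
puts ruby_requirement.satisfied_by?(Gem::Version.new('1.9.3'))     # false
puts rchardet_requirement.satisfied_by?(Gem::Version.new('1.5.0')) # true
puts rchardet_requirement.satisfied_by?(Gem::Version.new('1.6.0')) # false
```

A strict `>` treats a bare `2.0.0` the same as `2`, so a `>=` constraint may be closer to the "Ruby 2.0.0 or later" wording in the README, depending on what the author intended.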
data/lib/epub.rb
CHANGED
|
@@ -3,12 +3,3 @@ require 'epub/ocf'
|
|
|
3
3
|
require 'epub/publication'
|
|
4
4
|
require 'epub/content_document'
|
|
5
5
|
require 'epub/book/features'
|
|
6
|
-
|
|
7
|
-
module EPUB
|
|
8
|
-
class << self
|
|
9
|
-
def included(base)
|
|
10
|
-
warn 'Including EPUB module is deprecated. Include EPUB::Book::Features instead.'
|
|
11
|
-
base.__send__ :include, EPUB::Book::Features
|
|
12
|
-
end
|
|
13
|
-
end
|
|
14
|
-
end
|
data/lib/epub/book/features.rb
CHANGED
|
@@ -17,7 +17,7 @@ module EPUB
|
|
|
17
17
|
end
|
|
18
18
|
end
|
|
19
19
|
|
|
20
|
-
%w[
|
|
20
|
+
%w[title main_title subtitle short_title collection_title edition_title extended_title description date unique_identifier].each do |met|
|
|
21
21
|
define_method met do
|
|
22
22
|
metadata.__send__(met)
|
|
23
23
|
end
|
|
@@ -25,7 +25,7 @@ module EPUB
|
|
|
25
25
|
|
|
26
26
|
%w[nav].each do |met|
|
|
27
27
|
define_method met do
|
|
28
|
-
manifest.__send__
|
|
28
|
+
manifest.__send__(met)
|
|
29
29
|
end
|
|
30
30
|
end
|
|
31
31
|
|
|
@@ -39,7 +39,7 @@ module EPUB
|
|
|
39
39
|
if block_given?
|
|
40
40
|
enum.each &blk
|
|
41
41
|
else
|
|
42
|
-
enum
|
|
42
|
+
enum.each
|
|
43
43
|
end
|
|
44
44
|
end
|
|
45
45
|
|
data/lib/epub/content_document/navigation.rb
CHANGED
|
@@ -9,20 +9,20 @@ module EPUB
|
|
|
9
9
|
end
|
|
10
10
|
|
|
11
11
|
def toc
|
|
12
|
-
navigations.
|
|
12
|
+
navigations.find(&:toc?)
|
|
13
13
|
end
|
|
14
14
|
|
|
15
15
|
def page_list
|
|
16
|
-
navigations.
|
|
16
|
+
navigations.find(&:page_list?)
|
|
17
17
|
end
|
|
18
18
|
|
|
19
19
|
def landmarks
|
|
20
|
-
navigations.
|
|
20
|
+
navigations.find(&:landmarks?)
|
|
21
21
|
end
|
|
22
22
|
|
|
23
23
|
# Enumerator version of toc
|
|
24
|
-
# Usage: nagivation.enum_for(:contents)
|
|
25
24
|
def contents
|
|
25
|
+
enum_for(:each_content).to_a
|
|
26
26
|
end
|
|
27
27
|
|
|
28
28
|
# Enumerator version of page_list
|
|
@@ -30,8 +30,13 @@ module EPUB
|
|
|
30
30
|
def pages
|
|
31
31
|
end
|
|
32
32
|
|
|
33
|
+
# @todo Enumerator version of landmarks
|
|
34
|
+
|
|
33
35
|
# iterator for #toc
|
|
34
36
|
def each_content
|
|
37
|
+
toc.traverse do |content, _|
|
|
38
|
+
yield content
|
|
39
|
+
end
|
|
35
40
|
end
|
|
36
41
|
|
|
37
42
|
# iterator for #page_list
|
|
@@ -89,6 +94,12 @@ module EPUB
|
|
|
89
94
|
alias navigations= items=
|
|
90
95
|
alias heading text
|
|
91
96
|
alias heading= text=
|
|
97
|
+
|
|
98
|
+
%w[toc page_list landmarks].each do |type|
|
|
99
|
+
define_method "#{type}?" do
|
|
100
|
+
type == Type.const_get(type.upcase)
|
|
101
|
+
end
|
|
102
|
+
end
|
|
92
103
|
end
|
|
93
104
|
|
|
94
105
|
class ItemList < Array
|
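The navigation hunks above replace the enumerabler calls with `navigations.find(&:toc?)` and generate the `toc?`, `page_list?` and `landmarks?` predicates with `define_method`. A minimal sketch of that combination follows. `SketchType` and `SketchNav` are hypothetical, and the constant values are assumed, since the gem's `Type` module is not shown in this diff.

```ruby
# Hypothetical constants (assumption: the gem's real Type module is not shown here).
module SketchType
  TOC = 'toc'
  PAGE_LIST = 'page-list'
  LANDMARKS = 'landmarks'
end

class SketchNav
  attr_accessor :type

  %w[toc page_list landmarks].each do |kind|
    define_method("#{kind}?") do
      # self.type is the attribute; kind is the block variable.
      self.type == SketchType.const_get(kind.upcase)
    end
  end
end

toc = SketchNav.new.tap { |nav| nav.type = 'toc' }
landmarks = SketchNav.new.tap { |nav| nav.type = 'landmarks' }
navigations = [landmarks, toc]

puts navigations.find(&:toc?).equal?(toc)     # true
puts navigations.find(&:page_list?).inspect   # nil
```

The block variable is called `kind` here so that it cannot be confused with the `type` attribute: inside a block, a parameter named `type` takes precedence over a method of the same name. In the hunk above the block parameter is itself named `type`, so the comparison there reads the block parameter rather than the attribute; whether that is intended is not clear from this diff.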
data/lib/epub/parser/content_document.rb
CHANGED
|
@@ -87,7 +87,7 @@ module EPUB
|
|
|
87
87
|
item.text = extract_attribute(a_or_span, 'title').to_s if item.text.nil? || item.text.empty?
|
|
88
88
|
end
|
|
89
89
|
item.href = Addressable::URI.parse(extract_attribute(a_or_span, 'href'))
|
|
90
|
-
item.item = @item.manifest.items.
|
|
90
|
+
item.item = @item.manifest.items.find {|it| it.href.request_uri == item.href.request_uri}
|
|
91
91
|
end
|
|
92
92
|
item.items = element.xpath('./xhtml:ol[1]/xhtml:li', EPUB::NAMESPACES).map {|li| parse_navigation_item(li)}
|
|
93
93
|
|
data/lib/epub/parser/publication.rb
CHANGED
|
@@ -55,7 +55,7 @@ module EPUB
|
|
|
55
55
|
}
|
|
56
56
|
metadata.titles = extract_model(elem, id_map, './dc:title', :Title)
|
|
57
57
|
metadata.languages = extract_model(elem, id_map, './dc:language', :DCMES, %w[id])
|
|
58
|
-
%w[
|
|
58
|
+
%w[contributor coverage creator date description format publisher relation source subject type].each do |dcmes|
|
|
59
59
|
metadata.__send__ "#{dcmes}s=", extract_model(elem, id_map, "./dc:#{dcmes}")
|
|
60
60
|
end
|
|
61
61
|
metadata.rights = extract_model(elem, id_map, './dc:rights')
|
|
@@ -82,7 +82,7 @@ module EPUB
|
|
|
82
82
|
fallback_map = {}
|
|
83
83
|
elem.xpath('./opf:item', EPUB::NAMESPACES).each do |e|
|
|
84
84
|
item = EPUB::Publication::Package::Manifest::Item.new
|
|
85
|
-
%w[
|
|
85
|
+
%w[id media-type media-overlay].each do |attr|
|
|
86
86
|
item.__send__ "#{attr.gsub(/-/, '_')}=", extract_attribute(e, attr)
|
|
87
87
|
end
|
|
88
88
|
item.href = Addressable::URI.parse(extract_attribute(e, 'href'))
|
|
@@ -102,13 +102,13 @@ module EPUB
|
|
|
102
102
|
def parse_spine
|
|
103
103
|
spine = @package.spine = EPUB::Publication::Package::Spine.new
|
|
104
104
|
elem = @doc.xpath('/opf:package/opf:spine', EPUB::NAMESPACES).first
|
|
105
|
-
%w[
|
|
105
|
+
%w[id toc page-progression-direction].each do |attr|
|
|
106
106
|
spine.__send__ "#{attr.gsub(/-/, '_')}=", extract_attribute(elem, attr)
|
|
107
107
|
end
|
|
108
108
|
|
|
109
109
|
elem.xpath('./opf:itemref', EPUB::NAMESPACES).each do |e|
|
|
110
110
|
itemref = EPUB::Publication::Package::Spine::Itemref.new
|
|
111
|
-
%w[
|
|
111
|
+
%w[idref id].each do |attr|
|
|
112
112
|
itemref.__send__ "#{attr}=", extract_attribute(e, attr)
|
|
113
113
|
end
|
|
114
114
|
itemref.linear = (extract_attribute(e, 'linear') != 'no')
|
|
@@ -124,7 +124,7 @@ module EPUB
|
|
|
124
124
|
guide = @package.guide = EPUB::Publication::Package::Guide.new
|
|
125
125
|
@doc.xpath('/opf:package/opf:guide/opf:reference', EPUB::NAMESPACES).each do |ref|
|
|
126
126
|
reference = EPUB::Publication::Package::Guide::Reference.new
|
|
127
|
-
%w[
|
|
127
|
+
%w[type title].each do |attr|
|
|
128
128
|
reference.__send__ "#{attr}=", extract_attribute(ref, attr)
|
|
129
129
|
end
|
|
130
130
|
reference.href = Addressable::URI.parse(extract_attribute(ref, 'href'))
|
data/lib/epub/parser/version.rb
CHANGED
data/lib/epub/publication/package/guide.rb
CHANGED
|
@@ -1,5 +1,3 @@
|
|
|
1
|
-
require 'enumerabler'
|
|
2
|
-
|
|
3
1
|
module EPUB
|
|
4
2
|
module Publication
|
|
5
3
|
class Package
|
|
@@ -29,9 +27,9 @@ module EPUB
|
|
|
29
27
|
return @item if @item
|
|
30
28
|
|
|
31
29
|
request_uri = href.request_uri
|
|
32
|
-
@item = @guide.package.manifest.items.
|
|
30
|
+
@item = @guide.package.manifest.items.find {|item|
|
|
33
31
|
item.href.request_uri == request_uri
|
|
34
|
-
|
|
32
|
+
}
|
|
35
33
|
end
|
|
36
34
|
end
|
|
37
35
|
|
|
@@ -41,7 +39,7 @@ module EPUB
|
|
|
41
39
|
var = instance_variable_get "@#{method_name}"
|
|
42
40
|
return var if var
|
|
43
41
|
|
|
44
|
-
var = references.
|
|
42
|
+
var = references.find {|ref| ref.type == type}
|
|
45
43
|
instance_variable_set "@#{method_name}", var
|
|
46
44
|
end
|
|
47
45
|
end
|
data/lib/epub/publication/package/manifest.rb
CHANGED
|
@@ -1,5 +1,4 @@
|
|
|
1
1
|
require 'set'
|
|
2
|
-
require 'enumerabler'
|
|
3
2
|
require 'rchardet'
|
|
4
3
|
require 'epub/constants'
|
|
5
4
|
require 'epub/parser/content_document'
|
|
@@ -24,8 +23,18 @@ module EPUB
|
|
|
24
23
|
self
|
|
25
24
|
end
|
|
26
25
|
|
|
26
|
+
def each_nav
|
|
27
|
+
if block_given?
|
|
28
|
+
each_item do |item|
|
|
29
|
+
yield item if item.nav?
|
|
30
|
+
end
|
|
31
|
+
else
|
|
32
|
+
each_item.lazy.select(&:nav?)
|
|
33
|
+
end
|
|
34
|
+
end
|
|
35
|
+
|
|
27
36
|
def navs
|
|
28
|
-
items.
|
|
37
|
+
items.select(&:nav?)
|
|
29
38
|
end
|
|
30
39
|
|
|
31
40
|
def nav
|
|
@@ -33,12 +42,16 @@ module EPUB
|
|
|
33
42
|
end
|
|
34
43
|
|
|
35
44
|
def cover_image
|
|
36
|
-
items.
|
|
45
|
+
items.select(&:cover_image?).first
|
|
37
46
|
end
|
|
38
47
|
|
|
39
48
|
def each_item
|
|
40
|
-
|
|
41
|
-
|
|
49
|
+
if block_given?
|
|
50
|
+
@items.each_value do |item|
|
|
51
|
+
yield item
|
|
52
|
+
end
|
|
53
|
+
else
|
|
54
|
+
@items.each_value
|
|
42
55
|
end
|
|
43
56
|
end
|
|
44
57
|
|
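The manifest hunks above give `each_item` and the new `each_nav` two calling styles: with a block they iterate, without one they return an enumerator (lazy in the case of `each_nav`). The sketch below reproduces that pattern in isolation. `SketchItem` and `SketchManifest` are hypothetical stand-ins whose method bodies mirror the hunk; they are not the gem's classes.

```ruby
# Hypothetical item with just enough behaviour for the sketch.
SketchItem = Struct.new(:id, :properties) do
  def nav?
    properties.include?('nav')
  end
end

class SketchManifest
  def initialize
    @items = {}
  end

  def <<(item)
    @items[item.id] = item
    self
  end

  # With a block: iterate. Without: hand back an Enumerator.
  def each_item
    if block_given?
      @items.each_value { |item| yield item }
    else
      @items.each_value
    end
  end

  # With a block: yield only nav items. Without: a lazy, filtered enumerator.
  def each_nav
    if block_given?
      each_item { |item| yield item if item.nav? }
    else
      each_item.lazy.select(&:nav?)
    end
  end
end

manifest = SketchManifest.new
manifest << SketchItem.new('cover', ['cover-image']) << SketchItem.new('nav', ['nav'])

manifest.each_nav { |item| puts item.id } # block form
puts manifest.each_nav.first.id           # enumerator form, evaluated lazily
```

Returning a lazy enumerator means a caller that only needs the first navigation item does not have to filter the whole manifest.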
data/lib/epub/publication/package/metadata.rb
CHANGED
|
@@ -30,7 +30,7 @@ module EPUB
|
|
|
30
30
|
titles.sort.join("\n")
|
|
31
31
|
end
|
|
32
32
|
|
|
33
|
-
%w[
|
|
33
|
+
%w[main short collection edition extended].each do |type|
|
|
34
34
|
define_method "#{type}_title" do
|
|
35
35
|
titles.select {|title| title.title_type.to_s == type}.sort.join(' ')
|
|
36
36
|
end
|
|
@@ -41,7 +41,7 @@ module EPUB
|
|
|
41
41
|
end
|
|
42
42
|
|
|
43
43
|
def description
|
|
44
|
-
descriptions.join
|
|
44
|
+
descriptions.join(' ')
|
|
45
45
|
end
|
|
46
46
|
|
|
47
47
|
def date
|
|
@@ -64,7 +64,7 @@ module EPUB
|
|
|
64
64
|
end
|
|
65
65
|
|
|
66
66
|
module Refinee
|
|
67
|
-
PROPERTIES = %w[
|
|
67
|
+
PROPERTIES = %w[alternate-script display-seq file-as group-position identifier-type meta-auth role title-type]
|
|
68
68
|
|
|
69
69
|
attr_writer :refiners
|
|
70
70
|
|
|
@@ -76,7 +76,7 @@ module EPUB
|
|
|
76
76
|
met = voc.gsub(/-/, '_')
|
|
77
77
|
attr_writer met
|
|
78
78
|
define_method met do
|
|
79
|
-
refiners.
|
|
79
|
+
refiners.find {|refiner| refiner.property == voc}
|
|
80
80
|
end
|
|
81
81
|
end
|
|
82
82
|
end
|
data/lib/epub/searcher.rb
CHANGED
|
@@ -1,3 +1,13 @@
|
|
|
1
1
|
require 'epub/searcher/result'
|
|
2
2
|
require 'epub/searcher/publication'
|
|
3
3
|
require 'epub/searcher/xhtml'
|
|
4
|
+
|
|
5
|
+
module EPUB
|
|
6
|
+
module Searcher
|
|
7
|
+
class << self
|
|
8
|
+
def search(epub, word, **options)
|
|
9
|
+
Publication.search(epub.package, word, options)
|
|
10
|
+
end
|
|
11
|
+
end
|
|
12
|
+
end
|
|
13
|
+
end
|
data/lib/epub/searcher/publication.rb
CHANGED
|
@@ -4,8 +4,9 @@ module EPUB
|
|
|
4
4
|
module Searcher
|
|
5
5
|
class Publication
|
|
6
6
|
class << self
|
|
7
|
-
|
|
8
|
-
|
|
7
|
+
# @todo Use named argument in the future
|
|
8
|
+
def search(package, word, **options)
|
|
9
|
+
new(word).search(package, options)
|
|
9
10
|
end
|
|
10
11
|
end
|
|
11
12
|
|
|
@@ -13,14 +14,15 @@ module EPUB
|
|
|
13
14
|
@word = word
|
|
14
15
|
end
|
|
15
16
|
|
|
16
|
-
|
|
17
|
+
# @todo Use named argument in the future
|
|
18
|
+
def search(package, algorithm: :seamless)
|
|
17
19
|
results = []
|
|
18
20
|
|
|
19
21
|
spine = package.spine
|
|
20
22
|
spine_step = Result::Step.new(:element, 2, {:name => 'spine', :id => spine.id})
|
|
21
23
|
spine.each_itemref.with_index do |itemref, index|
|
|
22
24
|
itemref_step = Result::Step.new(:itemref, index, {:id => itemref.id})
|
|
23
|
-
XHTML::
|
|
25
|
+
XHTML::ALGORITHMS[algorithm].search(Nokogiri.XML(itemref.item.read), @word).each do |sub_result|
|
|
24
26
|
results << Result.new([spine_step, itemref_step] + sub_result.parent_steps, sub_result.start_steps, sub_result.end_steps)
|
|
25
27
|
end
|
|
26
28
|
end
|
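The searcher hunks above select a search class through `XHTML::ALGORITHMS[algorithm]` and pass the `algorithm` option down from `EPUB::Searcher.search`. A minimal sketch of such a registry with keyword forwarding is shown here. `SketchSearcher`, its two strategy classes and the `dispatch` helper are hypothetical, and the "document" is just a list of text segments.

```ruby
module SketchSearcher
  # Registry mapping an option value to a strategy class.
  ALGORITHMS = {}

  # Matches only when one segment contains the whole word.
  class Restricted
    def self.search(segments, word)
      segments.select { |segment| segment.include?(word) }
    end
  end
  ALGORITHMS[:restricted] = Restricted

  # Matches against the concatenation of all segments.
  class Seamless
    def self.search(segments, word)
      joined = segments.join
      joined.include?(word) ? [joined] : []
    end
  end
  ALGORITHMS[:seamless] = Seamless

  # Keyword options are forwarded with ** so they stay keywords.
  def self.search(segments, word, **options)
    dispatch(segments, word, **options)
  end

  def self.dispatch(segments, word, algorithm: :seamless)
    ALGORITHMS.fetch(algorithm).search(segments, word)
  end
end

segments = ['search', ' word']
p SketchSearcher.search(segments, 'search word')
p SketchSearcher.search(segments, 'search word', algorithm: :restricted)
```

The hunk passes `options` as a positional hash, which worked on the Ruby 2.x versions this release targets. From Ruby 3.0 a positional hash is no longer converted to keyword arguments, which is why the sketch forwards with `**options`. `fetch` is used so that an unknown algorithm raises `KeyError` instead of calling `search` on `nil`.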
data/lib/epub/searcher/result.rb
CHANGED
|
@@ -1,6 +1,37 @@
|
|
|
1
1
|
module EPUB
|
|
2
2
|
module Searcher
|
|
3
3
|
class Result
|
|
4
|
+
class << self
|
|
5
|
+
# @example
|
|
6
|
+
# Result.aggregate_step_intersection([a, b, c], [a, b, d]) # => [[a, b], [c], [d]]
|
|
7
|
+
# @example
|
|
8
|
+
# Result.aggregate_step_intersection([a, b, c], [a, d, c]) # => [[a], [b, c], [d, c]]
|
|
9
|
+
# # Note that c here is not included in the first element of returned value.
|
|
10
|
+
# @param steps1 [Array<Step>, Array<Array>]
|
|
11
|
+
# @param steps2 [Array<Step>, Array<Array>]
|
|
12
|
+
# @return [Array<Array<Array>>] Thee arrays:
|
|
13
|
+
# 1. "intersection" of +steps1+ and +steps2+. "intersection" here is not the term of mathmatics
|
|
14
|
+
# 2. remaining steps of +steps1+
|
|
15
|
+
# 3. remaining steps of +steps2+
|
|
16
|
+
def aggregate_step_intersection(steps1, steps2)
|
|
17
|
+
intersection = []
|
|
18
|
+
steps1_remaining = []
|
|
19
|
+
steps2_remaining = []
|
|
20
|
+
broken = false
|
|
21
|
+
steps1.zip steps2 do |step1, step2|
|
|
22
|
+
broken = true unless step1 && step2 && step1 == step2
|
|
23
|
+
if broken
|
|
24
|
+
steps1_remaining << step1 unless step1.nil?
|
|
25
|
+
steps2_remaining << step2 unless step2.nil?
|
|
26
|
+
else
|
|
27
|
+
intersection << step1
|
|
28
|
+
end
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
[intersection, steps1_remaining, steps2_remaining]
|
|
32
|
+
end
|
|
33
|
+
end
|
|
34
|
+
|
|
4
35
|
attr_reader :parent_steps, :start_steps, :end_steps
|
|
5
36
|
|
|
6
37
|
# @param parent_steps [Array<Step>] common steps between start and end
|
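The `aggregate_step_intersection` method added above splits two step lists into a shared prefix and two remainders. To make the `@example` comments concrete, here is a standalone copy of the same logic applied to plain symbols instead of `Step` objects (the symbols are an assumption made for brevity).

```ruby
# Standalone copy of the logic in the hunk above, applied to plain symbols.
def aggregate_step_intersection(steps1, steps2)
  intersection = []
  steps1_remaining = []
  steps2_remaining = []
  broken = false
  steps1.zip(steps2) do |step1, step2|
    # Once the lists diverge, everything after that point is "remaining".
    broken = true unless step1 && step2 && step1 == step2
    if broken
      steps1_remaining << step1 unless step1.nil?
      steps2_remaining << step2 unless step2.nil?
    else
      intersection << step1
    end
  end
  [intersection, steps1_remaining, steps2_remaining]
end

p aggregate_step_intersection([:a, :b, :c], [:a, :b, :d]) # => [[:a, :b], [:c], [:d]]
p aggregate_step_intersection([:a, :b, :c], [:a, :d, :c]) # => [[:a], [:b, :c], [:d, :c]]
p aggregate_step_intersection([:a], [:a, :b])             # => [[:a], [], []]
```

The third call shows a property of `Array#zip`: it iterates over the receiver's length, so trailing elements of a longer second list are never visited. Whether callers in the gem can reach that case is not shown in this diff.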
data/lib/epub/searcher/xhtml.rb
CHANGED
|
@@ -4,36 +4,38 @@ require 'epub/parser/utils'
|
|
|
4
4
|
module EPUB
|
|
5
5
|
module Searcher
|
|
6
6
|
class XHTML
|
|
7
|
-
|
|
8
|
-
class << self
|
|
9
|
-
# @param element [Nokogiri::XML::Element, Nokogiri::XML::Document]
|
|
10
|
-
# @param word [String]
|
|
11
|
-
# @return [Array<Result>]
|
|
12
|
-
def search(element, word)
|
|
13
|
-
new(word).search(element.respond_to?(:root) ? element.root : element)
|
|
14
|
-
end
|
|
15
|
-
end
|
|
7
|
+
ALGORITHMS = {}
|
|
16
8
|
|
|
9
|
+
class << self
|
|
10
|
+
# @param element [Nokogiri::XML::Element, Nokogiri::XML::Document]
|
|
17
11
|
# @param word [String]
|
|
18
|
-
|
|
19
|
-
|
|
12
|
+
# @return [Array<Result>]
|
|
13
|
+
def search(element, word)
|
|
14
|
+
new(element.respond_to?(:root) ? element.root : element).search(word)
|
|
20
15
|
end
|
|
16
|
+
end
|
|
17
|
+
|
|
18
|
+
# @param word [String]
|
|
19
|
+
def initialize(element)
|
|
20
|
+
@element = element
|
|
21
|
+
end
|
|
21
22
|
|
|
23
|
+
class Restricted < self
|
|
22
24
|
# @param element [Nokogiri::XML::Element]
|
|
23
25
|
# @return [Array<Result>]
|
|
24
|
-
def search(element)
|
|
26
|
+
def search(word, element=nil)
|
|
25
27
|
results = []
|
|
26
28
|
|
|
27
29
|
elem_index = 0
|
|
28
|
-
element.children.each do |child|
|
|
30
|
+
(element || @element).children.each do |child|
|
|
29
31
|
if child.element?
|
|
30
32
|
child_step = Result::Step.new(:element, elem_index, {:name => child.name, :id => Parser::Utils.extract_attribute(child, 'id')})
|
|
31
33
|
if child.name == 'img'
|
|
32
|
-
if Parser::Utils.extract_attribute(child, 'alt').index(
|
|
34
|
+
if Parser::Utils.extract_attribute(child, 'alt').index(word)
|
|
33
35
|
results << Result.new([child_step], nil, nil)
|
|
34
36
|
end
|
|
35
37
|
else
|
|
36
|
-
search(child).each do |sub_result|
|
|
38
|
+
search(word, child).each do |sub_result|
|
|
37
39
|
results << Result.new([child_step] + sub_result.parent_steps, sub_result.start_steps, sub_result.end_steps)
|
|
38
40
|
end
|
|
39
41
|
end
|
|
@@ -42,8 +44,8 @@ module EPUB
|
|
|
42
44
|
text_index = elem_index
|
|
43
45
|
char_index = 0
|
|
44
46
|
text_step = Result::Step.new(:text, text_index)
|
|
45
|
-
while char_index = child.text.index(
|
|
46
|
-
results << Result.new([text_step], [Result::Step.new(:character, char_index)], [Result::Step.new(:character, char_index +
|
|
47
|
+
while char_index = child.text.index(word, char_index)
|
|
48
|
+
results << Result.new([text_step], [Result::Step.new(:character, char_index)], [Result::Step.new(:character, char_index + word.length)])
|
|
47
49
|
char_index += 1
|
|
48
50
|
end
|
|
49
51
|
end
|
|
@@ -52,6 +54,100 @@ module EPUB
|
|
|
52
54
|
results
|
|
53
55
|
end
|
|
54
56
|
end
|
|
57
|
+
ALGORITHMS[:restricted] = Restricted
|
|
58
|
+
|
|
59
|
+
class Seamless < self
|
|
60
|
+
def search(word)
|
|
61
|
+
unless @indices
|
|
62
|
+
@indices, @content = build_indices(@element)
|
|
63
|
+
end
|
|
64
|
+
visit(@indices, @content, word)
|
|
65
|
+
end
|
|
66
|
+
|
|
67
|
+
def build_indices(element)
|
|
68
|
+
indices = {}
|
|
69
|
+
content = ''
|
|
70
|
+
|
|
71
|
+
elem_index = 0
|
|
72
|
+
element.children.each do |child|
|
|
73
|
+
if child.element?
|
|
74
|
+
child_step = [:element, elem_index, {:name => child.name, :id => Parser::Utils.extract_attribute(child, 'id')}]
|
|
75
|
+
elem_index += 1
|
|
76
|
+
if child.name == 'img'
|
|
77
|
+
alt = Parser::Utils.extract_attribute(child, 'alt')
|
|
78
|
+
next if alt.nil? || alt.empty?
|
|
79
|
+
indices[content.length] = [child_step]
|
|
80
|
+
content << alt
|
|
81
|
+
else
|
|
82
|
+
# TODO: Consider block level elements
|
|
83
|
+
content_length = content.length
|
|
84
|
+
sub_indices, sub_content = build_indices(child)
|
|
85
|
+
sub_indices.each_pair do |sub_pos, child_steps|
|
|
86
|
+
indices[content_length + sub_pos] = [child_step] + child_steps
|
|
87
|
+
end
|
|
88
|
+
content << sub_content
|
|
89
|
+
end
|
|
90
|
+
elsif child.text? || child.cdata?
|
|
91
|
+
text_index = elem_index
|
|
92
|
+
text_step = [:text, text_index]
|
|
93
|
+
indices[content.length] = [text_step]
|
|
94
|
+
content << child.content
|
|
95
|
+
end
|
|
96
|
+
end
|
|
97
|
+
|
|
98
|
+
[indices, content]
|
|
99
|
+
end
|
|
100
|
+
|
|
101
|
+
private
|
|
102
|
+
|
|
103
|
+
def visit(indices, content, word)
|
|
104
|
+
results = []
|
|
105
|
+
offsets = indices.keys
|
|
106
|
+
i = 0
|
|
107
|
+
while i = content.index(word, i)
|
|
108
|
+
offset = find_offset(offsets, i)
|
|
109
|
+
start_steps = to_result_steps(indices[offset])
|
|
110
|
+
last_step = start_steps.last
|
|
111
|
+
if last_step.info[:name] == 'img'
|
|
112
|
+
parent_steps = start_steps
|
|
113
|
+
start_steps = end_steps = nil
|
|
114
|
+
else
|
|
115
|
+
word_length = word.length
|
|
116
|
+
start_char_step = Result::Step.new(:character, i - offset)
|
|
117
|
+
end_offset = find_offset(offsets, i + word_length, true)
|
|
118
|
+
end_steps = to_result_steps(indices[end_offset])
|
|
119
|
+
end_char_step = Result::Step.new(:character, i + word_length - end_offset)
|
|
120
|
+
parent_steps, start_steps, end_steps = Result.aggregate_step_intersection(start_steps, end_steps)
|
|
121
|
+
start_steps << start_char_step
|
|
122
|
+
end_steps << end_char_step
|
|
123
|
+
end
|
|
124
|
+
results << Result.new(parent_steps, start_steps, end_steps)
|
|
125
|
+
i += 1
|
|
126
|
+
end
|
|
127
|
+
|
|
128
|
+
results
|
|
129
|
+
end
|
|
130
|
+
|
|
131
|
+
# Find max offset greater than or equal to index
|
|
132
|
+
# @param offsets [Array<Integer>] keys of indices
|
|
133
|
+
# @param index [Integer] position of search word in content string
|
|
134
|
+
# @todo: more efficient algorithm
|
|
135
|
+
def find_offset(offsets, index, for_end_position=false)
|
|
136
|
+
comparison_operator = for_end_position ? :< : :<=
|
|
137
|
+
l = offsets.length
|
|
138
|
+
offset_index = (0..l).bsearch {|i|
|
|
139
|
+
o = offsets[l - i]
|
|
140
|
+
next false unless o
|
|
141
|
+
o.send(comparison_operator, index)
|
|
142
|
+
}
|
|
143
|
+
offsets[l - offset_index]
|
|
144
|
+
end
|
|
145
|
+
|
|
146
|
+
def to_result_steps(steps)
|
|
147
|
+
steps.map {|step| Result::Step.new(*step)}
|
|
148
|
+
end
|
|
149
|
+
end
|
|
150
|
+
ALGORITHMS[:seamless] = Seamless
|
|
55
151
|
end
|
|
56
152
|
end
|
|
57
153
|
end
|
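The `Seamless` searcher above concatenates all text into one string, remembers at which character offset each text node starts, and uses `find_offset` to map a match position back to a node. The sketch below copies `find_offset` as a standalone method and runs it on a hand-built offset list. The offsets and content are assumed to correspond to `<p><em>search</em> word</p>`; they are not produced by the gem here.

```ruby
# Standalone copy of find_offset from the hunk above.
def find_offset(offsets, index, for_end_position = false)
  comparison_operator = for_end_position ? :< : :<=
  l = offsets.length
  # Walk from the largest offset downward and stop at the first one
  # that is <= index (or < index when locating an end position).
  offset_index = (0..l).bsearch { |i|
    o = offsets[l - i]
    next false unless o
    o.send(comparison_operator, index)
  }
  offsets[l - offset_index]
end

# Hypothetical index: "search" starts at 0, " word" starts at 6.
content = 'search word'
offsets = [0, 6]
word = 'search word'

i = content.index(word)
start_offset = find_offset(offsets, i)
end_offset = find_offset(offsets, i + word.length, true)

puts "start: node at #{start_offset}, char #{i - start_offset}"            # start: node at 0, char 0
puts "end: node at #{end_offset}, char #{i + word.length - end_offset}"    # end: node at 6, char 5
```

Read this way, the method returns the largest offset at or below the index (strictly below for an end position), which differs from the "greater than or equal to" wording of the comment in the hunk. The sketch assumes `offsets` is sorted in ascending order, as it would be when keys are inserted while the content string grows.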
data/test/test_content_document.rb
CHANGED
|
@@ -52,6 +52,27 @@ class TestContentDocument < Test::Unit::TestCase
|
|
|
52
52
|
end
|
|
53
53
|
|
|
54
54
|
class TestNavigationDocument < self
|
|
55
|
+
def test_toc_returns_nav_with_type_toc
|
|
56
|
+
navigation = Navigation.new
|
|
57
|
+
toc = Navigation::Navigation.new.tap {|nav| nav.type = 'toc'}
|
|
58
|
+
navigation.navigations << toc
|
|
59
|
+
|
|
60
|
+
assert_same toc, navigation.toc
|
|
61
|
+
end
|
|
62
|
+
|
|
63
|
+
def test_contents_returns_items_of_toc
|
|
64
|
+
manifest = EPUB::Publication::Package::Manifest.new
|
|
65
|
+
item = EPUB::Publication::Package::Manifest::Item.new
|
|
66
|
+
item.media_type = 'application/xhtml+xml'
|
|
67
|
+
item.properties = %w[nav]
|
|
68
|
+
item.href = Addressable::URI.parse('nav.xhtml')
|
|
69
|
+
stub(item).read {File.read(File.expand_path('../fixtures/book/OPS/nav.xhtml', __FILE__))}
|
|
70
|
+
manifest << item
|
|
71
|
+
nav_doc = EPUB::Parser::ContentDocument.new(item).parse
|
|
72
|
+
|
|
73
|
+
assert_equal ['Table of Contents', '一ページ目', '二ページ目', '第一節', '第二節', '第三節', '第四節'], nav_doc.contents.collect(&:text)
|
|
74
|
+
end
|
|
75
|
+
|
|
55
76
|
def test_item_hidden_returns_true_when_it_has_some_value
|
|
56
77
|
item = Navigation::Item.new.tap {|item| item.hidden = ''}
|
|
57
78
|
assert_true item.hidden?
|
data/test/test_inspect.rb
CHANGED
|
@@ -45,7 +45,7 @@ class TestInspect < Test::Unit::TestCase
|
|
|
45
45
|
title.content = 'Book Title'
|
|
46
46
|
@metadata.titles << title
|
|
47
47
|
|
|
48
|
-
title_pattern =
|
|
48
|
+
title_pattern = '@dc_titles=[#<EPUB::Publication::Package::Metadata::Title'
|
|
49
49
|
|
|
50
50
|
assert_match title_pattern, @metadata.inspect
|
|
51
51
|
end
|
data/test/test_publication.rb
CHANGED
|
@@ -23,14 +23,14 @@ class TestPublication < Test::Unit::TestCase
|
|
|
23
23
|
refiner = Package::Metadata::Meta.new
|
|
24
24
|
refinee = Package::Metadata::Meta.new
|
|
25
25
|
refiner.refines = refinee
|
|
26
|
-
assert_same refinee.refiners.first, refiner
|
|
26
|
+
assert_same refinee.refiners.first, refiner
|
|
27
27
|
end
|
|
28
28
|
|
|
29
29
|
def test_link_refines_setter_connect_refinee_to_the_link
|
|
30
30
|
refiner = Package::Metadata::Link.new
|
|
31
31
|
refinee = Package::Metadata::Meta.new
|
|
32
32
|
refiner.refines = refinee
|
|
33
|
-
assert_same refinee.refiners.first, refiner
|
|
33
|
+
assert_same refinee.refiners.first, refiner
|
|
34
34
|
end
|
|
35
35
|
|
|
36
36
|
def test_title_returns_extended_title_when_it_exists
|
|
@@ -184,6 +184,59 @@ class TestPublication < Test::Unit::TestCase
|
|
|
184
184
|
class TestManifest < TestPublication
|
|
185
185
|
include EPUB::Publication
|
|
186
186
|
|
|
187
|
+
def setup
|
|
188
|
+
@manifest = EPUB::Publication::Package::Manifest.new
|
|
189
|
+
@nav1 = EPUB::Publication::Package::Manifest::Item.new
|
|
190
|
+
@nav1.id = 'nav1'
|
|
191
|
+
@nav1.properties = %w[nav]
|
|
192
|
+
@nav2 = EPUB::Publication::Package::Manifest::Item.new
|
|
193
|
+
@nav2.id = 'nav2'
|
|
194
|
+
@nav2.properties = %w[nav]
|
|
195
|
+
@item = EPUB::Publication::Package::Manifest::Item.new
|
|
196
|
+
@item.id = 'item'
|
|
197
|
+
@cover_image = EPUB::Publication::Package::Manifest::Item.new
|
|
198
|
+
@cover_image.id = 'cover-image'
|
|
199
|
+
@cover_image.properties = %w[cover-image]
|
|
200
|
+
@manifest << @nav1 << @item << @nav2 << @cover_image
|
|
201
|
+
end
|
|
202
|
+
|
|
203
|
+
def test_each_item_returns_enumerator_when_no_block_given
|
|
204
|
+
assert_instance_of Enumerator, @manifest.each_item
|
|
205
|
+
end
|
|
206
|
+
|
|
207
|
+
def test_each_nav_iterates_over_items_with_nav_property
|
|
208
|
+
navs = [@nav1, @nav2]
|
|
209
|
+
i = 0
|
|
210
|
+
@manifest.each_nav do |nav|
|
|
211
|
+
assert_same navs[i], nav
|
|
212
|
+
i += 1
|
|
213
|
+
end
|
|
214
|
+
end
|
|
215
|
+
|
|
216
|
+
def test_each_nav_returns_iterable_object_when_no_block_given
|
|
217
|
+
navs = [@nav1, @nav2]
|
|
218
|
+
|
|
219
|
+
assert_respond_to @manifest.each_nav, :each
|
|
220
|
+
@manifest.each_nav.with_index do |nav, i|
|
|
221
|
+
assert_same navs[i], nav
|
|
222
|
+
end
|
|
223
|
+
end
|
|
224
|
+
|
|
225
|
+
def test_navs_iterates_over_items_with_nav_property
|
|
226
|
+
navs = [@nav1, @nav2]
|
|
227
|
+
@manifest.navs.each_with_index do |nav, i|
|
|
228
|
+
assert_same navs[i], nav
|
|
229
|
+
end
|
|
230
|
+
end
|
|
231
|
+
|
|
232
|
+
def test_nav_returns_first_item_with_nav_property
|
|
233
|
+
assert_same @nav1, @manifest.nav
|
|
234
|
+
end
|
|
235
|
+
|
|
236
|
+
def test_cover_image_returns_item_with_cover_image_property
|
|
237
|
+
assert_same @cover_image, @manifest.cover_image
|
|
238
|
+
end
|
|
239
|
+
|
|
187
240
|
class TestItem < TestManifest
|
|
188
241
|
def test_content_document_returns_nil_when_not_xhtml_nor_svg
|
|
189
242
|
item = EPUB::Publication::Package::Manifest::Item.new
|
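The new `TestManifest` cases describe iterators that return an `Enumerator` when called without a block. The following is a rough sketch of one way to get that behavior, not the library's actual implementation: `SketchItem`, `SketchManifest`, and `build_sketch_manifest` are hypothetical stand-ins for illustration.

```ruby
# Hypothetical stand-ins for the library's Manifest and Item classes.
SketchItem = Struct.new(:id, :properties)

class SketchManifest
  def initialize
    @items = {}
  end

  # Returns self so that calls can be chained, as in the setup method above.
  def <<(item)
    @items[item.id] = item
    self
  end

  def each_item
    return enum_for(__method__) unless block_given?
    @items.each_value { |item| yield item }
  end

  def each_nav
    return enum_for(__method__) unless block_given?
    each_item { |item| yield item if item.properties.include?('nav') }
  end

  def navs
    each_nav.to_a
  end

  def nav
    each_nav.first
  end

  def cover_image
    each_item.find { |item| item.properties.include?('cover-image') }
  end
end

# Mirrors the fixture built in the setup method above.
def build_sketch_manifest
  nav1  = SketchItem.new('nav1', %w[nav])
  item  = SketchItem.new('item', [])
  nav2  = SketchItem.new('nav2', %w[nav])
  cover = SketchItem.new('cover-image', %w[cover-image])
  SketchManifest.new << nav1 << item << nav2 << cover
end

manifest = build_sketch_manifest
manifest.each_nav.with_index { |nav, i| puts "#{i}: #{nav.id}" }
```

Returning `enum_for(__method__)` keeps a single definition for both the block and blockless forms, which is what lets `each_nav.with_index` work in the tests.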
data/test/test_searcher.rb
CHANGED
|
@@ -46,36 +46,61 @@ class TestSearcher < Test::Unit::TestCase
|
|
|
46
46
|
@nav = @doc.search('nav').first
|
|
47
47
|
end
|
|
48
48
|
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
49
|
+
module TestSearch
|
|
50
|
+
def test_no_result
|
|
51
|
+
assert_empty @searcher.search(@h1, 'no result')
|
|
52
|
+
end
|
|
52
53
|
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
54
|
+
def test_simple
|
|
55
|
+
assert_equal results([[[[:text, 0]], [[:character, 9]], [[:character, 16]]]]), @searcher.search(@h1, 'Content')
|
|
56
|
+
end
|
|
56
57
|
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
58
|
+
def test_multiple_text_result
|
|
59
|
+
assert_equal results([[[[:text, 0]], [[:character, 6]], [[:character, 7]]], [[[:text, 0]], [[:character, 10]], [[:character, 11]]]]), @searcher.search(@h1, 'o')
|
|
60
|
+
end
|
|
60
61
|
|
|
61
|
-
|
|
62
|
-
|
|
62
|
+
def test_text_after_element
|
|
63
|
+
elem = Nokogiri.XML('<root><elem>inner</elem>after</root>')
|
|
63
64
|
|
|
64
|
-
|
|
65
|
-
|
|
65
|
+
assert_equal results([[[[:text, 1]], [[:character, 0]], [[:character, 5]]]]), @searcher.search(elem, 'after')
|
|
66
|
+
end
|
|
66
67
|
|
|
67
|
-
|
|
68
|
-
|
|
68
|
+
def test_entity_reference
|
|
69
|
+
elem = Nokogiri.XML('<root>before&lt;after</root>')
|
|
69
70
|
|
|
70
|
-
|
|
71
|
+
assert_equal results([[[[:text, 0]], [[:character, 6]], [[:character, 7]]]]), @searcher.search(elem, '<')
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
def test_nested_result
|
|
75
|
+
assert_equal results([[[[:element, 1, {:name => 'ol', :id => nil}], [:element, 1, {:name => 'li', :id => nil}], [:element, 1, {:name => 'ol', :id => nil}], [:element, 1, {:name => 'li', :id => nil}], [:element, 0, {:name => 'a', :id => nil}], [:text, 0]], [[:character, 0]], [[:character, 3]]]]), @searcher.search(@nav, '第二節')
|
|
76
|
+
end
|
|
77
|
+
|
|
78
|
+
def test_img
|
|
79
|
+
assert_equal [result([[[:element, 1, {:name => 'ol', :id => nil}], [:element, 1, {:name => 'li', :id => nil}], [:element, 1, {:name => 'ol', :id => nil}], [:element, 2, {:name => 'li', :id => nil}], [:element, 0, {:name => 'a', :id => nil}], [:element, 0, {:name => 'img', :id => nil}]], nil, nil])], @searcher.search(@nav, '第三節')
|
|
80
|
+
end
|
|
71
81
|
end
|
|
72
82
|
|
|
73
|
-
|
|
74
|
-
|
|
83
|
+
class TestRestricted < self
|
|
84
|
+
include TestSearch
|
|
85
|
+
|
|
86
|
+
def setup
|
|
87
|
+
super
|
|
88
|
+
@searcher = EPUB::Searcher::XHTML::Restricted
|
|
89
|
+
end
|
|
75
90
|
end
|
|
76
91
|
|
|
77
|
-
|
|
78
|
-
|
|
92
|
+
class TestSeamless < self
|
|
93
|
+
include TestSearch
|
|
94
|
+
|
|
95
|
+
def setup
|
|
96
|
+
super
|
|
97
|
+
@searcher = EPUB::Searcher::XHTML::Seamless
|
|
98
|
+
end
|
|
99
|
+
|
|
100
|
+
def test_seamless
|
|
101
|
+
elem = Nokogiri.XML('<root>This <em>includes</em> a child element.</root>')
|
|
102
|
+
assert_equal results([[[], [[:text, 0], [:character, 0]], [[:text, 1], [:character, 17]]]]), @searcher.search(elem, 'This includes a child element.')
|
|
103
|
+
end
|
|
79
104
|
end
|
|
80
105
|
|
|
81
106
|
class TestResult < self
|
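`test_seamless` expects a match that spans several text nodes to be reported with a start and an end position in different nodes. The idea can be sketched without Nokogiri by treating the document as a flat list of text fragments. `seamless_positions` is a hypothetical helper, and it reports flat fragment indices rather than the tree steps the library builds.

```ruby
# Hypothetical helper: locate +word+ across an ordered list of text fragments.
# Returns [[start_fragment, start_char], [end_fragment, end_char]] per match.
def seamless_positions(fragments, word)
  return [] if word.empty?

  offsets = []
  content = ''.dup
  fragments.each do |fragment|
    offsets << content.length # where this fragment starts in the joined text
    content << fragment
  end

  results = []
  i = 0
  while (i = content.index(word, i))
    start_offset = offsets.select { |o| o <= i }.max
    end_pos = i + word.length
    # Strict comparison: a match ending on a boundary stays in the earlier fragment.
    end_offset = offsets.select { |o| o < end_pos }.max
    results << [[offsets.index(start_offset), i - start_offset],
                [offsets.index(end_offset), end_pos - end_offset]]
    i += 1
  end
  results
end

fragments = ['This ', 'includes', ' a child element.']
seamless_positions(fragments, 'This includes a child element.')
# => [[[0, 0], [2, 17]]]
```

The end character offset of 17 agrees with the value asserted in `test_seamless`. The fragment index differs because this sketch counts fragments in one flat list instead of per parent element.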
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: epub-parser
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.1.
|
|
4
|
+
version: 0.1.9
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- KITAITI Makoto
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: bin
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date:
|
|
11
|
+
date: 2015-06-09 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: rake
|
|
@@ -179,21 +179,7 @@ dependencies:
|
|
|
179
179
|
- !ruby/object:Gem::Version
|
|
180
180
|
version: '0'
|
|
181
181
|
- !ruby/object:Gem::Dependency
|
|
182
|
-
name:
|
|
183
|
-
requirement: !ruby/object:Gem::Requirement
|
|
184
|
-
requirements:
|
|
185
|
-
- - ">="
|
|
186
|
-
- !ruby/object:Gem::Version
|
|
187
|
-
version: '0'
|
|
188
|
-
type: :development
|
|
189
|
-
prerelease: false
|
|
190
|
-
version_requirements: !ruby/object:Gem::Requirement
|
|
191
|
-
requirements:
|
|
192
|
-
- - ">="
|
|
193
|
-
- !ruby/object:Gem::Version
|
|
194
|
-
version: '0'
|
|
195
|
-
- !ruby/object:Gem::Dependency
|
|
196
|
-
name: epub_validator
|
|
182
|
+
name: aruba
|
|
197
183
|
requirement: !ruby/object:Gem::Requirement
|
|
198
184
|
requirements:
|
|
199
185
|
- - ">="
|
|
@@ -207,13 +193,13 @@ dependencies:
|
|
|
207
193
|
- !ruby/object:Gem::Version
|
|
208
194
|
version: '0'
|
|
209
195
|
- !ruby/object:Gem::Dependency
|
|
210
|
-
name:
|
|
196
|
+
name: zipruby
|
|
211
197
|
requirement: !ruby/object:Gem::Requirement
|
|
212
198
|
requirements:
|
|
213
199
|
- - ">="
|
|
214
200
|
- !ruby/object:Gem::Version
|
|
215
201
|
version: '0'
|
|
216
|
-
type: :
|
|
202
|
+
type: :runtime
|
|
217
203
|
prerelease: false
|
|
218
204
|
version_requirements: !ruby/object:Gem::Requirement
|
|
219
205
|
requirements:
|
|
@@ -221,21 +207,21 @@ dependencies:
|
|
|
221
207
|
- !ruby/object:Gem::Version
|
|
222
208
|
version: '0'
|
|
223
209
|
- !ruby/object:Gem::Dependency
|
|
224
|
-
name:
|
|
210
|
+
name: nokogiri
|
|
225
211
|
requirement: !ruby/object:Gem::Requirement
|
|
226
212
|
requirements:
|
|
227
|
-
- - "
|
|
213
|
+
- - "~>"
|
|
228
214
|
- !ruby/object:Gem::Version
|
|
229
|
-
version: '
|
|
215
|
+
version: '1.6'
|
|
230
216
|
type: :runtime
|
|
231
217
|
prerelease: false
|
|
232
218
|
version_requirements: !ruby/object:Gem::Requirement
|
|
233
219
|
requirements:
|
|
234
|
-
- - "
|
|
220
|
+
- - "~>"
|
|
235
221
|
- !ruby/object:Gem::Version
|
|
236
|
-
version: '
|
|
222
|
+
version: '1.6'
|
|
237
223
|
- !ruby/object:Gem::Dependency
|
|
238
|
-
name:
|
|
224
|
+
name: nokogumbo
|
|
239
225
|
requirement: !ruby/object:Gem::Requirement
|
|
240
226
|
requirements:
|
|
241
227
|
- - ">="
|
|
@@ -248,20 +234,6 @@ dependencies:
|
|
|
248
234
|
- - ">="
|
|
249
235
|
- !ruby/object:Gem::Version
|
|
250
236
|
version: '0'
|
|
251
|
-
- !ruby/object:Gem::Dependency
|
|
252
|
-
name: nokogiri
|
|
253
|
-
requirement: !ruby/object:Gem::Requirement
|
|
254
|
-
requirements:
|
|
255
|
-
- - "~>"
|
|
256
|
-
- !ruby/object:Gem::Version
|
|
257
|
-
version: '1.6'
|
|
258
|
-
type: :runtime
|
|
259
|
-
prerelease: false
|
|
260
|
-
version_requirements: !ruby/object:Gem::Requirement
|
|
261
|
-
requirements:
|
|
262
|
-
- - "~>"
|
|
263
|
-
- !ruby/object:Gem::Version
|
|
264
|
-
version: '1.6'
|
|
265
237
|
- !ruby/object:Gem::Dependency
|
|
266
238
|
name: addressable
|
|
267
239
|
requirement: !ruby/object:Gem::Requirement
|
|
@@ -280,16 +252,16 @@ dependencies:
|
|
|
280
252
|
name: rchardet
|
|
281
253
|
requirement: !ruby/object:Gem::Requirement
|
|
282
254
|
requirements:
|
|
283
|
-
- - "
|
|
255
|
+
- - "<"
|
|
284
256
|
- !ruby/object:Gem::Version
|
|
285
|
-
version: '
|
|
257
|
+
version: '1.6'
|
|
286
258
|
type: :runtime
|
|
287
259
|
prerelease: false
|
|
288
260
|
version_requirements: !ruby/object:Gem::Requirement
|
|
289
261
|
requirements:
|
|
290
|
-
- - "
|
|
262
|
+
- - "<"
|
|
291
263
|
- !ruby/object:Gem::Version
|
|
292
|
-
version: '
|
|
264
|
+
version: '1.6'
|
|
293
265
|
description: Parse EPUB 3 book loosely
|
|
294
266
|
email:
|
|
295
267
|
- KitaitiMakoto@gmail.com
|
|
@@ -395,9 +367,9 @@ require_paths:
|
|
|
395
367
|
- lib
|
|
396
368
|
required_ruby_version: !ruby/object:Gem::Requirement
|
|
397
369
|
requirements:
|
|
398
|
-
- - "
|
|
370
|
+
- - ">"
|
|
399
371
|
- !ruby/object:Gem::Version
|
|
400
|
-
version: '
|
|
372
|
+
version: '2'
|
|
401
373
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
|
402
374
|
requirements:
|
|
403
375
|
- - ">="
|
|
@@ -405,7 +377,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
405
377
|
version: '0'
|
|
406
378
|
requirements: []
|
|
407
379
|
rubyforge_project:
|
|
408
|
-
rubygems_version: 2.
|
|
380
|
+
rubygems_version: 2.4.6
|
|
409
381
|
signing_key:
|
|
410
382
|
specification_version: 4
|
|
411
383
|
summary: EPUB 3 Parser
|
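The requirement operators that appear in this metadata (`~>`, `<`, `>`) can be checked with RubyGems' own classes. A small sketch follows; `satisfies?` is a hypothetical helper and the version numbers passed to it are arbitrary examples.

```ruby
require 'rubygems'

# True when +version+ satisfies the requirement string +constraint+.
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

satisfies?('~> 1.6', '1.6.8') # => true
satisfies?('~> 1.6', '2.0')   # => false ("~> 1.6" means >= 1.6 and < 2.0)
satisfies?('< 1.6', '1.5.0')  # => true
satisfies?('> 2', '2.2.2')    # => true
```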