epub-parser 0.3.6 → 0.3.7
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.gitlab-ci.yml +51 -1
- data/.yardopts +5 -3
- data/{CHANGELOG.markdown → CHANGELOG.adoc} +49 -84
- data/README.adoc +228 -0
- data/Rakefile +3 -1
- data/bin/epub-cover +51 -0
- data/docs/EpubCover.adoc +46 -0
- data/docs/Examples.adoc +9 -0
- data/docs/Home.adoc +224 -0
- data/docs/Searcher.adoc +132 -0
- data/epub-parser.gemspec +2 -1
- data/lib/epub/book/features.rb +7 -1
- data/lib/epub/metadata.rb +9 -1
- data/lib/epub/parser/metadata.rb +4 -2
- data/lib/epub/parser/version.rb +1 -1
- data/lib/epub/publication/package/manifest.rb +1 -1
- data/lib/epub/searcher/xhtml.rb +1 -0
- data/test/helper.rb +1 -1
- metadata +26 -8
- data/README.markdown +0 -219
- data/docs/Home.markdown +0 -196
- data/docs/Searcher.markdown +0 -109
data/docs/Searcher.markdown
DELETED
@@ -1,109 +0,0 @@
-{file:docs/Home.markdown} > **{file:docs/Searcher.markdown}**
-
-Searcher
-========
-
-*Searcher is experimental now. Note that all interfaces are not stable at all.*
-
-Example
--------
-
-    epub = EPUB::Parser.parse('childrens-literature.epub')
-    search_word = 'INTRODUCTORY'
-    results = EPUB::Searcher.search_text(epub, search_word)
-    # => [#<EPUB::Searcher::Result:0x007f80ccde9528
-    # @end_steps=[#<EPUB::Searcher::Result::Step:0x007f80ccde9730 @index=12, @info={}, @type=:character>],
-    # @parent_steps=
-    # [#<EPUB::Searcher::Result::Step:0x007f80ccf571d0 @index=2, @info={:name=>"spine", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccf3d3e8 @index=1, @info={:id=>nil}, @type=:itemref>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9e88 @index=1, @info={:name=>"body", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9e38 @index=0, @info={:name=>"nav", :id=>"toc"}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9de8 @index=1, @info={:name=>"ol", :id=>"tocList"}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9d98 @index=0, @info={:name=>"li", :id=>"np-313"}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9d48 @index=1, @info={:name=>"ol", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9ca8 @index=1, @info={:name=>"li", :id=>"np-317"}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9c08 @index=0, @info={:name=>"a", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde9bb8 @index=0, @info={}, @type=:text>],
-    # @start_steps=[#<EPUB::Searcher::Result::Step:0x007f80ccde9af0 @index=0, @info={}, @type=:character>]>,
-    # #<EPUB::Searcher::Result:0x007f80ccebcb30
-    # @end_steps=[#<EPUB::Searcher::Result::Step:0x007f80ccebcdb0 @index=12, @info={}, @type=:character>],
-    # @parent_steps=
-    # [#<EPUB::Searcher::Result::Step:0x007f80ccf571d0 @index=2, @info={:name=>"spine", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccde94b0 @index=2, @info={:id=>nil}, @type=:itemref>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccebd328 @index=1, @info={:name=>"body", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccebd2d8 @index=0, @info={:name=>"section", :id=>"pgepubid00492"}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccebd260 @index=3, @info={:name=>"section", :id=>"pgepubid00498"}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccebd210 @index=1, @info={:name=>"h3", :id=>nil}, @type=:element>,
-    # ##<EPUB::Searcher::Result::Step:0x007f80ccebd198 @index=0, @info={}, @type=:text>],
-    # @start_steps=[#<EPUB::Searcher::Result::Step:0x007f80ccebd0d0 @index=0, @info={}, @type=:character>]>]
-    puts results.collect(&:to_cfi).collect(&:to_fragment)
-    # epubcfi(/6/4!/4/2[toc]/4[tocList]/2[np-313]/4/4[np-317]/2/1,:0,:12)
-    # epubcfi(/6/6!/4/2[pgepubid00492]/8[pgepubid00498]/4/1,:0,:12)
-    # => nil
-
-Search result
--------------
-
-Search result is an array of {EPUB::Searcher::Result} and it may be converted to an EPUBCFI string by {EPUB::Searcher::Result#to_cfi_s}.
-
-Seamless XHTML Searcher
------------------------
-
-Now default searcher for XHTML is *seamless* searcher, which ignores tags when searching.
-
-You can search words 'search word' from XHTML document below:
-
-    <html>
-    <head>
-    <title>Sample document</title>
-    </head>
-    <body>
-    <p><em>search</em> word</p>
-    </body>
-    </html>
-
-Restricted XHTML Searcher
--------------------------
-
-You can also use *restricted* searcher, which means that it can search from only single elements. For instance, it can find 'search word' from XHTML document below:
-
-    <html>
-    <head>
-    <title>Sample document</title>
-    </head>
-    <body>
-    <p>search word</p>
-    </body>
-    </html>
-
-But cannot from document below:
-
-    <html>
-    <head>
-    <title>Sample document</title>
-    </head>
-    <body>
-    <p><em>search</em> word</p>
-    </body>
-    </html>
-
-because the words 'search' and 'word' are not in the same element.
-
-To use restricted searcher, specify `algorithm` option for `search` method:
-
-    results = EPUB::Searcher.search_text(epub, search_word, algorithm: :restricted)
-
-Element Searcher
-----------------
-
-You can search XHTML elements by CSS selector or XPath.
-
-    EPUB::Searcher::Publication.search_element(@package, css: 'ol > li').collect {|result| result[:location]}.map(&:to_fragment)
-    # => ["epubcfi(/4/4!/4/2[toc]/4[tocList]/2[np-313])",
-    # "epubcfi(/4/4!/4/2[toc]/4[tocList]/2[np-313]/4/2[np-315])",
-    # "epubcfi(/4/4!/4/2[toc]/4[tocList]/2[np-313]/4/4[np-317])",
-    # "epubcfi(/4/4!/4/2[toc]/4[tocList]/2[np-313]/4/6)",
-    # "epubcfi(/4/4!/4/2[toc]/4[tocList]/2[np-313]/4/6/4/2[np-319])",
-    # "epubcfi(/4/4!/4/2[toc]/4[tocList]/2[np-313]/4/6/4/2[np-319]/4/2)",
-    # :
-    # :
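The deleted document distinguishes a seamless searcher, which ignores tags, from a restricted one, which only matches within a single element. The difference can be illustrated with a minimal, self-contained sketch. This is not epub-parser's implementation: the method names `text_nodes`, `seamless_match?` and `restricted_match?` are hypothetical, the tag-stripping regex is only adequate for the tiny samples used here, and treating each run of text between tags as "one element" is a simplifying assumption.

```ruby
# Illustrative sketch only; epub-parser itself works on a parsed XHTML tree.
# The two samples mirror the <p> lines from the deleted document.
SPLIT_SAMPLE  = '<p><em>search</em> word</p>'
SINGLE_SAMPLE = '<p>search word</p>'

# Return the runs of text found between tags (assumption: no CDATA,
# comments, or '>' inside attribute values).
def text_nodes(markup)
  markup.split(/<[^>]+>/).reject(&:empty?)
end

# Seamless: tags are ignored, so the text runs are joined before matching.
def seamless_match?(markup, query)
  text_nodes(markup).join.include?(query)
end

# Restricted: the whole query must lie inside a single text run.
def restricted_match?(markup, query)
  text_nodes(markup).any? { |node| node.include?(query) }
end

puts seamless_match?(SPLIT_SAMPLE, 'search word')    # => true
puts restricted_match?(SPLIT_SAMPLE, 'search word')  # => false
puts restricted_match?(SINGLE_SAMPLE, 'search word') # => true
```

With the real library, the choice between the two behaviours is made through the `algorithm: :restricted` option shown in the deleted text; the sketch only shows why `<em>search</em> word` is found by one approach and missed by the other.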