epub-parser 0.1.6 → 0.1.7 (RubyGems)

Files changed (21)

checksums.yaml +4 -4
data/.yardopts +1 -0
data/CHANGELOG.markdown +7 -0
data/Gemfile +1 -1
data/README.markdown +7 -8
data/docs/Searcher.markdown +74 -0
data/epub-parser.gemspec +1 -0
data/lib/epub/parser/version.rb +1 -1
data/lib/epub/publication/package/manifest.rb +14 -1
data/lib/epub/searcher.rb +3 -0
data/lib/epub/searcher/publication.rb +32 -0
data/lib/epub/searcher/result.rb +73 -0
data/lib/epub/searcher/xhtml.rb +57 -0
data/test/fixtures/book/OPS/japanese.eucjp.xhtml +10 -0
data/test/fixtures/book/OPS/japanese.sjis.xhtml +10 -0
data/test/fixtures/book/OPS/japanese.utf8.xhtml +10 -0
data/test/fixtures/book/OPS/ルートファイル.opf +10 -1
data/test/test_parser_publication.rb +2 -2
data/test/test_publication.rb +10 -0
data/test/test_searcher.rb +117 -0
metadata +27 -3

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 65f7f0d3749c5bf1d34ab527c94ac0905a3fa30f
-  data.tar.gz: d88e97de68e8b5d81c27d5921ce2fbe3ebd99e7f
+  metadata.gz: 02abe0846123f8f3d218581a102f69c2972a80e1
+  data.tar.gz: 510bddcca86f789add0ecc4193b424063a5d23d6
 SHA512:
-  metadata.gz: 1e2a88587ff96a480845cee7a99b624b4ee86d373ac659ace2f89b1cd597fed87156c9ffb487ee37b7a87f81a225f0f86b0f47c17d45c4150a70b594638afbdc
-  data.tar.gz: b7634902c0e838a66e4852c62d1356d8cfeb9b9ab84b371a8cba654057fd2c440f8b14e1a92e14311c42670c9557ac3f51e2638d118abdf330de9d750417d7a0
+  metadata.gz: 8c46c39e8031ab857e968dc3826e4ce0fb6d65588ef8d8e61f05672628feaaa58d25a829c1b39826cf1a04b458dcdcd2f0898578f92768bd9db866b4a5f1f6aa
+  data.tar.gz: feea20e63185013eea0c9d5bdd0b8c0e80a4e06646b8bd106095a9eb967a0c067113915ad70458d9332d3efab120875447c90032fa4f4dedf2465d72532f1d3e

data/.yardopts CHANGED

@@ -8,3 +8,4 @@ docs/FixedLayout.markdown
 docs/Epubinfo.markdown
 docs/EpubOpen.markdown
 docs/Navigation.markdown
+docs/Searcher.markdown

data/CHANGELOG.markdown CHANGED

@@ -1,5 +1,12 @@
 CHANGELOG
 =========
+0.1.7
+-----
+* [Experimental]Add `EPUB::Searcher` module. See {file:Searcher.markdown} for details
+* Detect and set character encoding in `EPUB::Publication::Package::Item#read`
 0.1.6
 -----
 * Remove `EPUB.parse` method

data/Gemfile CHANGED

@@ -1,2 +1,2 @@
-source "https://rubygems.org"
+source 'https://rubygems.org'
 gemspec

data/README.markdown CHANGED

@@ -93,6 +93,7 @@ See {file:docs/EpubOpen} for more info.
 REQUIREMENTS
 ------------
 * Ruby 1.9.3 or later
+* `patch` command to install Nokogiri
 * C compiler to compile Zip/Ruby and Nokogiri
 Related Gems
@@ -108,6 +109,12 @@ If you find other gems, please tell me or request a pull request.
 RECENT CHANGES
 --------------
+### 0.1.7
+* [Experimental]Add `EPUB::Searcher` module. See {file:Searcher.markdown} for details
+* Detect and set character encoding in `EPUB::Publication::Package::Item#read`
 ### 0.1.6
 * Remove `EPUB.parse` method
 * Remove `EPUB::Publication::Package::Metadata#to_hash`
@@ -134,14 +141,6 @@ RECENT CHANGES
 * Add `ContentDocument::XHTML#rexml` and `#nokogiri`
 * Inspect more readably
-### 0.1.4
-* [Fixed-Layout Documents][fixed-layout] support
-* Define `ContentDocument::XHTML#top_level?`
-* Define `Spine::Itemref#page_spread` and `#page_spread=`
-* Define some utility methods around `Manifest::Item` and `Spine::Itemref`
-[fixed-layout]: http://www.idpf.org/epub/fxl/
 See {file:CHANGELOG.markdown} for older changelogs and details.
 TODOS

data/docs/Searcher.markdown ADDED

@@ -0,0 +1,74 @@
+{file:docs/Home.markdown} > **{file:docs/Searcher.markdown}**
+Searcher
+========
+*Searcher is experimental now. Note that all interfaces are not stable at all.*
+Example
+-------
+    epub = EPUB::Parser.parse('childrens-literature-20130206.epub')
+    search_word = 'INTRODUCTORY'
+    results = EPUB::Searcher.search(epub.package, search_word)
+    # => [#<EPUB::Searcher::Result:0x007f74d2b31548
+    #   @end_steps=[#<EPUB::Searcher::Result::Step:0x007f74d2b7baa8 @index=12, @type=:character>],
+    #   @parent_steps=
+    #    [#<EPUB::Searcher::Result::Step:0x007f74d2b81318 @index=2, @name="spine", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7f4c8 @index=1, @type=:itemref>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7d560 @index=1, @name="body", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7d308 @index=0, @name="nav", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7cdb8 @index=1, @name="ol", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7cb38 @index=0, @name="li", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7c5e8 @index=1, @name="ol", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7bf80 @index=1, @name="li", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7bd28 @index=0, @name="a", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b7bb70 @index=0, @type=:text>],
+    #   @start_steps=[#<EPUB::Searcher::Result::Step:0x007f74d2b7baf8 @index=0, @type=:character>]>,
+    #  #<EPUB::Searcher::Result:0x007f74d294e258
+    #   @end_steps=[#<EPUB::Searcher::Result::Step:0x007f74d2b0f8d0 @index=12, @type=:character>],
+    #   @parent_steps=
+    #    [#<EPUB::Searcher::Result::Step:0x007f74d2b81318 @index=2, @name="spine", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b314f8 @index=2, @type=:itemref>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b2fb80 @index=1, @name="body", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b2f900 @index=0, @name="section", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b10578 @index=3, @name="section", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b0fb50 @index=1, @name="h3", @type=:element>,
+    #     # #<EPUB::Searcher::Result::Step:0x007f74d2b0f998 @index=0, @type=:text>],
+    #   @start_steps=[#<EPUB::Searcher::Result::Step:0x007f74d2b0f920 @index=0, @type=:character>]>]
+    puts results.collect(&:to_cfi_s)
+    # /6/4!/4/2/4/2/4/4/2/1,:0,:12
+    # /6/6!/4/2/8/4/1,:0,:12
+    # => nil
+Search result
+-------------
+Search result is an array of {EPUB::Searcher::Result} and it may be converted to an EPUBCFI string by {EPUB::Searcher::Result#to_cfi_s}.
+Restricted XHTML Searcher
+-------------------------
+Now searcher for XHTML documents is *restricted*, which means that it can search from only single elements. For instance, it can find 'search word' from XHTML document below:
+    <html>
+      <head>
+        <title>Sample document</title>
+      </head>
+      <body>
+        <p>search word</p>
+      </body>
+    </html>
+But cannot from document below:
+    <html>
+      <head>
+        <title>Sample document</title>
+      </head>
+      <body>
+        <p><em>search</em> word</p>
+      </body>
+    </html>
+because the words 'search' and 'word' are not in the same element.
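
The restriction described in the added Searcher.markdown can be illustrated with a minimal sketch. This is not the gem's code: it models each `<p>` element as a plain array of its text-node strings, and the names `text_nodes_plain`, `text_nodes_split`, and `find_in_nodes` are hypothetical. Each text node is searched on its own, which is why a phrase split across elements is missed.

```ruby
# Hypothetical model: each array holds the text nodes of one <p> element.
# <p>search word</p> has a single text node.
text_nodes_plain = ['search word']
# <p><em>search</em> word</p> has two text nodes: one inside <em>, one after it.
text_nodes_split = ['search', ' word']

# Return [node_index, char_offset] pairs for every node that contains the word.
# Nodes are never concatenated, so a match cannot span two nodes.
def find_in_nodes(nodes, word)
  nodes.each_with_index.flat_map do |text, i|
    offset = text.index(word)
    offset ? [[i, offset]] : []
  end
end

hits_plain = find_in_nodes(text_nodes_plain, 'search word')
hits_split = find_in_nodes(text_nodes_split, 'search word')
```

Under this model the first document yields one hit and the second yields none, matching the behaviour the documentation describes.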

data/epub-parser.gemspec CHANGED

@@ -46,4 +46,5 @@ Gem::Specification.new do |s|
   s.add_runtime_dependency 'zipruby'
   s.add_runtime_dependency 'nokogiri', '~> 1.6'
   s.add_runtime_dependency 'addressable', '>= 2.3.5'
+  s.add_runtime_dependency 'rchardet'
 end

data/lib/epub/parser/version.rb CHANGED

@@ -1,5 +1,5 @@
 module EPUB
   class Parser
-    VERSION = "0.1.6"
+    VERSION = "0.1.7"
   end
 end

data/lib/epub/publication/package/manifest.rb CHANGED

@@ -1,5 +1,6 @@
 require 'set'
 require 'enumerabler'
+require 'rchardet'
 require 'epub/constants'
 require 'epub/parser/content_document'
@@ -91,9 +92,21 @@ module EPUB
           end
           def read
-            Zip::Archive.open(manifest.package.book.epub_file) {|zip|
+            raw_content = Zip::Archive.open(manifest.package.book.epub_file) {|zip|
               zip.fopen(entry_name).read
             }
+            # CharDet.detect doesn't raise Encoding::CompatibilityError
+            # that is caused when trying compare CharDet's internal
+            # ASCII-8BIT RegExp with a String with other encoding
+            # because Zip::File#read returns a String with encoding ASCII-8BIT.
+            # So, no need to rescue the error here.
+            encoding = CharDet.detect(raw_content)['encoding']
+            if encoding
+              raw_content.force_encoding(encoding)
+            else
+              warn "No encoding detected for #{entry_name}. Set to ASCII-8BIT" if $DEBUG || $VERBOSE
+              raw_content
+            end
           end
           def xhtml?
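
The new `read` relabels the raw bytes with the detected encoding instead of transcoding them. The following is a minimal sketch of that idea using only core Ruby. The detection step is replaced by a hard-coded value (an assumption made so the example runs without `rchardet`), and the variable names are illustrative.

```ruby
# Bytes of the Japanese word for "Japanese language" in EUC-JP,
# tagged as binary (ASCII-8BIT) the way data read from an archive would be.
raw = "\xC6\xFC\xCB\xDC\xB8\xEC".b

# Stand-in for the result of an encoding detector; hard-coded for this sketch.
detected = 'EUC-JP'

# force_encoding only changes the encoding label; the bytes stay the same.
labeled = raw.dup.force_encoding(detected)
same_bytes = labeled.bytes == raw.bytes

# Converting to another encoding is a separate, explicit step.
utf8 = labeled.encode('UTF-8')
```

Because `force_encoding` does not validate its input, a wrong guess from the detector would produce a string whose `valid_encoding?` may be false, so callers that depend on the label may want to check it.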

data/lib/epub/searcher.rb ADDED

@@ -0,0 +1,3 @@
+require 'epub/searcher/result'
+require 'epub/searcher/publication'
+require 'epub/searcher/xhtml'

data/lib/epub/searcher/publication.rb ADDED

@@ -0,0 +1,32 @@
+require 'epub/publication'
+module EPUB
+  module Searcher
+    class Publication
+      class << self
+        def search(package, word)
+          new(word).search(package)
+        end
+      end
+      def initialize(word)
+        @word = word
+      end
+      def search(package)
+        results = []
+        spine = package.spine
+        spine_step = Result::Step.new(:element, 2, {:name => 'spine', :id => spine.id})
+        spine.each_itemref.with_index do |itemref, index|
+          itemref_step = Result::Step.new(:itemref, index, {:id => itemref.id})
+          XHTML::Restricted.search(Nokogiri.XML(itemref.item.read), @word).each do |sub_result|
+            results << Result.new([spine_step, itemref_step] + sub_result.parent_steps, sub_result.start_steps, sub_result.end_steps)
+          end
+        end
+        results
+      end
+    end
+  end
+end

data/lib/epub/searcher/result.rb ADDED

@@ -0,0 +1,73 @@
+module EPUB
+  module Searcher
+    class Result
+      attr_reader :parent_steps, :start_steps, :end_steps
+      # @param parent_steps [Array<Step>] common steps between start and end
+      # @param start_steps [Array<Step>] steps to start from +parent_steps+
+      # @param end_steps [Array<Step>] steps to end from +parent_steps+
+      def initialize(parent_steps, start_steps, end_steps)
+        @parent_steps, @start_steps, @end_steps = parent_steps, start_steps, end_steps
+      end
+      def to_xpath_and_offset(with_xmlns=false)
+        xpath = (@parent_steps + @start_steps).reduce('.') {|path, step|
+          case step.type
+          when :element
+            path + '/%s*[%d]' % [with_xmlns ? 'xhtml:' : nil, step.index + 1]
+          when :text
+            path + '/text()[%s]' % [step.index + 1]
+          else
+            path
+          end
+        }
+        [xpath, @start_steps.last.index]
+      end
+      def to_cfi_s
+        [@parent_steps, @start_steps, @end_steps].collect {|steps|
+          steps ? steps.collect(&:to_cfi_s).join : nil
+        }.compact.join(',')
+      end
+      def ==(other)
+        [@parent_steps + @start_steps.to_a] == [other.parent_steps + other.start_steps.to_a] and
+          [@parent_steps + @end_steps.to_a] == [other.parent_steps + other.end_steps.to_a]
+      end
+      class Step
+        attr_reader :type, :index, :info
+        def initialize(type, index, info={})
+          @type, @index, @info = type, index, info
+        end
+        def ==(other)
+          self.type == other.type and
+            self.index == other.index and
+            self.info == other.info
+        end
+        def to_cfi_s
+          case type
+          when :element
+            '/%d%s' % [(index + 1) * 2, id_assertion]
+          when :text
+            '/%d' % [(index + 1)]
+          when :character
+            ':%d' % [index]
+          when :itemref
+            '/%d%s!' % [(index + 1) * 2, id_assertion]
+          end
+        end
+        private
+        def id_assertion
+          info[:id] ? "[#{info[:id]}]" : nil
+        end
+      end
+    end
+  end
+end
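
The step-to-CFI mapping in `Result::Step#to_cfi_s` can be traced with a small standalone sketch. It uses a lambda and plain arrays rather than the gem's classes (the names `step_to_cfi`, `parent`, `start_steps`, and `end_steps` are illustrative), and the step values are copied from the first result shown in the Searcher.markdown example.

```ruby
# Simplified stand-in for Result::Step#to_cfi_s: element and itemref steps use
# even numbers ((index + 1) * 2), text steps use index + 1, and character
# steps become ":offset". An optional id is appended as an assertion.
step_to_cfi = lambda do |type, index, id = nil|
  assertion = id ? "[#{id}]" : ''
  case type
  when :element   then "/#{(index + 1) * 2}#{assertion}"
  when :itemref   then "/#{(index + 1) * 2}#{assertion}!"
  when :text      then "/#{index + 1}"
  when :character then ":#{index}"
  end
end

parent = [
  [:element, 2],  # spine
  [:itemref, 1],
  [:element, 1],  # body
  [:element, 0],  # nav
  [:element, 1],  # ol
  [:element, 0],  # li
  [:element, 1],  # ol
  [:element, 1],  # li
  [:element, 0],  # a
  [:text, 0]
]
start_steps = [[:character, 0]]
end_steps   = [[:character, 12]]

# Join each group of steps, then join the three groups with commas.
cfi = [parent, start_steps, end_steps].map { |steps|
  steps.map { |step| step_to_cfi.call(*step) }.join
}.join(',')
```

With these inputs `cfi` is `/6/4!/4/2/4/2/4/4/2/1,:0,:12`, the first string printed in the documentation example.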

data/lib/epub/searcher/xhtml.rb ADDED

@@ -0,0 +1,57 @@
+require 'epub'
+require 'epub/parser/utils'
+module EPUB
+  module Searcher
+    class XHTML
+      class Restricted
+        class << self
+          # @param element [Nokogiri::XML::Element, Nokogiri::XML::Document]
+          # @param word [String]
+          # @return [Array<Result>]
+          def search(element, word)
+            new(word).search(element.respond_to?(:root) ? element.root : element)
+          end
+        end
+        # @param word [String]
+        def initialize(word)
+          @word = word
+        end
+        # @param element [Nokogiri::XML::Element]
+        # @return [Array<Result>]
+        def search(element)
+          results = []
+          elem_index = 0
+          element.children.each do |child|
+            if child.element?
+              child_step = Result::Step.new(:element, elem_index, {:name => child.name, :id => Parser::Utils.extract_attribute(child, 'id')})
+              if child.name == 'img'
+                if Parser::Utils.extract_attribute(child, 'alt').index(@word)
+                  results << Result.new([child_step], nil, nil)
+                end
+              else
+                search(child).each do |sub_result|
+                  results << Result.new([child_step] + sub_result.parent_steps, sub_result.start_steps, sub_result.end_steps)
+                end
+              end
+              elem_index += 1
+            elsif child.text?
+              text_index = elem_index
+              char_index = 0
+              text_step = Result::Step.new(:text, text_index)
+              while char_index = child.text.index(@word, char_index)
+                results << Result.new([text_step], [Result::Step.new(:character, char_index)], [Result::Step.new(:character, char_index + @word.length)])
+                char_index += 1
+              end
+            end
+          end
+          results
+        end
+      end
+    end
+  end
+end
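
The `while` loop in `Restricted#search` collects every occurrence of the word inside one text node by restarting `String#index` one character past the previous hit. A minimal sketch of that loop in isolation follows; the method name `occurrences` and the sample strings are illustrative, not taken from the gem.

```ruby
# Collect the start offset of every occurrence of word in text.
# Advancing by one character (not by word.length) also finds overlapping hits.
def occurrences(text, word)
  offsets = []
  pos = 0
  while (pos = text.index(word, pos))
    offsets << pos
    pos += 1
  end
  offsets
end

single_char = occurrences('Table of Contents', 'o')
overlapping = occurrences('aaa', 'aa')

# Each hit spans from its offset to offset + word.length (end exclusive).
ranges = occurrences('Table of Contents', 'Content').map { |s| [s, s + 'Content'.length] }
```

Stepping by one character means overlapping matches are reported separately; advancing by `word.length` would be the alternative if only non-overlapping matches were wanted.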

data/test/fixtures/book/OPS/japanese.eucjp.xhtml ADDED

@@ -0,0 +1,10 @@
+<?xml version="1.0" encoding="EUC-JP"?>
+<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
+  <head>
+    <meta charset="EUC-JP" />
+    <title>日本語</title>
+  </head>
+  <body>
+    <h1>日本語</h1>
+  </body>
+</html>

data/test/fixtures/book/OPS/japanese.sjis.xhtml ADDED

@@ -0,0 +1,10 @@
+<?xml version="1.0" encoding="Shift_JIS"?>
+<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
+  <head>
+    <meta charset="Shift_JIS" />
+    <title>日本語</title>
+  </head>
+  <body>
+    <h1>日本語</h1>
+  </body>
+</html>

data/test/fixtures/book/OPS/japanese.utf8.xhtml ADDED

@@ -0,0 +1,10 @@
+<?xml version="1.0"?>
+<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
+  <head>
+    <meta charset="UTF-8" />
+    <title>日本語</title>
+  </head>
+  <body>
+    <h1>日本語</h1>
+  </body>
+</html>

data/test/fixtures/book/OPS/ルートファイル.opf CHANGED

@@ -101,6 +101,15 @@
     <item id="encoded-japanese-filename"
           href="%E6%97%A5%E6%9C%AC%E8%AA%9E.xhtml"
           media-type="application/xhtml+xml"/>
+    <item id="utf-8-encoded"
+          href="japanese.utf8.xhtml"
+          media-type="application/xhtml+xml"/>
+    <item id="euc-jp-encoded"
+          href="japanese.eucjp.xhtml"
+          media-type="application/xhtml+xml"/>
+    <item id="shift_jis-encoded"
+          href="japanese.sjis.xhtml"
+          media-type="application/xhtml+xml"/>
   </manifest>
   <spine>
     <itemref idref="nav"/>
@@ -116,4 +125,4 @@
     <mediaType handler="impl"
                media-type="application/x-demo-slideshow"/>
   </bindings>
-</package>
+</package>

data/test/test_parser_publication.rb CHANGED

@@ -81,8 +81,8 @@ class TestParserPublication < Test::Unit::TestCase
       @manifest = @parser.parse_manifest
     end
-    def test_manifest_has_16_items
-      assert_equal 16, @manifest.items.length
+    def test_manifest_has_19_items
+      assert_equal 19, @manifest.items.length
     end
     def test_item_has_relative_path_as_href_attribute

data/test/test_publication.rb CHANGED

@@ -239,6 +239,16 @@ class TestPublication < Test::Unit::TestCase
         assert_nil xhtml_item.find_item_by_relative_iri(Addressable::URI.parse('../image/01.png'))
       end
+      data('UTF-8'     => [Encoding::UTF_8,     'utf-8-encoded'],
+           'EUC-JP'    => [Encoding::EUC_JP,    'euc-jp-encoded'],
+           'Shift-JIS' => [Encoding::Shift_JIS, 'shift_jis-encoded'])
+      def test_read_detects_encoding_automatically(data)
+        encoding, id = data
+        epub = EPUB::Parser.parse('test/fixtures/book.epub')
+        item = epub.package.manifest[id]
+        assert_equal encoding, item.read.encoding
+      end
     end
   end

data/test/test_searcher.rb ADDED

@@ -0,0 +1,117 @@
+# -*- coding: utf-8 -*-
+require_relative 'helper'
+require 'epub/searcher'
+class TestSearcher < Test::Unit::TestCase
+  class TestPublication < self
+    def setup
+      super
+      opf_path = File.expand_path('../fixtures/book/OPS/ルートファイル.opf', __FILE__)
+      nav_path = File.expand_path('../fixtures/book/OPS/nav.xhtml', __FILE__)
+      @package = EPUB::Parser::Publication.new(open(opf_path), 'OPS/ルートファイル.opf').parse
+      @package.spine.each_itemref do |itemref|
+        stub(itemref.item).read {
+          itemref.idref == 'nav' ? File.read(nav_path) : '<html></html>'
+        }
+      end
+    end
+    def test_no_result
+      assert_empty EPUB::Searcher::Publication.search(@package, 'no result')
+    end
+    def test_simple
+      assert_equal(
+        results([
+          [[[:element, 2, {:name => 'spine', :id => nil}], [:itemref, 0, {:id => nil}], [:element, 0, {:name => 'head', :id => nil}], [:element, 0, {:name => 'title', :id => nil}], [:text, 0]], [[:character, 9]], [[:character, 16]]],
+          [[[:element, 2, {:name => 'spine', :id => nil}], [:itemref, 0, {:id => nil}], [:element, 1, {:name => 'body', :id => nil}], [:element, 0, {:name => 'div', :id => nil}], [:element, 0, {:name => 'nav', :id => 'idid'}], [:element, 0, {:name => 'hgroup', :id => nil}], [:element, 1, {:name => 'h1', :id => nil}], [:text, 0]], [[:character, 9]], [[:character, 16]]]
+        ]),
+        EPUB::Searcher::Publication.search(@package, 'Content')
+    )
+    end
+    class TesetResult < self
+      def test_to_cfi_s
+        assert_equal '/6/2!/4/2/2[idid]/2/4/1,:9,:16', EPUB::Searcher::Publication.search(@package, 'Content').last.to_cfi_s
+      end
+    end
+  end
+  class TestXHTML < self
+    def setup
+      super
+      nav_path = File.expand_path('../fixtures/book/OPS/nav.xhtml', __FILE__)
+      @doc = Nokogiri.XML(open(nav_path))
+      @h1 = @doc.search('h1').first
+      @nav = @doc.search('nav').first
+    end
+    def test_no_result
+      assert_empty EPUB::Searcher::XHTML::Restricted.search(@h1, 'no result')
+    end
+    def test_simple
+      assert_equal results([[[[:text, 0]], [[:character, 9]], [[:character, 16]]]]), EPUB::Searcher::XHTML::Restricted.search(@h1, 'Content')
+    end
+    def test_multiple_text_result
+      assert_equal results([[[[:text, 0]], [[:character, 6]], [[:character, 7]]], [[[:text, 0]], [[:character, 10]], [[:character, 11]]]]), EPUB::Searcher::XHTML::Restricted.search(@h1, 'o')
+    end
+    def test_text_after_element
+      elem = Nokogiri.XML('<root><elem>inner</elem>after</root>')
+      assert_equal results([[[[:text, 1]], [[:character, 0]], [[:character, 5]]]]), EPUB::Searcher::XHTML::Restricted.search(elem, 'after')
+    end
+    def test_entity_reference
+      elem = Nokogiri.XML('<root>before&lt;after</root>')
+      assert_equal results([[[[:text, 0]], [[:character, 6]], [[:character, 7]]]]), EPUB::Searcher::XHTML::Restricted.search(elem, '<')
+    end
+    def test_nested_result
+      assert_equal results([[[[:element, 1, {:name => 'ol', :id => nil}], [:element, 1, {:name => 'li', :id => nil}], [:element, 1, {:name => 'ol', :id => nil}], [:element, 1, {:name => 'li', :id => nil}], [:element, 0, {:name => 'a', :id => nil}], [:text, 0]], [[:character, 0]], [[:character, 3]]]]), EPUB::Searcher::XHTML::Restricted.search(@nav, '第二節')
+    end
+    def test_img
+      assert_equal [result([[[:element, 1, {:name => 'ol', :id => nil}], [:element, 1, {:name => 'li', :id => nil}], [:element, 1, {:name => 'ol', :id => nil}], [:element, 2, {:name => 'li', :id => nil}], [:element, 0, {:name => 'a', :id => nil}], [:element, 0, {:name => 'img', :id => nil}]], nil, nil])], EPUB::Searcher::XHTML::Restricted.search(@nav, '第三節')
+    end
+    class TestResult < self
+      def setup
+        super
+        @result = EPUB::Searcher::XHTML::Restricted.search(@doc, '第二節').first
+      end
+      def test_to_xpath_and_offset
+        assert_equal ['./*[2]/*[1]/*[1]/*[2]/*[2]/*[2]/*[2]/*[1]/text()[1]', 0], @result.to_xpath_and_offset
+        assert_equal ['./xhtml:*[2]/xhtml:*[1]/xhtml:*[1]/xhtml:*[2]/xhtml:*[2]/xhtml:*[2]/xhtml:*[2]/xhtml:*[1]/text()[1]', 0], @result.to_xpath_and_offset(true)
+      end
+      def test_to_cfi_s
+        assert_equal '/4/2/2[idid]/4/4/4/4/2/1,:0,:3', @result.to_cfi_s
+      end
+      def test_to_cfi_s_img
+        assert_equal '/4/2/2[idid]/4/4/4/6/2/2', EPUB::Searcher::XHTML::Restricted.search(@doc, '第三節').first.to_cfi_s
+      end
+    end
+  end
+  private
+  def results(results)
+    results.collect {|res| result(res)}
+  end
+  def result(steps_triple)
+    EPUB::Searcher::Result.new(*steps_triple.collect {|steps|
+      steps ? steps.collect {|s| step(s)} : steps
+    })
+  end
+  def step(step)
+    EPUB::Searcher::Result::Step.new(*step)
+  end
+end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: epub-parser
 version: !ruby/object:Gem::Version
-  version: 0.1.6
+  version: 0.1.7
 platform: ruby
 authors:
 - KITAITI Makoto
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-04-20 00:00:00.000000000 Z
+date: 2014-09-20 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake
@@ -276,6 +276,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 2.3.5
+- !ruby/object:Gem::Dependency
+  name: rchardet
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: Parse EPUB 3 book loosely
 email:
 - KitaitiMakoto@gmail.com
@@ -304,6 +318,7 @@ files:
 - docs/Item.markdown
 - docs/Navigation.markdown
 - docs/Publication.markdown
+- docs/Searcher.markdown
 - epub-parser.gemspec
 - features/epubinfo.feature
 - features/step_definitions/epubinfo_steps.rb
@@ -337,6 +352,10 @@ files:
 - lib/epub/publication/package/manifest.rb
 - lib/epub/publication/package/metadata.rb
 - lib/epub/publication/package/spine.rb
+- lib/epub/searcher.rb
+- lib/epub/searcher/publication.rb
+- lib/epub/searcher/result.rb
+- lib/epub/searcher/xhtml.rb
 - man/epubinfo.1.ronn
 - schemas/epub-nav-30.rnc
 - schemas/epub-nav-30.sch
@@ -347,6 +366,9 @@ files:
 - test/fixtures/book/OPS/case-sensitive.xhtml
 - test/fixtures/book/OPS/containing space.xhtml
 - test/fixtures/book/OPS/containing%20space.xhtml
+- test/fixtures/book/OPS/japanese.eucjp.xhtml
+- test/fixtures/book/OPS/japanese.sjis.xhtml
+- test/fixtures/book/OPS/japanese.utf8.xhtml
 - test/fixtures/book/OPS/nav.xhtml
 - test/fixtures/book/OPS/ルートファイル.opf
 - test/fixtures/book/OPS/日本語.xhtml
@@ -362,6 +384,7 @@ files:
 - test/test_parser_ocf.rb
 - test/test_parser_publication.rb
 - test/test_publication.rb
+- test/test_searcher.rb
 homepage: https://github.com/KitaitiMakoto/epub-parser
 licenses:
 - MIT
@@ -382,7 +405,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.2.0
+rubygems_version: 2.2.2
 signing_key:
 specification_version: 4
 summary: EPUB 3 Parser
@@ -401,4 +424,5 @@ test_files:
 - test/test_parser_ocf.rb
 - test/test_parser_publication.rb
 - test/test_publication.rb
+- test/test_searcher.rb
 has_rdoc: yard