RubyGems - risbn - Versions diffs - 0.1.0 → 0.2.0 - Mend

risbn 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

data/README.rdoc CHANGED

@@ -1,9 +1,32 @@
 = risbn
-Description goes here.
+Minimal set of tools for working with isbns from ruby.
+Supports both isbn-10 and isbn-13.
+Provides a simple (barebones) tool for extracting isbns from pdf and chm files.
+== Examples:
+  isbn = RISBN.parse_first("Some text with an isbn: ISBN-13: 978-0393317732") # => <RISBN isbn="9780393317732">
+  isbn.valid? # => true
+  require 'risbn/scanner'
+  RISBN::Scanner.scan("some/file.pdf")
+  RISBN::Scanner.scan("some/file.chm")
+  RISBN::Scanner.scan("some/file.txt")
+== Notes
+Currently only works on Unix-like platforms.
+Requires the following tools for scanning files:
+* Poppler for pdf (pdftotext utility)
+* Archmage for chm
 == Note on Patches/Pull Requests
 * Fork the project.
 * Make your feature addition or bug fix.
 * Add tests for it. This is important so I don't break it in a
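
The README example calls `valid?` on a parsed ISBN. As a rough sketch of what such a check involves, the block below applies the standard ISBN-10 and ISBN-13 check-digit rules. It is not taken from risbn and may differ from how the gem implements `valid?`; the method name `isbn_checksum_ok?` is hypothetical.

```ruby
# Hypothetical helper, not part of risbn: standard ISBN check-digit rules.
def isbn_checksum_ok?(raw)
  # Keep only digits and X, as the gem's initializer does.
  code = raw.to_s.upcase.gsub(/[^0-9X]/, "")
  case code
  when /\A\d{9}[\dX]\z/
    # ISBN-10: weights run 10 down to 1, X counts as 10,
    # and the weighted sum must be divisible by 11.
    sum = code.chars.each_with_index.sum do |ch, i|
      (ch == "X" ? 10 : ch.to_i) * (10 - i)
    end
    (sum % 11).zero?
  when /\A\d{13}\z/
    # ISBN-13: weights alternate 1, 3, 1, 3, ...
    # and the weighted sum must be divisible by 10.
    sum = code.chars.each_with_index.sum do |ch, i|
      ch.to_i * (i.even? ? 1 : 3)
    end
    (sum % 10).zero?
  else
    false
  end
end

puts isbn_checksum_ok?("978-0393317732") # the ISBN used in the README example
puts isbn_checksum_ok?("0-306-40615-2")  # an ISBN-10 with hyphens
```

Anything that is not 10 or 13 characters long after cleanup is rejected outright, so the checksum arithmetic only runs on plausible candidates.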

data/VERSION CHANGED

@@ -1 +1 @@
-0.1.0
+0.2.0

data/lib/risbn.rb CHANGED

@@ -17,7 +17,7 @@ class RISBN
   # Provide a string with the isbn. Any non digit or X character will be removed.
   def initialize(code = "")
-    @isbn = code.to_s.upcase.gsub(/[^0-9X]/, "")
+    @isbn = (code || "").to_s.upcase.gsub(/[^0-9X]/, "")
   end
   def valid?
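
The one-line change in this hunk adds an explicit `nil` guard to the normalization step. A minimal standalone sketch of that expression follows, wrapped in a hypothetical `normalize_isbn` helper rather than the gem's `RISBN` class:

```ruby
# Hypothetical helper that mirrors the expression in the 0.2.0 initializer.
def normalize_isbn(code = "")
  # Fall back to "" for nil, upcase so a trailing "x" becomes "X",
  # then drop everything that is not a digit or X.
  (code || "").to_s.upcase.gsub(/[^0-9X]/, "")
end

p normalize_isbn("978-0393317732")
p normalize_isbn("0-8044-2957-x")
p normalize_isbn(nil)
p normalize_isbn("ISBN-13: 978-0393317732")
```

The last call shows a limit of the filter: the digits in the `ISBN-13:` label survive, so the result has 15 characters. That is presumably why the README extracts the code with `RISBN.parse_first` instead of passing free text to the constructor.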

data/lib/risbn/scanner.rb ADDED

@@ -0,0 +1,55 @@
+require 'shellwords'
+require 'tmpdir'
+require 'iconv'
+class RISBN
+  # Scan a file for an isbn. Currently only text files, pdf and chm files are allowed.
+  # Uses unix 'file' command to identify the file.
+  # For pdf scanning uses poppler, for chm scanning uses archmage.
+  module Scanner
+    extend self
+    # provide a file path of a file to scan for the first found isbn.
+    # currently scans pdfs using poppler, chm using archmage and text files.
+    # Also, requires the unix utility "file"
+    def scan(path)
+      case identify(path)
+      when /PDF/      then scan_pdf(path)
+      when /HtmlHelp/ then scan_chm(path)
+      when /text/     then scan_txt(path)
+      end || RISBN.new
+    end
+    def identify(path)
+      File.file?(path) ? %x|file -F :::: #{path.to_s.shellescape}|.split("::::").last.strip : ""
+    end
+    def scan_chm(path)
+      Dir.mktmpdir do |dir|
+        tmp = File.join(dir, "tempfile.txt")
+        system("python -W ignore $(which archmage) -c text #{ path.to_s.shellescape } #{ tmp.to_s.shellescape } 2>&1 > /dev/null")
+        scan_txt(tmp)
+      end
+    end
+    def scan_pdf(path)
+      Dir.mktmpdir do |dir|
+        tmp = File.join(dir, "tempfile.txt")
+        system("pdftotext -q -f 0 -l 20 -raw -nopgbrk #{ path.to_s.shellescape } #{ tmp.to_s.shellescape }")
+        scan_txt(tmp)
+      end
+    end
+    def scan_txt(path)
+      IO.foreach(path) do |line|
+        isbn = RISBN.parse_first(line)
+        return isbn if isbn.valid?
+      end
+      nil
+    rescue # any problem with the text encoding
+      nil
+    end
+  end
+end
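
`Scanner.scan` picks a scanner by matching the description printed by the `file` utility against three patterns. The sketch below isolates that dispatch as a pure function so it can be tried without `file`, Poppler, or Archmage installed. The name `scanner_for` and the sample description strings are assumptions for illustration; real `file` output varies between versions and platforms.

```ruby
# Hypothetical pure-Ruby version of the dispatch in Scanner.scan: it maps a
# description string (as printed by the `file` utility) to a scanner name.
def scanner_for(description)
  case description.to_s
  when /PDF/      then :scan_pdf
  when /HtmlHelp/ then :scan_chm
  when /text/     then :scan_txt
  end # returns nil when no pattern matches
end

# Illustrative descriptions only; not captured from a real `file` run.
samples = [
  "PDF document, version 1.4",
  "MS Windows HtmlHelp Data",
  "ASCII text",
  "JPEG image data"
]

samples.each do |desc|
  puts "#{desc} -> #{scanner_for(desc).inspect}"
end
```

An unmatched description yields `nil`, which is the case the added `scan` method covers with its `|| RISBN.new` fallback. The patterns are case-sensitive, so a description has to contain `PDF`, `HtmlHelp`, or lowercase `text` to be routed.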

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: risbn
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Emmanuel Oga
@@ -39,6 +39,7 @@ files:
 - Rakefile
 - VERSION
 - lib/risbn.rb
+- lib/risbn/scanner.rb
 - spec/risbn_spec.rb
 - spec/spec_helper.rb
 has_rdoc: true