RubyGems - docsplit_images - Versions diffs - 0.1.9 → 0.2.0 - Mend

docsplit_images 0.1.9 → 0.2.0


Files changed (6)

data/README.markdown +68 -26
data/VERSION +1 -1
data/docsplit_images.gemspec +3 -3
data/lib/docsplit_images/conversion.rb +26 -3
data/lib/docsplit_images.rb +1 -0
metadata +4 -4

data/README.markdown CHANGED

@@ -10,62 +10,80 @@ Docsplit images is used to convert a document file (pdf, xls, xlsx, ppt, pptx, d
 #### 1. Install GraphicsMagick. Its ‘gm’ command is used to generate images. Either compile it from source, or use a package manager:
-	[aptitude | port | brew] install graphicsmagick
+```bash
+[aptitude | port | brew] install graphicsmagick
+```
 #### 2. Install Poppler. On Linux, use aptitude, apt-get or yum:
-	aptitude install poppler-utils poppler-data
+```bash
+aptitude install poppler-utils poppler-data
+```
 On Mac, you can install from source or use MacPorts:
-	sudo port install poppler | brew install poppler
+```bash
+sudo port install poppler | brew install poppler
+```
 #### 3. (Optional) Install Ghostscript:
-	[aptitude | port | brew] install ghostscript
+```bash
+[aptitude | port | brew] install ghostscript
+```
 Ghostscript is required to convert PDF and Postscript files.
 #### 4. (Optional) Install Tesseract:
-	[aptitude | port | brew] install [tesseract | tesseract-ocr]
+```bash
+[aptitude | port | brew] install [tesseract | tesseract-ocr]
+```
 Without Tesseract installed, you'll still be able to extract text from documents, but you won't be able to automatically OCR them.
 #### 5. (Optional) Install pdftk. On Linux, use aptitude, apt-get or yum:
-	aptitude install pdftk
+```bash
+aptitude install pdftk
+```
 On the Mac, you can download a [http://www.pdflabs.com/docs/install-pdftk/](recent installer for the binary). Without pdftk installed, you can use Docsplit, but won't be able to split apart a multi-page PDF into single-page PDFs.
 #### 6. (Optional) Install OpenOffice. On Linux, use aptitude, apt-get or yum:
-	aptitude install openoffice.org openoffice.org-java-common
-  On the Mac, download and install the [http://www.openoffice.org/download/index.html](http://www.openoffice.org/download/index.html).
+```bash
+aptitude install openoffice.org openoffice.org-java-common
+```
+On Mac, download and install [http://www.openoffice.org/download/index.html](http://www.openoffice.org/download/index.html).
 ### Install Gem
-	gem 'docsplit_images', :git => 'git@github.com:jameshuynh/docsplit_images.git', tag: "v0.1.7"
+	gem 'docsplit_images', :git => 'git@github.com:jameshuynh/docsplit_images.git', tag: "v0.2.0"
 ## Setting Up
 From terminal, type the command to install
-	bundle
-	rails g docsplit_images <table_name> <attachment_field_name>
-	# e.g. rails generate docsplit_images asset document
-	rake db:migrate
+```bash
+bundle
+rails g docsplit_images <table_name> <attachment_field_name>
+# e.g. rails generate docsplit_images asset document
+rake db:migrate
+```
 In your model:
-	class Asset < ActiveRecord::Base
-	  ...
-	  attr_accessible :mydocument
-	  has_attached_file :mydocument
-	  docsplit_images_conversion_for :mydocument, {size: "800x"}
-	  ...
-	end
+```ruby
+class Asset < ActiveRecord::Base
+  ...
+  attr_accessible :mydocument
+  has_attached_file :mydocument
+  docsplit_images_conversion_for :mydocument, {size: "800x"}
+  ...
+end
+```
 ## Processing Images
@@ -75,15 +93,39 @@ docsplit_images requires delayed_job to be turned on the process.
 While it is processing using [https://github.com/collectiveidea/delayed_job](delayed_job), you can check if it is processing by accessing attribute ``is_processing_image``
-	asset.is_processing_image?
+```ruby
+asset.is_processing_image?
+```
+## Total number of pages
+* If your document file is not PDF, this will be non-zero after the internal conversion to PDF has been completed.
+```ruby
+asset.number_of_images_entry
+```
+## Checking the number of images which has been completed
+```ruby
+asset.number_of_completed_images
+```
+## Checking the overall conversion progress
+```ruby
+asset.images_conversion_progress
+# => 0.45 (which is 45%)
+```
 ## Accessing list of images using ``document_images_list``
 ``document_images_list`` will return a list of URL of images converting from the document
-	asset.document_images_list
-	# => ["/system/myfile_revisions/files/000/000/019/images/SBA_Admin_workflow_1.png", "/system/myfile_revisions/files/000/000/019/images/SBA_Admin_workflow_2.png", ...]
+```ruby
+asset.document_images_list
+# => ["/system/myfile_revisions/files/000/000/019/images/SBA_Admin_workflow_1.png", "/system/myfile_revisions/files/000/000/019/images/SBA_Admin_workflow_2.png", ...]
+```
 Contributing to docsplit_images
 -------------
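
The progress methods documented in the README hunk above can be combined into a simple status check. The following is only a sketch: `FakeAsset`, `progress_ratio`, and `conversion_status` are hypothetical names introduced here so the example runs without Rails, Paperclip, or delayed_job; only `is_processing_image?`, `number_of_images_entry`, `number_of_completed_images`, and `document_images_list` come from the README. The zero-page guard is an assumption of this sketch, not something the gem is shown to do.

```ruby
# Hypothetical stand-in for a model that uses docsplit_images.
# Only the method names mirror the README; the Struct itself is illustrative.
FakeAsset = Struct.new(:is_processing_image, :number_of_images_entry,
                       :number_of_completed_images, :document_images_list) do
  def is_processing_image?
    is_processing_image
  end
end

# Hypothetical helper: completed pages divided by total pages, rounded to
# two decimals. Returns 0.0 while the page count is still unknown or zero.
def progress_ratio(completed, total)
  return 0.0 if total.nil? || total.zero?
  (completed.to_f / total).round(2)
end

# Hypothetical helper: a one-line status string for logs or a UI.
def conversion_status(asset)
  if asset.is_processing_image?
    ratio = progress_ratio(asset.number_of_completed_images,
                           asset.number_of_images_entry)
    "processing: #{(ratio * 100).round}%"
  else
    "done: #{asset.document_images_list.size} images"
  end
end

busy = FakeAsset.new(true, 20, 9, [])
puts conversion_status(busy)  # => processing: 45%
```

Guarding the zero case matters for non-PDF documents: per the README, the page count only becomes non-zero once the internal conversion to PDF has finished.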

data/VERSION CHANGED

@@ -1 +1 @@
-0.1.9
+0.2.0

data/docsplit_images.gemspec CHANGED

@@ -5,11 +5,11 @@
 Gem::Specification.new do |s|
   s.name = "docsplit_images"
-  s.version = "0.1.9"
+  s.version = "0.2.0"
   s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
   s.authors = ["jameshuynh"]
-  s.date = "2013-04-19"
+  s.date = "2013-06-01"
   s.description = "Split Images for your document in one line of code"
   s.email = "james@rubify.com"
   s.extra_rdoc_files = [
@@ -37,7 +37,7 @@ Gem::Specification.new do |s|
   s.homepage = "http://github.com/jameshuynh/docsplit_images"
   s.licenses = ["MIT"]
   s.require_paths = ["lib"]
-  s.rubygems_version = "1.8.23"
+  s.rubygems_version = "1.8.25"
   s.summary = "Split Images for your document"
   if s.respond_to? :specification_version then

data/lib/docsplit_images/conversion.rb CHANGED

@@ -24,11 +24,34 @@ module DocsplitImages
         parent_dir = File.dirname(File.dirname(self.send(self.class.docsplit_attachment_name).path))
         FileUtils.rm_rf("#{parent_dir}/images")
         FileUtils.mkdir("#{parent_dir}/images")
-        Docsplit.extract_images(self.send(self.class.docsplit_attachment_name).path, self.class.docsplit_attachment_options.merge({:output => "#{parent_dir}/images"}))
-        self.number_of_images_entry = Dir.entries("#{parent_dir}/images").size - 2
+        doc_path = self.send(self.class.docsplit_attachment_name).path
+        ext = File.extname(doc_path)
+        temp_pdf_path = if ext.downcase == '.pdf'
+          doc_path
+        else
+          tempdir = File.join(Dir.tmpdir, 'docsplit')
+          Docsplit.extract_pdf([doc_path], {:output => tempdir})
+          File.join(tempdir, File.basename(doc, ext) + '.pdf')
+        end
+        self.number_of_images_entry = Docsplit.extract_length(temp_pdf_path)
+        self.save(validate: false)
+        # Going to convert to images
+        Docsplit::ImageExtractor.new.extract(temp_pdf_path, self.class.docsplit_attachment_options.merge({:output => "#{parent_dir}/images"}))
         @file_has_changed = false
         self.is_processing_image = false
-        self.save(:validate => false)
+        self.save(:validate => false)
+      end
+      def number_of_completed_images
+        parent_dir = File.dirname(File.dirname(self.send(self.class.docsplit_attachment_name).path))
+        return Dir.entries("#{parent_dir}/images").size - 2
+      end
+      # return the progress in term of percentage
+      def images_conversion_progress
+        return ("%.2f" % (number_of_completed_images * 1.0 / self.number_of_images_entry)).to_f if self.is_pdf_convertible?
+        return 1
       end
       ## paperclip overriding
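
The hunk above derives the intermediate PDF path and counts finished images from directory entries. The sketch below isolates both steps as plain functions so they can be checked without Docsplit installed. `intermediate_pdf_path` and `completed_image_count` are hypothetical names that are not part of the gem. Note that the hunk calls `File.basename(doc, ext)` although no `doc` variable is visible in the shown lines; this sketch assumes `doc_path` was intended.

```ruby
require 'tmpdir'     # Dir.tmpdir and Dir.mktmpdir live in this stdlib file
require 'fileutils'

# Hypothetical helper: where the intermediate PDF is expected to be,
# following the hunk's logic. PDFs are used as-is; other formats are
# assumed to be converted into `tempdir` under the same base name.
def intermediate_pdf_path(doc_path, tempdir = File.join(Dir.tmpdir, 'docsplit'))
  ext = File.extname(doc_path)
  return doc_path if ext.downcase == '.pdf'
  File.join(tempdir, File.basename(doc_path, ext) + '.pdf')
end

# Hypothetical helper: count regular files in the images directory.
# The hunk uses Dir.entries(...).size - 2 to discount "." and "..";
# globbing avoids that arithmetic and tolerates a missing directory.
def completed_image_count(images_dir)
  return 0 unless Dir.exist?(images_dir)
  Dir.glob(File.join(images_dir, '*')).count { |path| File.file?(path) }
end

puts intermediate_pdf_path('/docs/report.DOCX', '/tmp/docsplit')
# => /tmp/docsplit/report.pdf
```

Returning 0 for a missing directory is a design choice of this sketch; the released method would raise if `images` does not exist yet.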

data/lib/docsplit_images.rb CHANGED

@@ -1,4 +1,5 @@
 require 'rubygems'
+require 'docsplit'
 require 'docsplit_images/conversion'
 module DocsplitImages
   class Engine < Rails::Engine

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: docsplit_images
 version: !ruby/object:Gem::Version
-  version: 0.1.9
+  version: 0.2.0
   prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2013-04-19 00:00:00.000000000 Z
+date: 2013-06-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: paperclip
@@ -162,7 +162,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash: 3877530749444095116
+      hash: -352764726966913135
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
@@ -171,7 +171,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 1.8.23
+rubygems_version: 1.8.25
 signing_key:
 specification_version: 3
 summary: Split Images for your document