grim 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: d9b3ea24639d459434c20b2f23e417e13b811a6a
4
- data.tar.gz: d25c76a9a4ead4f3a3067eb40a364ff532460911
3
+ metadata.gz: 75e501e9b8b7daf4549a07bf29f3e34baa5099f1
4
+ data.tar.gz: 1a96dc37be69e9474305329c74b4a556b20d2a1d
5
5
  SHA512:
6
- metadata.gz: cb6007819bb63ba6b07c253e978bcad0b792adcfac87f96ff1ea3b11fa4e94621384bb182e8413abc08efa37f984a70f6d3f1643a2d922e7f9147b42ae98a487
7
- data.tar.gz: 64c20dc98e97cf1a341ef904d7c038feffe54acdd00e21c4d2f1dded2d5a756861e8e665d7910f9b94d0f457dd8734a0951110095c0b6f1b3116ea35d7866224
6
+ metadata.gz: 82ee2f6442b015ccd7716467ac18d8f0081e6741e72a36fb3ee48032287b9915785dbae78ff4e74a182bc0b801b55228f462f2898839a45d9b6f73683b9b8f3a
7
+ data.tar.gz: 54356b62ff9d386dedf2107d70efbba276d4be24c7168c190b8f9bddcf04b0dcc9b36186de49e3804e53ae0fa0832eed26cc4cbe48b103837f50dbed00211ecf
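The checksums above can be recomputed locally to confirm that a downloaded `metadata.gz` or `data.tar.gz` matches the published values. A minimal sketch, assuming the archive bytes have already been read into a string; `checksums_for` is a hypothetical helper name, not part of RubyGems or Grim:

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems or Grim): returns the two
# digests that checksums.yaml records, as lowercase hex strings.
def checksums_for(data)
  {
    'SHA1'   => Digest::SHA1.hexdigest(data),
    'SHA512' => Digest::SHA512.hexdigest(data)
  }
end
```

For example, `checksums_for(File.binread('data.tar.gz'))['SHA1']` would be compared against the `data.tar.gz` entry under `SHA1:`.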
data/Gemfile CHANGED
@@ -1,4 +1,4 @@
1
1
  source "https://rubygems.org"
2
2
  gemspec
3
3
  gem "rake"
4
- gem "rspec"
4
+ gem "rspec", "~> 3.2.0"
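The new `"~> 3.2.0"` constraint pins rspec to the 3.2.x series (at least 3.2.0, below 3.3.0). A small sketch of how that can be checked with RubyGems' own `Gem::Requirement`; the helper name `rspec_version_allowed?` is made up for illustration:

```ruby
require 'rubygems'

# The requirement string is copied from the Gemfile line above.
RSPEC_REQUIREMENT = Gem::Requirement.new('~> 3.2.0')

# Hypothetical helper: true if the given version string satisfies the
# pessimistic constraint.
def rspec_version_allowed?(version)
  RSPEC_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end
```

Under this constraint `rspec_version_allowed?('3.2.9')` holds, while 3.3.0 and anything in the 2.x series are rejected.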
@@ -0,0 +1,106 @@
1
+ ```
2
+ ,____
3
+ |---.\
4
+ ___ | `
5
+ / .-\ ./=)
6
+ | |"|_/\/|
7
+ ; |-;| /_|
8
+ / \_| |/ \ |
9
+ / \/\( |
10
+ | / |` ) |
11
+ / \ _/ |
12
+ /--._/ \ |
13
+ `/|) | /
14
+ / | |
15
+ .' | |
16
+ / \ |
17
+ (_.-.__.__./ /
18
+ ```
19
+
20
+ # Grim
21
+
22
+ Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
23
+
24
+ ## Prerequisites
25
+
26
+ You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using [Homebrew](http://mxcl.github.com/homebrew/) to get them installed.
27
+
28
+ ```bash
29
+ $ brew install ghostscript imagemagick poppler
30
+ ```
31
+
32
+ ## Installation
33
+
34
+ ```bash
35
+ $ gem install grim
36
+ ```
37
+
38
+ ## Usage
39
+
40
+ ```ruby
41
+ pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
42
+ count = pdf.count # returns the number of pages in the pdf
43
+ png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
44
+ text = pdf[3].text # returns text as a String
45
+
46
+ pdf.each do |page|
47
+ puts page.text
48
+ end
49
+ ```
50
+
51
+ We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
52
+
53
+ ```ruby
54
+ # specifying one processor with specific ImageMagick and GhostScript paths
55
+ Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
56
+
57
+ # multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
58
+ Grim.processor = Grim::MultiProcessor.new([
59
+ Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
60
+ Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
61
+ ])
62
+
63
+ pdf = Grim.reap('/path/to/pdf')
64
+ ```
65
+
66
+ You can even specify a Windows executable :zap:
67
+
68
+ ```ruby
69
+ # specifying another ghostscript executable, win64 in this example
70
+ # the ghostscript/bin folder still has to be in the PATH for this to work
71
+ Grim.processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
72
+
73
+ pdf = Grim.reap('/path/to/pdf')
74
+ ```
75
+
76
+ `Grim::ImageMagickProcessor#save` supports several options as well:
77
+
78
+ ```ruby
79
+ pdf = Grim.reap("/path/to/pdf")
80
+ pdf[0].save('/path/to/image.png', {
81
+ :width => 600, # defaults to 1024
82
+ :density => 72, # defaults to 300
83
+ :quality => 60, # defaults to 90
84
+ :colorspace => "CMYK", # defaults to "RGB"
85
+ :alpha => "Activate" # not used when not set
86
+ })
87
+ ```
88
+
89
+ ## Reference
90
+
91
+ * [jonmagic.com: Grim](http://jonmagic.com/blog/archives/2011/09/06/grim/)
92
+ * [jonmagic.com: Grim MultiProcessor](http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/)
93
+
94
+ ## Contributors
95
+
96
+ * [@jonmagic](https://github.com/jonmagic)
97
+ * [@jnunemaker](https://github.com/jnunemaker)
98
+ * [@bryckbost](https://github.com/bryckbost)
99
+ * [@bkeepers](https://github.com/bkeepers)
100
+ * [@BobaFaux](https://github.com/BobaFaux)
101
+ * [@Rubikan](https://github.com/Rubikan)
102
+ * [@victormier](https://github.com/victormier)
103
+
104
+ ## License
105
+
106
+ See [LICENSE](LICENSE) for details.
@@ -3,32 +3,44 @@ module Grim
3
3
 
4
4
  # ghostscript prints out a warning, this regex matches it
5
5
  WarningRegex = /\*\*\*\*.*\n/
6
+ DefaultImagemagickPath = 'convert'
7
+ DefaultGhostScriptPath = 'gs'
6
8
 
7
9
  def initialize(options={})
8
- @imagemagick_path = options[:imagemagick_path] || 'convert'
9
- @ghostscript_path = options[:ghostscript_path]
10
- @original_path = ENV['PATH']
10
+ @imagemagick_path = options[:imagemagick_path] || DefaultImagemagickPath
11
+ @ghostscript_path = options[:ghostscript_path] || DefaultGhostScriptPath
12
+ @original_path = ENV['PATH']
11
13
  end
12
14
 
13
15
  def count(path)
14
- command = ["-dNODISPLAY", "-q",
16
+ command = [@ghostscript_path, "-dNODISPLAY", "-q",
15
17
  "-sFile=#{Shellwords.shellescape(path)}",
16
18
  File.expand_path('../../../lib/pdf_info.ps', __FILE__)]
17
- @ghostscript_path ? command.unshift(@ghostscript_path) : command.unshift('gs')
18
19
  result = `#{command.join(' ')}`
19
20
  result.gsub(WarningRegex, '').to_i
20
21
  end
21
22
 
22
23
  def save(pdf, index, path, options)
23
- width = options.fetch(:width, Grim::WIDTH)
24
- density = options.fetch(:density, Grim::DENSITY)
25
- quality = options.fetch(:quality, Grim::QUALITY)
24
+ width = options.fetch(:width, Grim::WIDTH)
25
+ density = options.fetch(:density, Grim::DENSITY)
26
+ quality = options.fetch(:quality, Grim::QUALITY)
26
27
  colorspace = options.fetch(:colorspace, Grim::COLORSPACE)
27
- command = [@imagemagick_path, "-resize", width.to_s, "-antialias", "-render",
28
- "-quality", quality.to_s, "-colorspace", colorspace,
29
- "-interlace", "none", "-density", density.to_s,
30
- "#{Shellwords.shellescape(pdf.path)}[#{index}]", path]
31
- command.unshift("PATH=#{File.dirname(@ghostscript_path)}:#{ENV['PATH']}") if @ghostscript_path
28
+ alpha = options[:alpha]
29
+
30
+ command = []
31
+ command << @imagemagick_path
32
+ command << "-resize #{width}"
33
+ command << "-alpha #{alpha}" if alpha
34
+ command << "-antialias"
35
+ command << "-render"
36
+ command << "-quality #{quality}"
37
+ command << "-colorspace #{colorspace}"
38
+ command << "-interlace none"
39
+ command << "-density #{density}"
40
+ command << "#{Shellwords.shellescape(pdf.path)}[#{index}]"
41
+ command << path
42
+
43
+ command.unshift("PATH=#{File.dirname(@ghostscript_path)}:#{ENV['PATH']}") if @ghostscript_path && @ghostscript_path != DefaultGhostScriptPath
32
44
 
33
45
  result = `#{command.join(' ')}`
34
46
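The rewritten `save` in the hunk above builds the `convert` command one flag at a time, adds `-alpha` only when the option is given, and prefixes `PATH=...` only for a non-default Ghostscript path. A rough sketch of that assembly as a pure function, so the resulting string can be inspected without running ImageMagick. `build_convert_command`, the `paths` hash and the `:env_path` key are hypothetical, and the default values are assumed from the README and the constants spec:

```ruby
require 'shellwords'

# Hypothetical helper, not Grim's API: it mirrors the flag order of the
# 1.2.0 hunk but returns the command string instead of executing it.
DEFAULT_GHOSTSCRIPT = 'gs'

def build_convert_command(pdf_path, index, out_path, options = {}, paths = {})
  imagemagick = paths.fetch(:imagemagick_path, 'convert')
  ghostscript = paths.fetch(:ghostscript_path, DEFAULT_GHOSTSCRIPT)
  env_path    = paths.fetch(:env_path, ENV['PATH'])

  # Defaults assumed to match Grim::WIDTH, DENSITY, QUALITY and COLORSPACE.
  width      = options.fetch(:width, 1024)
  density    = options.fetch(:density, 300)
  quality    = options.fetch(:quality, 90)
  colorspace = options.fetch(:colorspace, 'RGB')
  alpha      = options[:alpha]

  command = [imagemagick, "-resize #{width}"]
  command << "-alpha #{alpha}" if alpha
  command << '-antialias' << '-render'
  command << "-quality #{quality}" << "-colorspace #{colorspace}"
  command << '-interlace none' << "-density #{density}"
  command << "#{Shellwords.shellescape(pdf_path)}[#{index}]" << out_path

  # As in the hunk, PATH is only prefixed for a non-default Ghostscript.
  unless ghostscript == DEFAULT_GHOSTSCRIPT
    command.unshift("PATH=#{File.dirname(ghostscript)}:#{env_path}")
  end
  command.join(' ')
end
```

Only the PDF path goes through `Shellwords.shellescape` here, as in the hunk; the output path and option values are interpolated as given. For a bare executable name such as `"gswin64c.exe"`, `File.dirname` returns `"."`, so the prefix becomes `PATH=.:...`.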
 
@@ -1,4 +1,4 @@
1
1
  # encoding: UTF-8
2
2
  module Grim
3
- VERSION = "1.1.0" unless defined?(::Grim::VERSION)
3
+ VERSION = "1.2.0" unless defined?(::Grim::VERSION)
4
4
  end
@@ -16,7 +16,17 @@ describe Grim::ImageMagickProcessor do
16
16
  end
17
17
 
18
18
  it "should return page count" do
19
- @processor.count(fixture_path("smoker.pdf")).should == 25
19
+ expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
20
+ end
21
+ end
22
+
23
+ describe "#count with windows executable", :windows => true do
24
+ before(:each) do
25
+ @processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
26
+ end
27
+
28
+ it "should return page count" do
29
+ expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
20
30
  end
21
31
  end
22
32
 
@@ -30,13 +40,13 @@ describe Grim::ImageMagickProcessor do
30
40
 
31
41
  it "should create the file" do
32
42
  @processor.save(@pdf, 0, @path, {})
33
- File.exist?(@path).should be_true
43
+ expect(File.exist?(@path)).to be(true)
34
44
  end
35
45
 
36
46
  it "should use default width of 1024" do
37
47
  @processor.save(@pdf, 0, @path, {})
38
48
  width, height = dimensions_for_path(@path)
39
- width.should == 1024
49
+ expect(width).to eq(1024)
40
50
  end
41
51
  end
42
52
 
@@ -50,7 +60,7 @@ describe Grim::ImageMagickProcessor do
50
60
 
51
61
  it "should set width" do
52
62
  width, height = dimensions_for_path(@path)
53
- width.should == 20
63
+ expect(width).to eq(20)
54
64
  end
55
65
  end
56
66
 
@@ -67,7 +77,7 @@ describe Grim::ImageMagickProcessor do
67
77
  Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:quality => 90})
68
78
  higher_size = File.size(@path)
69
79
 
70
- (lower_size < higher_size).should be_true
80
+ expect(lower_size < higher_size).to be(true)
71
81
  end
72
82
  end
73
83
 
@@ -81,7 +91,7 @@ describe Grim::ImageMagickProcessor do
81
91
  lower_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 72}) }
82
92
  higher_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 300}) }
83
93
 
84
- (lower_time < higher_time).should be_true
94
+ expect(lower_time < higher_time).to be(true)
85
95
  end
86
96
  end
87
97
 
@@ -99,7 +109,23 @@ describe Grim::ImageMagickProcessor do
99
109
  file1_size = File.stat(@path1).size
100
110
  file2_size = File.stat(@path2).size
101
111
 
102
- file1_size.should_not == file2_size
112
+ expect(file1_size).to_not eq(file2_size)
113
+ end
114
+ end
115
+
116
+ describe "#save with alpha option" do
117
+ before(:each) do
118
+ @path1 = tmp_path("to_png_spec-1.png")
119
+ @path2 = tmp_path("to_png_spec-2.png")
120
+ @pdf = Grim::Pdf.new(fixture_path("remove_alpha.pdf"))
121
+ end
122
+
123
+ it "should use alpha" do
124
+ Grim::ImageMagickProcessor.new.save(@pdf, 0, @path1, {:alpha => 'Set'})
125
+ Grim::ImageMagickProcessor.new.save(@pdf, 0, @path2, {:alpha => 'Remove'})
126
+
127
+ expect(`convert #{@path1} -verbose info:`.include?("alpha: 8-bit")).to be(true)
128
+ expect(`convert #{@path2} -verbose info:`.include?("alpha: 1-bit")).to be(true)
103
129
  end
104
130
  end
105
- end
131
+ end
@@ -14,9 +14,9 @@ describe Grim::MultiProcessor do
14
14
 
15
15
  describe "#count" do
16
16
  it "should try processors until it succeeds" do
17
- @failure.stub(:count){""}
18
- @success.should_receive(:count).and_return(30)
19
- @extra.should_not_receive(:count)
17
+ allow(@failure).to receive(:count).and_return("")
18
+ expect(@success).to receive(:count).and_return(30)
19
+ expect(@extra).to_not receive(:count)
20
20
 
21
21
  @processor.count(@path)
22
22
  end
@@ -24,19 +24,19 @@ describe Grim::MultiProcessor do
24
24
 
25
25
  describe "#save" do
26
26
  it "should try processors until it succeeds" do
27
- @failure.stub(:save){false}
28
- @success.should_receive(:save).and_return(true)
29
- @extra.should_not_receive(:save)
27
+ allow(@failure).to receive(:save).and_return(false)
28
+ expect(@success).to receive(:save).and_return(true)
29
+ expect(@extra).to_not receive(:save)
30
30
 
31
31
  @processor.save(@pdf, 0, @path, {})
32
32
  end
33
33
 
34
34
  it "should raise error if all processors fail" do
35
- @failure.should_receive(:save).and_return(false)
36
- @success.should_receive(:save).and_return(false)
37
- @extra.should_receive(:save).and_return(false)
35
+ expect(@failure).to receive(:save).and_return(false)
36
+ expect(@success).to receive(:save).and_return(false)
37
+ expect(@extra).to receive(:save).and_return(false)
38
38
 
39
- lambda { @processor.save(@pdf, 0, @path, {}) }.should raise_error(Grim::UnprocessablePage)
39
+ expect { @processor.save(@pdf, 0, @path, {}) }.to raise_error(Grim::UnprocessablePage)
40
40
  end
41
41
  end
42
- end
42
+ end
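The specs above pin down the fallback contract: processors are tried in order, an empty string from `count` or `false` from `save` counts as failure, later processors are not called once one succeeds, and `save` raises when every processor fails. A sketch of a class with that behaviour, inferred from the specs only. This is not the source of `Grim::MultiProcessor`; the class names are hypothetical, and the return value of `count` when every processor fails is an assumption:

```ruby
# Hypothetical stand-ins, inferred from the specs; not Grim's source.
class UnprocessablePageSketch < StandardError; end

class FallbackProcessorSketch
  def initialize(processors)
    @processors = processors
  end

  # Returns the first non-empty count. Falling back to 0 when every
  # processor fails is an assumption; the specs do not cover that case.
  def count(path)
    @processors.each do |processor|
      result = processor.count(path)
      return result unless result == ""
    end
    0
  end

  # Returns true as soon as one processor saves the page; raises if
  # none of them does.
  def save(pdf, index, path, options = {})
    @processors.each do |processor|
      return true if processor.save(pdf, index, path, options)
    end
    raise UnprocessablePageSketch
  end
end
```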
@@ -8,7 +8,7 @@ describe Grim::Page do
8
8
  end
9
9
 
10
10
  it "should have number" do
11
- Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number.should == 2
11
+ expect(Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number).to eq(2)
12
12
  end
13
13
 
14
14
  describe "#save" do
@@ -18,7 +18,7 @@ describe Grim::Page do
18
18
  end
19
19
 
20
20
  it "should call Grim.processor.save with pdf, index, path, and options" do
21
- Grim.processor.should_receive(:save).with(@pdf, 0, @path, {})
21
+ expect(Grim.processor).to receive(:save).with(@pdf, 0, @path, {})
22
22
  @pdf[0].save(@path)
23
23
  end
24
24
  end
@@ -30,8 +30,8 @@ describe Grim::Page do
30
30
  end
31
31
 
32
32
  it "raises an exception" do
33
- lambda { @pdf[0].save(nil) }.should raise_error(Grim::PathMissing)
34
- lambda { @pdf[0].save(' ') }.should raise_error(Grim::PathMissing)
33
+ expect { @pdf[0].save(nil) }.to raise_error(Grim::PathMissing)
34
+ expect { @pdf[0].save(' ') }.to raise_error(Grim::PathMissing)
35
35
  end
36
36
  end
37
37
 
@@ -47,13 +47,15 @@ describe Grim::Page do
47
47
  describe "#text" do
48
48
  it "should return the text from the selected page" do
49
49
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
50
- pdf[1].text.should == "Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f"
50
+ expect(pdf[1].text).to \
51
+ eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
51
52
  end
52
53
 
53
54
  it "works with full path to pdftotext" do
54
55
  pdftotext_path = `which pdftotext`.chomp
55
56
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"), pdftotext_path: pdftotext_path)
56
- pdf[1].text.should == "Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f"
57
+ expect(pdf[1].text).to \
58
+ eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
57
59
  end
58
60
  end
59
61
  end
@@ -4,23 +4,26 @@ require 'spec_helper'
4
4
  describe Grim::Pdf do
5
5
 
6
6
  it "should have a path" do
7
- Grim::Pdf.new(fixture_path("smoker.pdf")).path.should == fixture_path("smoker.pdf")
7
+ pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
8
+ expect(pdf.path).to eq(fixture_path("smoker.pdf"))
8
9
  end
9
10
 
10
11
  describe "#initialize" do
11
12
  it "should raise an error if pdf does not exist" do
12
- lambda { Grim::Pdf.new(fixture_path("booboo.pdf")) }.should raise_error(Grim::PdfNotFound)
13
+ expect {
14
+ Grim::Pdf.new(fixture_path("booboo.pdf"))
15
+ }.to raise_error(Grim::PdfNotFound)
13
16
  end
14
17
 
15
18
  it "should set path on pdf" do
16
19
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
17
- pdf.path.should == fixture_path("smoker.pdf")
20
+ expect(pdf.path).to eq(fixture_path("smoker.pdf"))
18
21
  end
19
22
  end
20
23
 
21
24
  describe "#count" do
22
25
  it "should call Grim.processor.count with pdf path" do
23
- Grim.processor.should_receive(:count).with(fixture_path("smoker.pdf"))
26
+ expect(Grim.processor).to receive(:count).with(fixture_path("smoker.pdf"))
24
27
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
25
28
  pdf.count
26
29
  end
@@ -32,19 +35,19 @@ describe Grim::Pdf do
32
35
  end
33
36
 
34
37
  it "should raise Grim::PageDoesNotExist if page doesn't exist" do
35
- lambda { @pdf[25] }.should raise_error(Grim::PageNotFound)
38
+ expect { @pdf[25] }.to raise_error(Grim::PageNotFound)
36
39
  end
37
40
 
38
41
  it "should return an instance of Grim::Page if page exists" do
39
- @pdf[24].class.should == Grim::Page
42
+ expect(@pdf[24].class).to eq(Grim::Page)
40
43
  end
41
44
  end
42
45
 
43
46
  describe "#each" do
44
47
  it "should be iterable" do
45
48
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
46
- pdf.map {|p| p.number }.should == (1..25).to_a
49
+ expect(pdf.map {|p| p.number }).to eq((1..25).to_a)
47
50
  end
48
51
  end
49
52
 
50
- end
53
+ end
@@ -3,32 +3,32 @@ require 'spec_helper'
3
3
 
4
4
  describe Grim do
5
5
  it "should have a default processor" do
6
- Grim.processor.class.should == Grim::ImageMagickProcessor
6
+ expect(Grim.processor.class).to eq(Grim::ImageMagickProcessor)
7
7
  end
8
8
 
9
9
  it "should have a VERSION constant" do
10
- Grim.const_defined?('VERSION').should be_true
10
+ expect(Grim.const_defined?('VERSION')).to be(true)
11
11
  end
12
12
 
13
13
  it "should have WIDTH constant set to 1024" do
14
- Grim::WIDTH.should == 1024
14
+ expect(Grim::WIDTH).to eq(1024)
15
15
  end
16
16
 
17
17
  it "should have QUALITY constant set to 90" do
18
- Grim::QUALITY.should == 90
18
+ expect(Grim::QUALITY).to eq(90)
19
19
  end
20
20
 
21
21
  it "should have DENSITY constant set to 300" do
22
- Grim::DENSITY.should == 300
22
+ expect(Grim::DENSITY).to eq(300)
23
23
  end
24
24
 
25
25
  it "should have COLORSPACE constant set to 'RGB'" do
26
- Grim::COLORSPACE.should == 'RGB'
26
+ expect(Grim::COLORSPACE).to eq('RGB')
27
27
  end
28
28
 
29
29
  describe "#reap" do
30
30
  it "should return an instance of Grim::Pdf" do
31
- Grim.reap(fixture_path("smoker.pdf")).class.should == Grim::Pdf
31
+ expect(Grim.reap(fixture_path("smoker.pdf")).class).to eq(Grim::Pdf)
32
32
  end
33
33
  end
34
- end
34
+ end
@@ -2,6 +2,7 @@
2
2
  require 'benchmark'
3
3
  require 'rubygems'
4
4
  require 'bundler/setup'
5
+ require 'rbconfig'
5
6
 
6
7
  require 'grim'
7
8
 
@@ -28,4 +29,5 @@ end
28
29
 
29
30
  RSpec.configure do |config|
30
31
  config.include(FileHelpers)
32
+ config.filter_run_excluding :windows => true if RbConfig::CONFIG['host_os'].match(/mswin|mingw|cygwin/) == nil
31
33
  end
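The added `filter_run_excluding` line skips specs tagged `:windows => true` unless `host_os` looks like a Windows build. The same check, pulled out into a small predicate for illustration; `windows_host?` and `WINDOWS_HOST_PATTERN` are hypothetical names, while the pattern itself is copied from the hunk:

```ruby
require 'rbconfig'

# Pattern copied from the spec_helper hunk above.
WINDOWS_HOST_PATTERN = /mswin|mingw|cygwin/

# Hypothetical helper: true when the given host_os string looks like a
# Windows Ruby build. Defaults to the running interpreter's host_os.
def windows_host?(host_os = RbConfig::CONFIG['host_os'])
  !host_os.match(WINDOWS_HOST_PATTERN).nil?
end
```

With such a helper the configuration line could read `config.filter_run_excluding :windows => true unless windows_host?`. Note that a bare `win` pattern would also match `darwin`, which the alternation above avoids.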
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: grim
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.1.0
4
+ version: 1.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jonathan Hoyt
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-11-25 00:00:00.000000000 Z
11
+ date: 2015-02-19 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: Grim is a simple gem for extracting a page from a pdf and converting
14
14
  it to an image as well as extract the text from the page as a string. It basically
@@ -23,7 +23,7 @@ files:
23
23
  - ".gitignore"
24
24
  - Gemfile
25
25
  - LICENSE
26
- - README.textile
26
+ - README.md
27
27
  - Rakefile
28
28
  - grim.gemspec
29
29
  - lib/grim.rb
@@ -33,6 +33,7 @@ files:
33
33
  - lib/grim/pdf.rb
34
34
  - lib/grim/version.rb
35
35
  - lib/pdf_info.ps
36
+ - spec/fixtures/remove_alpha.pdf
36
37
  - spec/fixtures/smoker.pdf
37
38
  - spec/fixtures/unprocessable.pdf
38
39
  - spec/lib/grim/image_magick_processor_spec.rb
@@ -65,6 +66,7 @@ signing_key:
65
66
  specification_version: 4
66
67
  summary: Extract slides and text from a PDF.
67
68
  test_files:
69
+ - spec/fixtures/remove_alpha.pdf
68
70
  - spec/fixtures/smoker.pdf
69
71
  - spec/fixtures/unprocessable.pdf
70
72
  - spec/lib/grim/image_magick_processor_spec.rb
@@ -1,81 +0,0 @@
1
- <pre>
2
- ,____
3
- |---.\
4
- ___ | `
5
- / .-\ ./=)
6
- | |"|_/\/|
7
- ; |-;| /_|
8
- / \_| |/ \ |
9
- / \/\( |
10
- | / |` ) |
11
- / \ _/ |
12
- /--._/ \ |
13
- `/|) | /
14
- / | |
15
- .' | |
16
- / \ |
17
- (_.-.__.__./ /
18
- </pre>
19
-
20
- h1. Grim
21
-
22
- Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
23
-
24
- h2. Prerequisites
25
-
26
- You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using "Homebrew":http://mxcl.github.com/homebrew/ to get them installed.
27
-
28
- <pre><code>
29
- brew install ghostscript imagemagick poppler
30
- </code></pre>
31
-
32
- h2. Installation
33
-
34
- <pre><code>
35
- gem install grim
36
- </code></pre>
37
-
38
- h2. Usage
39
-
40
- <pre><code>
41
- pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
42
- count = pdf.count # returns the number of pages in the pdf
43
- png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
44
- text = pdf[3].text # returns text as a String
45
-
46
- pdf.each do |page|
47
- puts page.text
48
- end
49
- </pre></code>
50
-
51
- We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
52
-
53
- <pre><code>
54
- # specifying one processor with specific ImageMagick and GhostScript paths
55
- Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
56
-
57
- # multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
58
- Grim.processor = Grim::MultiProcessor.new([
59
- Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
60
- Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
61
- ])
62
-
63
- pdf = Grim.reap('/path/to/pdf)
64
- </code></pre>
65
-
66
- h2. Reference
67
-
68
- * "jonmagic.com: Grim":http://jonmagic.com/blog/archives/2011/09/06/grim/
69
- * "jonmagic.com: Grim MultiProcessor":http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/
70
-
71
- h2. Contributors
72
-
73
- * "@jonmagic":https://github.com/jonmagic
74
- * "@jnunemaker":https://github.com/jnunemaker
75
- * "@bryckbost":https://github.com/bryckbost
76
- * "@bkeepers":https://github.com/bkeepers
77
- * "@BobaFaux":https://github.com/BobaFaux
78
-
79
- h2. License
80
-
81
- See LICENSE for details.