grim 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: d9b3ea24639d459434c20b2f23e417e13b811a6a
- data.tar.gz: d25c76a9a4ead4f3a3067eb40a364ff532460911
+ metadata.gz: 75e501e9b8b7daf4549a07bf29f3e34baa5099f1
+ data.tar.gz: 1a96dc37be69e9474305329c74b4a556b20d2a1d
  SHA512:
- metadata.gz: cb6007819bb63ba6b07c253e978bcad0b792adcfac87f96ff1ea3b11fa4e94621384bb182e8413abc08efa37f984a70f6d3f1643a2d922e7f9147b42ae98a487
- data.tar.gz: 64c20dc98e97cf1a341ef904d7c038feffe54acdd00e21c4d2f1dded2d5a756861e8e665d7910f9b94d0f457dd8734a0951110095c0b6f1b3116ea35d7866224
+ metadata.gz: 82ee2f6442b015ccd7716467ac18d8f0081e6741e72a36fb3ee48032287b9915785dbae78ff4e74a182bc0b801b55228f462f2898839a45d9b6f73683b9b8f3a
+ data.tar.gz: 54356b62ff9d386dedf2107d70efbba276d4be24c7168c190b8f9bddcf04b0dcc9b36186de49e3804e53ae0fa0832eed26cc4cbe48b103837f50dbed00211ecf
data/Gemfile CHANGED
@@ -1,4 +1,4 @@
  source "https://rubygems.org"
  gemspec
  gem "rake"
- gem "rspec"
+ gem "rspec", "~> 3.2.0"
@@ -0,0 +1,106 @@
+ ```
+ ,____
+ |---.\
+ ___ | `
+ / .-\ ./=)
+ | |"|_/\/|
+ ; |-;| /_|
+ / \_| |/ \ |
+ / \/\( |
+ | / |` ) |
+ / \ _/ |
+ /--._/ \ |
+ `/|) | /
+ / | |
+ .' | |
+ / \ |
+ (_.-.__.__./ /
+ ```
+
+ # Grim
+
+ Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
+
+ ## Prerequisites
+
+ You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using [Homebrew](http://mxcl.github.com/homebrew/) to get them installed.
+
+ ```bash
+ $ brew install ghostscript imagemagick poppler
+ ```
+
+ ## Installation
+
+ ```bash
+ $ gem install grim
+ ```
+
+ ## Usage
+
+ ```ruby
+ pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
+ count = pdf.count # returns the number of pages in the pdf
+ png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
+ text = pdf[3].text # returns text as a String
+
+ pdf.each do |page|
+ puts page.text
+ end
+ ```
+
+ We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
+
+ ```ruby
+ # specifying one processor with specific ImageMagick and GhostScript paths
+ Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
+
+ # multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
+ Grim.processor = Grim::MultiProcessor.new([
+ Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
+ Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
+ ])
+
+ pdf = Grim.reap('/path/to/pdf')
+ ```
+
+ You can even specify a Windows executable :zap:
+
+ ```ruby
+ # specifying another ghostscript executable, win64 in this example
+ # the ghostscript/bin folder still has to be in the PATH for this to work
+ Grim.processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
+
+ pdf = Grim.reap('/path/to/pdf')
+ ```
+
+ `Grim::ImageMagickProcessor#save` supports several options as well:
+
+ ```ruby
+ pdf = Grim.reap("/path/to/pdf")
+ pdf[0].save('/path/to/image.png', {
+ :width => 600, # defaults to 1024
+ :density => 72, # defaults to 300
+ :quality => 60, # defaults to 90
+ :colorspace => "CMYK", # defaults to "RGB"
+ :alpha => "Activate" # not used when not set
+ })
+ ```
+
+ ## Reference
+
+ * [jonmagic.com: Grim](http://jonmagic.com/blog/archives/2011/09/06/grim/)
+ * [jonmagic.com: Grim MultiProcessor](http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/)
+
+ ## Contributors
+
+ * [@jonmagic](https://github.com/jonmagic)
+ * [@jnunemaker](https://github.com/jnunemaker)
+ * [@bryckbost](https://github.com/bryckbost)
+ * [@bkeepers](https://github.com/bkeepers)
+ * [@BobaFaux](https://github.com/BobaFaux)
+ * [@Rubikan](https://github.com/Rubikan)
+ * [@victormier](https://github.com/victormier)
+
+ ## License
+
+ See [LICENSE](LICENSE) for details.
@@ -3,32 +3,44 @@ module Grim
 
  # ghostscript prints out a warning, this regex matches it
  WarningRegex = /\*\*\*\*.*\n/
+ DefaultImagemagickPath = 'convert'
+ DefaultGhostScriptPath = 'gs'
 
  def initialize(options={})
- @imagemagick_path = options[:imagemagick_path] || 'convert'
- @ghostscript_path = options[:ghostscript_path]
- @original_path = ENV['PATH']
+ @imagemagick_path = options[:imagemagick_path] || DefaultImagemagickPath
+ @ghostscript_path = options[:ghostscript_path] || DefaultGhostScriptPath
+ @original_path = ENV['PATH']
  end
 
  def count(path)
- command = ["-dNODISPLAY", "-q",
+ command = [@ghostscript_path, "-dNODISPLAY", "-q",
  "-sFile=#{Shellwords.shellescape(path)}",
  File.expand_path('../../../lib/pdf_info.ps', __FILE__)]
- @ghostscript_path ? command.unshift(@ghostscript_path) : command.unshift('gs')
  result = `#{command.join(' ')}`
  result.gsub(WarningRegex, '').to_i
  end
 
  def save(pdf, index, path, options)
- width = options.fetch(:width, Grim::WIDTH)
- density = options.fetch(:density, Grim::DENSITY)
- quality = options.fetch(:quality, Grim::QUALITY)
+ width = options.fetch(:width, Grim::WIDTH)
+ density = options.fetch(:density, Grim::DENSITY)
+ quality = options.fetch(:quality, Grim::QUALITY)
  colorspace = options.fetch(:colorspace, Grim::COLORSPACE)
- command = [@imagemagick_path, "-resize", width.to_s, "-antialias", "-render",
- "-quality", quality.to_s, "-colorspace", colorspace,
- "-interlace", "none", "-density", density.to_s,
- "#{Shellwords.shellescape(pdf.path)}[#{index}]", path]
- command.unshift("PATH=#{File.dirname(@ghostscript_path)}:#{ENV['PATH']}") if @ghostscript_path
+ alpha = options[:alpha]
+
+ command = []
+ command << @imagemagick_path
+ command << "-resize #{width}"
+ command << "-alpha #{alpha}" if alpha
+ command << "-antialias"
+ command << "-render"
+ command << "-quality #{quality}"
+ command << "-colorspace #{colorspace}"
+ command << "-interlace none"
+ command << "-density #{density}"
+ command << "#{Shellwords.shellescape(pdf.path)}[#{index}]"
+ command << path
+
+ command.unshift("PATH=#{File.dirname(@ghostscript_path)}:#{ENV['PATH']}") if @ghostscript_path && @ghostscript_path != DefaultGhostScriptPath
 
  result = `#{command.join(' ')}`
 
@@ -1,4 +1,4 @@
  # encoding: UTF-8
  module Grim
- VERSION = "1.1.0" unless defined?(::Grim::VERSION)
+ VERSION = "1.2.0" unless defined?(::Grim::VERSION)
  end
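
The `save` hunk above now assembles the `convert` invocation one flag at a time and joins the pieces with spaces; `pdf.path` is shell-escaped, while the output `path` and the option values are interpolated as given. As a rough sketch of the same flag order with every argument escaped, assuming a hypothetical helper named `build_convert_command` (not part of Grim) and the defaults listed in the README:

```ruby
require 'shellwords'

# Hypothetical helper, not part of Grim: mirrors the flag order used in
# ImageMagickProcessor#save, but keeps each argument as its own array
# element so Shellwords.join can escape all of them.
def build_convert_command(convert, pdf_path, index, out_path, options = {})
  width      = options.fetch(:width, 1024)        # README default
  density    = options.fetch(:density, 300)       # README default
  quality    = options.fetch(:quality, 90)        # README default
  colorspace = options.fetch(:colorspace, "RGB")  # README default
  alpha      = options[:alpha]                    # flag omitted unless set

  args = [convert, "-resize", width.to_s]
  args += ["-alpha", alpha.to_s] if alpha
  args += ["-antialias", "-render",
           "-quality", quality.to_s,
           "-colorspace", colorspace.to_s,
           "-interlace", "none",
           "-density", density.to_s,
           "#{pdf_path}[#{index}]", out_path]

  # Escapes each element and joins them into one shell-safe string.
  Shellwords.join(args)
end

puts build_convert_command("convert", "/tmp/my file.pdf", 0, "/tmp/out.png", {:alpha => "Remove"})
```

Keeping the arguments as separate array elements until the last step means a space in either path cannot split the command into extra words. Whether Grim itself should adopt this is a design decision for the maintainers.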
@@ -16,7 +16,17 @@ describe Grim::ImageMagickProcessor do
  end
 
  it "should return page count" do
- @processor.count(fixture_path("smoker.pdf")).should == 25
+ expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
+ end
+ end
+
+ describe "#count with windows executable", :windows => true do
+ before(:each) do
+ @processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
+ end
+
+ it "should return page count" do
+ expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
  end
  end
 
@@ -30,13 +40,13 @@ describe Grim::ImageMagickProcessor do
 
  it "should create the file" do
  @processor.save(@pdf, 0, @path, {})
- File.exist?(@path).should be_true
+ expect(File.exist?(@path)).to be(true)
  end
 
  it "should use default width of 1024" do
  @processor.save(@pdf, 0, @path, {})
  width, height = dimensions_for_path(@path)
- width.should == 1024
+ expect(width).to eq(1024)
  end
  end
 
@@ -50,7 +60,7 @@ describe Grim::ImageMagickProcessor do
 
  it "should set width" do
  width, height = dimensions_for_path(@path)
- width.should == 20
+ expect(width).to eq(20)
  end
  end
 
@@ -67,7 +77,7 @@ describe Grim::ImageMagickProcessor do
  Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:quality => 90})
  higher_size = File.size(@path)
 
- (lower_size < higher_size).should be_true
+ expect(lower_size < higher_size).to be(true)
  end
  end
 
@@ -81,7 +91,7 @@ describe Grim::ImageMagickProcessor do
  lower_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 72}) }
  higher_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 300}) }
 
- (lower_time < higher_time).should be_true
+ expect(lower_time < higher_time).to be(true)
  end
  end
 
@@ -99,7 +109,23 @@ describe Grim::ImageMagickProcessor do
  file1_size = File.stat(@path1).size
  file2_size = File.stat(@path2).size
 
- file1_size.should_not == file2_size
+ expect(file1_size).to_not eq(file2_size)
+ end
+ end
+
+ describe "#save with alpha option" do
+ before(:each) do
+ @path1 = tmp_path("to_png_spec-1.png")
+ @path2 = tmp_path("to_png_spec-2.png")
+ @pdf = Grim::Pdf.new(fixture_path("remove_alpha.pdf"))
+ end
+
+ it "should use alpha" do
+ Grim::ImageMagickProcessor.new.save(@pdf, 0, @path1, {:alpha => 'Set'})
+ Grim::ImageMagickProcessor.new.save(@pdf, 0, @path2, {:alpha => 'Remove'})
+
+ expect(`convert #{@path1} -verbose info:`.include?("alpha: 8-bit")).to be(true)
+ expect(`convert #{@path2} -verbose info:`.include?("alpha: 1-bit")).to be(true)
  end
  end
- end
+ end
@@ -14,9 +14,9 @@ describe Grim::MultiProcessor do
 
  describe "#count" do
  it "should try processors until it succeeds" do
- @failure.stub(:count){""}
- @success.should_receive(:count).and_return(30)
- @extra.should_not_receive(:count)
+ allow(@failure).to receive(:count).and_return("")
+ expect(@success).to receive(:count).and_return(30)
+ expect(@extra).to_not receive(:count)
 
  @processor.count(@path)
  end
@@ -24,19 +24,19 @@ describe Grim::MultiProcessor do
 
  describe "#save" do
  it "should try processors until it succeeds" do
- @failure.stub(:save){false}
- @success.should_receive(:save).and_return(true)
- @extra.should_not_receive(:save)
+ allow(@failure).to receive(:save).and_return(false)
+ expect(@success).to receive(:save).and_return(true)
+ expect(@extra).to_not receive(:save)
 
  @processor.save(@pdf, 0, @path, {})
  end
 
  it "should raise error if all processors fail" do
- @failure.should_receive(:save).and_return(false)
- @success.should_receive(:save).and_return(false)
- @extra.should_receive(:save).and_return(false)
+ expect(@failure).to receive(:save).and_return(false)
+ expect(@success).to receive(:save).and_return(false)
+ expect(@extra).to receive(:save).and_return(false)
 
- lambda { @processor.save(@pdf, 0, @path, {}) }.should raise_error(Grim::UnprocessablePage)
+ expect { @processor.save(@pdf, 0, @path, {}) }.to raise_error(Grim::UnprocessablePage)
  end
  end
- end
+ end
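
The `MultiProcessor` specs above describe a fallback chain: an empty-string `count` or a `false` `save` moves on to the next processor, later processors are not called once one succeeds, and `save` raises when every processor fails. The following is only a sketch of that behaviour as inferred from the specs, not Grim's implementation; `FallbackProcessor`, `FakeProcessor` and `UnprocessablePageError` are hypothetical names introduced for illustration.

```ruby
# Hypothetical stand-ins, not Grim's classes: names and the error type
# are illustrative only.
class UnprocessablePageError < StandardError; end

class FallbackProcessor
  def initialize(processors)
    @processors = processors
  end

  # Returns the first non-empty count; "" is treated as a failure,
  # matching the stubbed value in the spec.
  def count(path)
    result = ""
    @processors.each do |processor|
      result = processor.count(path)
      break unless result == ""
    end
    result
  end

  # Returns true as soon as one processor saves the page and
  # raises when every processor returns a falsy value.
  def save(pdf, index, path, options)
    @processors.each do |processor|
      return true if processor.save(pdf, index, path, options)
    end
    raise UnprocessablePageError, "no processor could save page #{index}"
  end
end

# Minimal fake processor that returns canned results.
class FakeProcessor
  def initialize(count_result, save_result)
    @count_result = count_result
    @save_result = save_result
  end

  def count(_path)
    @count_result
  end

  def save(_pdf, _index, _path, _options)
    @save_result
  end
end

chain = FallbackProcessor.new([FakeProcessor.new("", false), FakeProcessor.new(30, true)])
puts chain.count("example.pdf")
```

The early `return` and `break` are what keep the third processor in the specs from ever being called.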
@@ -8,7 +8,7 @@ describe Grim::Page do
  end
 
  it "should have number" do
- Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number.should == 2
+ expect(Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number).to eq(2)
  end
 
  describe "#save" do
@@ -18,7 +18,7 @@ describe Grim::Page do
  end
 
  it "should call Grim.processor.save with pdf, index, path, and options" do
- Grim.processor.should_receive(:save).with(@pdf, 0, @path, {})
+ expect(Grim.processor).to receive(:save).with(@pdf, 0, @path, {})
  @pdf[0].save(@path)
  end
  end
@@ -30,8 +30,8 @@ describe Grim::Page do
  end
 
  it "raises an exception" do
- lambda { @pdf[0].save(nil) }.should raise_error(Grim::PathMissing)
- lambda { @pdf[0].save(' ') }.should raise_error(Grim::PathMissing)
+ expect { @pdf[0].save(nil) }.to raise_error(Grim::PathMissing)
+ expect { @pdf[0].save(' ') }.to raise_error(Grim::PathMissing)
  end
  end
 
@@ -47,13 +47,15 @@ describe Grim::Page do
  describe "#text" do
  it "should return the text from the selected page" do
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
- pdf[1].text.should == "Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f"
+ expect(pdf[1].text).to \
+ eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
  end
 
  it "works with full path to pdftotext" do
  pdftotext_path = `which pdftotext`.chomp
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"), pdftotext_path: pdftotext_path)
- pdf[1].text.should == "Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f"
+ expect(pdf[1].text).to \
+ eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
  end
  end
  end
@@ -4,23 +4,26 @@ require 'spec_helper'
  describe Grim::Pdf do
 
  it "should have a path" do
- Grim::Pdf.new(fixture_path("smoker.pdf")).path.should == fixture_path("smoker.pdf")
+ pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
+ expect(pdf.path).to eq(fixture_path("smoker.pdf"))
  end
 
  describe "#initialize" do
  it "should raise an error if pdf does not exist" do
- lambda { Grim::Pdf.new(fixture_path("booboo.pdf")) }.should raise_error(Grim::PdfNotFound)
+ expect {
+ Grim::Pdf.new(fixture_path("booboo.pdf"))
+ }.to raise_error(Grim::PdfNotFound)
  end
 
  it "should set path on pdf" do
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
- pdf.path.should == fixture_path("smoker.pdf")
+ expect(pdf.path).to eq(fixture_path("smoker.pdf"))
  end
  end
 
  describe "#count" do
  it "should call Grim.processor.count with pdf path" do
- Grim.processor.should_receive(:count).with(fixture_path("smoker.pdf"))
+ expect(Grim.processor).to receive(:count).with(fixture_path("smoker.pdf"))
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
  pdf.count
  end
@@ -32,19 +35,19 @@ describe Grim::Pdf do
  end
 
  it "should raise Grim::PageDoesNotExist if page doesn't exist" do
- lambda { @pdf[25] }.should raise_error(Grim::PageNotFound)
+ expect { @pdf[25] }.to raise_error(Grim::PageNotFound)
  end
 
  it "should return an instance of Grim::Page if page exists" do
- @pdf[24].class.should == Grim::Page
+ expect(@pdf[24].class).to eq(Grim::Page)
  end
  end
 
  describe "#each" do
  it "should be iterable" do
  pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
- pdf.map {|p| p.number }.should == (1..25).to_a
+ expect(pdf.map {|p| p.number }).to eq((1..25).to_a)
  end
  end
 
- end
+ end
@@ -3,32 +3,32 @@ require 'spec_helper'
 
  describe Grim do
  it "should have a default processor" do
- Grim.processor.class.should == Grim::ImageMagickProcessor
+ expect(Grim.processor.class).to eq(Grim::ImageMagickProcessor)
  end
 
  it "should have a VERSION constant" do
- Grim.const_defined?('VERSION').should be_true
+ expect(Grim.const_defined?('VERSION')).to be(true)
  end
 
  it "should have WIDTH constant set to 1024" do
- Grim::WIDTH.should == 1024
+ expect(Grim::WIDTH).to eq(1024)
  end
 
  it "should have QUALITY constant set to 90" do
- Grim::QUALITY.should == 90
+ expect(Grim::QUALITY).to eq(90)
  end
 
  it "should have DENSITY constant set to 300" do
- Grim::DENSITY.should == 300
+ expect(Grim::DENSITY).to eq(300)
  end
 
  it "should have COLORSPACE constant set to 'RGB'" do
- Grim::COLORSPACE.should == 'RGB'
+ expect(Grim::COLORSPACE).to eq('RGB')
  end
 
  describe "#reap" do
  it "should return an instance of Grim::Pdf" do
- Grim.reap(fixture_path("smoker.pdf")).class.should == Grim::Pdf
+ expect(Grim.reap(fixture_path("smoker.pdf")).class).to eq(Grim::Pdf)
  end
  end
- end
+ end
@@ -2,6 +2,7 @@
  require 'benchmark'
  require 'rubygems'
  require 'bundler/setup'
+ require 'rbconfig'
 
  require 'grim'
 
@@ -28,4 +29,5 @@ end
 
  RSpec.configure do |config|
  config.include(FileHelpers)
+ config.filter_run_excluding :windows => true if RbConfig::CONFIG['host_os'].match(/mswin|mingw|cygwin/) == nil
  end
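
The line added to `RSpec.configure` above skips examples tagged `:windows => true` whenever `RbConfig::CONFIG['host_os']` does not match `mswin`, `mingw` or `cygwin`. A minimal sketch of that check in isolation, assuming a hypothetical helper named `windows_host?` that is not part of the gem:

```ruby
require 'rbconfig'

# Hypothetical helper, not part of Grim: true when the host OS string
# matches the same pattern used in the spec_helper change.
def windows_host?(host_os = RbConfig::CONFIG['host_os'])
  !(host_os =~ /mswin|mingw|cygwin/).nil?
end

puts windows_host?
```

The pattern names specific Windows toolchains because a looser match such as `/win/` would also match `darwin`, the host OS string reported on macOS.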
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: grim
  version: !ruby/object:Gem::Version
- version: 1.1.0
+ version: 1.2.0
  platform: ruby
  authors:
  - Jonathan Hoyt
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-11-25 00:00:00.000000000 Z
+ date: 2015-02-19 00:00:00.000000000 Z
  dependencies: []
  description: Grim is a simple gem for extracting a page from a pdf and converting
  it to an image as well as extract the text from the page as a string. It basically
@@ -23,7 +23,7 @@ files:
  - ".gitignore"
  - Gemfile
  - LICENSE
- - README.textile
+ - README.md
  - Rakefile
  - grim.gemspec
  - lib/grim.rb
@@ -33,6 +33,7 @@ files:
  - lib/grim/pdf.rb
  - lib/grim/version.rb
  - lib/pdf_info.ps
+ - spec/fixtures/remove_alpha.pdf
  - spec/fixtures/smoker.pdf
  - spec/fixtures/unprocessable.pdf
  - spec/lib/grim/image_magick_processor_spec.rb
@@ -65,6 +66,7 @@ signing_key:
  specification_version: 4
  summary: Extract slides and text from a PDF.
  test_files:
+ - spec/fixtures/remove_alpha.pdf
  - spec/fixtures/smoker.pdf
  - spec/fixtures/unprocessable.pdf
  - spec/lib/grim/image_magick_processor_spec.rb
@@ -1,81 +0,0 @@
- <pre>
- ,____
- |---.\
- ___ | `
- / .-\ ./=)
- | |"|_/\/|
- ; |-;| /_|
- / \_| |/ \ |
- / \/\( |
- | / |` ) |
- / \ _/ |
- /--._/ \ |
- `/|) | /
- / | |
- .' | |
- / \ |
- (_.-.__.__./ /
- </pre>
-
- h1. Grim
-
- Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
-
- h2. Prerequisites
-
- You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using "Homebrew":http://mxcl.github.com/homebrew/ to get them installed.
-
- <pre><code>
- brew install ghostscript imagemagick poppler
- </code></pre>
-
- h2. Installation
-
- <pre><code>
- gem install grim
- </code></pre>
-
- h2. Usage
-
- <pre><code>
- pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
- count = pdf.count # returns the number of pages in the pdf
- png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
- text = pdf[3].text # returns text as a String
-
- pdf.each do |page|
- puts page.text
- end
- </pre></code>
-
- We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
-
- <pre><code>
- # specifying one processor with specific ImageMagick and GhostScript paths
- Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
-
- # multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
- Grim.processor = Grim::MultiProcessor.new([
- Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
- Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
- ])
-
- pdf = Grim.reap('/path/to/pdf)
- </code></pre>
-
- h2. Reference
-
- * "jonmagic.com: Grim":http://jonmagic.com/blog/archives/2011/09/06/grim/
- * "jonmagic.com: Grim MultiProcessor":http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/
-
- h2. Contributors
-
- * "@jonmagic":https://github.com/jonmagic
- * "@jnunemaker":https://github.com/jnunemaker
- * "@bryckbost":https://github.com/bryckbost
- * "@bkeepers":https://github.com/bkeepers
- * "@BobaFaux":https://github.com/BobaFaux
-
- h2. License
-
- See LICENSE for details.