grim 1.1.0 → 1.2.0
- checksums.yaml +4 -4
- data/Gemfile +1 -1
- data/README.md +106 -0
- data/lib/grim/image_magick_processor.rb +25 -13
- data/lib/grim/version.rb +1 -1
- data/spec/fixtures/remove_alpha.pdf +0 -0
- data/spec/lib/grim/image_magick_processor_spec.rb +34 -8
- data/spec/lib/grim/multi_processor_spec.rb +11 -11
- data/spec/lib/grim/page_spec.rb +8 -6
- data/spec/lib/grim/pdf_spec.rb +11 -8
- data/spec/lib/grim_spec.rb +8 -8
- data/spec/spec_helper.rb +2 -0
- metadata +5 -3
- data/README.textile +0 -81
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 75e501e9b8b7daf4549a07bf29f3e34baa5099f1
+  data.tar.gz: 1a96dc37be69e9474305329c74b4a556b20d2a1d
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 82ee2f6442b015ccd7716467ac18d8f0081e6741e72a36fb3ee48032287b9915785dbae78ff4e74a182bc0b801b55228f462f2898839a45d9b6f73683b9b8f3a
+  data.tar.gz: 54356b62ff9d386dedf2107d70efbba276d4be24c7168c190b8f9bddcf04b0dcc9b36186de49e3804e53ae0fa0832eed26cc4cbe48b103837f50dbed00211ecf
data/Gemfile
CHANGED
data/README.md
ADDED
@@ -0,0 +1,106 @@
+```
+,____
+|---.\
+___ | `
+/ .-\ ./=)
+| |"|_/\/|
+; |-;| /_|
+/ \_| |/ \ |
+/ \/\( |
+| / |` ) |
+/ \ _/ |
+/--._/ \ |
+`/|) | /
+/ | |
+.' | |
+/ \ |
+(_.-.__.__./ /
+```
+
+# Grim
+
+Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
+
+## Prerequisites
+
+You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using [Homebrew](http://mxcl.github.com/homebrew/) to get them installed.
+
+```bash
+$ brew install ghostscript imagemagick poppler
+```
+
+## Installation
+
+```bash
+$ gem install grim
+```
+
+## Usage
+
+```ruby
+pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
+count = pdf.count # returns the number of pages in the pdf
+png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
+text = pdf[3].text # returns text as a String
+
+pdf.each do |page|
+  puts page.text
+end
+```
+
+We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
+
+```ruby
+# specifying one processor with specific ImageMagick and GhostScript paths
+Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
+
+# multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
+Grim.processor = Grim::MultiProcessor.new([
+  Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
+  Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
+])
+
+pdf = Grim.reap('/path/to/pdf')
+```
+
+You can even specify a Windows executable :zap:
+
+```ruby
+# specifying another ghostscript executable, win64 in this example
+# the ghostscript/bin folder still has to be in the PATH for this to work
+Grim.processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
+
+pdf = Grim.reap('/path/to/pdf')
+```
+
+`Grim::ImageMagickProcessor#save` supports several options as well:
+
+```ruby
+pdf = Grim.reap("/path/to/pdf")
+pdf[0].save('/path/to/image.png', {
+  :width => 600, # defaults to 1024
+  :density => 72, # defaults to 300
+  :quality => 60, # defaults to 90
+  :colorspace => "CMYK", # defaults to "RGB"
+  :alpha => "Activate" # not used when not set
+})
+```
+
+## Reference
+
+* [jonmagic.com: Grim](http://jonmagic.com/blog/archives/2011/09/06/grim/)
+* [jonmagic.com: Grim MultiProcessor](http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/)
+
+## Contributors
+
+* [@jonmagic](https://github.com/jonmagic)
+* [@jnunemaker](https://github.com/jnunemaker)
+* [@bryckbost](https://github.com/bryckbost)
+* [@bkeepers](https://github.com/bkeepers)
+* [@BobaFaux](https://github.com/BobaFaux)
+* [@Rubikan](https://github.com/Rubikan)
+* [@victormier](https://github.com/victormier)
+
+## License
+
+See [LICENSE](LICENSE) for details.
data/lib/grim/image_magick_processor.rb
CHANGED
@@ -3,32 +3,44 @@ module Grim
 
     # ghostscript prints out a warning, this regex matches it
     WarningRegex = /\*\*\*\*.*\n/
+    DefaultImagemagickPath = 'convert'
+    DefaultGhostScriptPath = 'gs'
 
     def initialize(options={})
-      @imagemagick_path = options[:imagemagick_path] ||
-      @ghostscript_path = options[:ghostscript_path]
-      @original_path
+      @imagemagick_path = options[:imagemagick_path] || DefaultImagemagickPath
+      @ghostscript_path = options[:ghostscript_path] || DefaultGhostScriptPath
+      @original_path = ENV['PATH']
     end
 
     def count(path)
-      command = ["-dNODISPLAY", "-q",
+      command = [@ghostscript_path, "-dNODISPLAY", "-q",
         "-sFile=#{Shellwords.shellescape(path)}",
         File.expand_path('../../../lib/pdf_info.ps', __FILE__)]
-      @ghostscript_path ? command.unshift(@ghostscript_path) : command.unshift('gs')
       result = `#{command.join(' ')}`
       result.gsub(WarningRegex, '').to_i
     end
 
     def save(pdf, index, path, options)
-      width
-      density
-      quality
+      width = options.fetch(:width, Grim::WIDTH)
+      density = options.fetch(:density, Grim::DENSITY)
+      quality = options.fetch(:quality, Grim::QUALITY)
       colorspace = options.fetch(:colorspace, Grim::COLORSPACE)
-
-
-
-
-      command
+      alpha = options[:alpha]
+
+      command = []
+      command << @imagemagick_path
+      command << "-resize #{width}"
+      command << "-alpha #{alpha}" if alpha
+      command << "-antialias"
+      command << "-render"
+      command << "-quality #{quality}"
+      command << "-colorspace #{colorspace}"
+      command << "-interlace none"
+      command << "-density #{density}"
+      command << "#{Shellwords.shellescape(pdf.path)}[#{index}]"
+      command << path
+
+      command.unshift("PATH=#{File.dirname(@ghostscript_path)}:#{ENV['PATH']}") if @ghostscript_path && @ghostscript_path != DefaultGhostScriptPath
 
       result = `#{command.join(' ')}`
 
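The reworked `save` in the hunk above builds the `convert` invocation as an array of fragments and joins them with spaces. The following is a minimal standalone sketch of that assembly, not the gem's code: `build_convert_command` is a hypothetical helper, and the defaults (1024, 300, 90, "RGB") are taken from the README comments rather than from `Grim::WIDTH` and friends.

```ruby
require 'shellwords'

# Hypothetical helper (not part of Grim). It mirrors the order of the
# `command <<` lines in the diff; defaults are the values the README documents.
def build_convert_command(pdf_path, index, out_path, options = {})
  imagemagick_path = options.fetch(:imagemagick_path, 'convert')
  ghostscript_path = options.fetch(:ghostscript_path, 'gs')
  width      = options.fetch(:width, 1024)
  density    = options.fetch(:density, 300)
  quality    = options.fetch(:quality, 90)
  colorspace = options.fetch(:colorspace, 'RGB')
  alpha      = options[:alpha] # only emitted when the caller sets it

  command = [imagemagick_path]
  command << "-resize #{width}"
  command << "-alpha #{alpha}" if alpha
  command << "-antialias"
  command << "-render"
  command << "-quality #{quality}"
  command << "-colorspace #{colorspace}"
  command << "-interlace none"
  command << "-density #{density}"
  command << "#{Shellwords.shellescape(pdf_path)}[#{index}]"
  command << out_path

  # A non-default Ghostscript gets its directory prepended to PATH
  # for this one command, as in the diff.
  if ghostscript_path != 'gs'
    command.unshift("PATH=#{File.dirname(ghostscript_path)}:#{ENV['PATH']}")
  end

  command.join(' ')
end

puts build_convert_command('/tmp/my file.pdf', 0, '/tmp/out.png', :alpha => 'Remove')
```

Note that only the PDF path is shell-escaped in the diff; the output path is appended as given, and the sketch keeps that behaviour.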
data/lib/grim/version.rb
CHANGED
data/spec/fixtures/remove_alpha.pdf
ADDED
Binary file
data/spec/lib/grim/image_magick_processor_spec.rb
CHANGED
@@ -16,7 +16,17 @@ describe Grim::ImageMagickProcessor do
     end
 
     it "should return page count" do
-      @processor.count(fixture_path("smoker.pdf")).
+      expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
+    end
+  end
+
+  describe "#count with windows executable", :windows => true do
+    before(:each) do
+      @processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
+    end
+
+    it "should return page count" do
+      expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
     end
   end
 
@@ -30,13 +40,13 @@ describe Grim::ImageMagickProcessor do
 
     it "should create the file" do
       @processor.save(@pdf, 0, @path, {})
-      File.exist?(@path).
+      expect(File.exist?(@path)).to be(true)
     end
 
     it "should use default width of 1024" do
       @processor.save(@pdf, 0, @path, {})
       width, height = dimensions_for_path(@path)
-      width.
+      expect(width).to eq(1024)
     end
   end
 
@@ -50,7 +60,7 @@ describe Grim::ImageMagickProcessor do
 
     it "should set width" do
       width, height = dimensions_for_path(@path)
-      width.
+      expect(width).to eq(20)
     end
   end
 
@@ -67,7 +77,7 @@ describe Grim::ImageMagickProcessor do
       Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:quality => 90})
       higher_size = File.size(@path)
 
-      (lower_size < higher_size).
+      expect(lower_size < higher_size).to be(true)
     end
   end
 
@@ -81,7 +91,7 @@ describe Grim::ImageMagickProcessor do
       lower_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 72}) }
       higher_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 300}) }
 
-      (lower_time < higher_time).
+      expect(lower_time < higher_time).to be(true)
     end
   end
 
@@ -99,7 +109,23 @@ describe Grim::ImageMagickProcessor do
       file1_size = File.stat(@path1).size
       file2_size = File.stat(@path2).size
 
-      file1_size.
+      expect(file1_size).to_not eq(file2_size)
+    end
+  end
+
+  describe "#save with alpha option" do
+    before(:each) do
+      @path1 = tmp_path("to_png_spec-1.png")
+      @path2 = tmp_path("to_png_spec-2.png")
+      @pdf = Grim::Pdf.new(fixture_path("remove_alpha.pdf"))
+    end
+
+    it "should use alpha" do
+      Grim::ImageMagickProcessor.new.save(@pdf, 0, @path1, {:alpha => 'Set'})
+      Grim::ImageMagickProcessor.new.save(@pdf, 0, @path2, {:alpha => 'Remove'})
+
+      expect(`convert #{@path1} -verbose info:`.include?("alpha: 8-bit")).to be(true)
+      expect(`convert #{@path2} -verbose info:`.include?("alpha: 1-bit")).to be(true)
     end
   end
-end
+end
data/spec/lib/grim/multi_processor_spec.rb
CHANGED
@@ -14,9 +14,9 @@ describe Grim::MultiProcessor do
 
   describe "#count" do
     it "should try processors until it succeeds" do
-      @failure.
-      @success.
-      @extra.
+      allow(@failure).to receive(:count).and_return("")
+      expect(@success).to receive(:count).and_return(30)
+      expect(@extra).to_not receive(:count)
 
       @processor.count(@path)
     end
@@ -24,19 +24,19 @@ describe Grim::MultiProcessor do
 
   describe "#save" do
     it "should try processors until it succeeds" do
-      @failure.
-      @success.
-      @extra.
+      allow(@failure).to receive(:save).and_return(false)
+      expect(@success).to receive(:save).and_return(true)
+      expect(@extra).to_not receive(:save)
 
       @processor.save(@pdf, 0, @path, {})
     end
 
     it "should raise error if all processors fail" do
-      @failure.
-      @success.
-      @extra.
+      expect(@failure).to receive(:save).and_return(false)
+      expect(@success).to receive(:save).and_return(false)
+      expect(@extra).to receive(:save).and_return(false)
 
-
+      expect { @processor.save(@pdf, 0, @path, {}) }.to raise_error(Grim::UnprocessablePage)
     end
   end
-end
+end
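The `#save` specs above describe a fallback contract: processors are tried in order, the first `true` result wins, later processors are never called, and an error is raised when every processor returns `false`. A minimal sketch of that contract follows. It is not the source of `Grim::MultiProcessor`; `FallbackProcessor`, `StubProcessor`, and the local `UnprocessablePage` class are hypothetical stand-ins.

```ruby
# Hypothetical error class standing in for Grim::UnprocessablePage.
class UnprocessablePage < StandardError; end

# Hypothetical sketch of the fallback behaviour the spec exercises.
class FallbackProcessor
  def initialize(processors)
    @processors = processors
  end

  # Try each processor in order and stop at the first one whose save returns true.
  def save(pdf, index, path, options = {})
    @processors.each do |processor|
      return true if processor.save(pdf, index, path, options)
    end
    raise UnprocessablePage
  end
end

# Stub that records whether it was called and returns a fixed result.
class StubProcessor
  attr_reader :called

  def initialize(result)
    @result = result
    @called = false
  end

  def save(_pdf, _index, _path, _options)
    @called = true
    @result
  end
end

failure = StubProcessor.new(false)
success = StubProcessor.new(true)
extra   = StubProcessor.new(true)

FallbackProcessor.new([failure, success, extra]).save(nil, 0, '/tmp/out.png')
puts extra.called # the third processor is never reached
```

The `#count` spec follows the same pattern, with an empty string from the first processor treated as a failure; the exact failure test used by the gem is not visible in this diff.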
data/spec/lib/grim/page_spec.rb
CHANGED
@@ -8,7 +8,7 @@ describe Grim::Page do
   end
 
   it "should have number" do
-    Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number.
+    expect(Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number).to eq(2)
   end
 
   describe "#save" do
@@ -18,7 +18,7 @@ describe Grim::Page do
     end
 
     it "should call Grim.processor.save with pdf, index, path, and options" do
-      Grim.processor.
+      expect(Grim.processor).to receive(:save).with(@pdf, 0, @path, {})
       @pdf[0].save(@path)
     end
   end
@@ -30,8 +30,8 @@ describe Grim::Page do
     end
 
     it "raises an exception" do
-
-
+      expect { @pdf[0].save(nil) }.to raise_error(Grim::PathMissing)
+      expect { @pdf[0].save(' ') }.to raise_error(Grim::PathMissing)
     end
   end
 
@@ -47,13 +47,15 @@ describe Grim::Page do
   describe "#text" do
     it "should return the text from the selected page" do
       pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
-      pdf[1].text.
+      expect(pdf[1].text).to \
+        eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
     end
 
     it "works with full path to pdftotext" do
       pdftotext_path = `which pdftotext`.chomp
       pdf = Grim::Pdf.new(fixture_path("smoker.pdf"), pdftotext_path: pdftotext_path)
-      pdf[1].text.
+      expect(pdf[1].text).to \
+        eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
     end
   end
 end
data/spec/lib/grim/pdf_spec.rb
CHANGED
@@ -4,23 +4,26 @@ require 'spec_helper'
 describe Grim::Pdf do
 
   it "should have a path" do
-    Grim::Pdf.new(fixture_path("smoker.pdf"))
+    pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
+    expect(pdf.path).to eq(fixture_path("smoker.pdf"))
   end
 
   describe "#initialize" do
     it "should raise an error if pdf does not exist" do
-
+      expect {
+        Grim::Pdf.new(fixture_path("booboo.pdf"))
+      }.to raise_error(Grim::PdfNotFound)
     end
 
     it "should set path on pdf" do
       pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
-      pdf.path.
+      expect(pdf.path).to eq(fixture_path("smoker.pdf"))
     end
   end
 
   describe "#count" do
     it "should call Grim.processor.count with pdf path" do
-      Grim.processor.
+      expect(Grim.processor).to receive(:count).with(fixture_path("smoker.pdf"))
       pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
       pdf.count
     end
@@ -32,19 +35,19 @@ describe Grim::Pdf do
     end
 
     it "should raise Grim::PageDoesNotExist if page doesn't exist" do
-
+      expect { @pdf[25] }.to raise_error(Grim::PageNotFound)
     end
 
     it "should return an instance of Grim::Page if page exists" do
-      @pdf[24].class.
+      expect(@pdf[24].class).to eq(Grim::Page)
     end
   end
 
   describe "#each" do
     it "should be iterable" do
       pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
-      pdf.map {|p| p.number }.
+      expect(pdf.map {|p| p.number }).to eq((1..25).to_a)
     end
   end
 
-end
+end
data/spec/lib/grim_spec.rb
CHANGED
@@ -3,32 +3,32 @@ require 'spec_helper'
 
 describe Grim do
   it "should have a default processor" do
-    Grim.processor.class.
+    expect(Grim.processor.class).to eq(Grim::ImageMagickProcessor)
   end
 
   it "should have a VERSION constant" do
-    Grim.const_defined?('VERSION').
+    expect(Grim.const_defined?('VERSION')).to be(true)
   end
 
   it "should have WIDTH constant set to 1024" do
-    Grim::WIDTH.
+    expect(Grim::WIDTH).to eq(1024)
   end
 
   it "should have QUALITY constant set to 90" do
-    Grim::QUALITY.
+    expect(Grim::QUALITY).to eq(90)
   end
 
   it "should have DENSITY constant set to 300" do
-    Grim::DENSITY.
+    expect(Grim::DENSITY).to eq(300)
   end
 
   it "should have COLORSPACE constant set to 'RGB'" do
-    Grim::COLORSPACE.
+    expect(Grim::COLORSPACE).to eq('RGB')
   end
 
   describe "#reap" do
     it "should return an instance of Grim::Pdf" do
-      Grim.reap(fixture_path("smoker.pdf")).class.
+      expect(Grim.reap(fixture_path("smoker.pdf")).class).to eq(Grim::Pdf)
     end
   end
-end
+end
data/spec/spec_helper.rb
CHANGED
@@ -2,6 +2,7 @@
 require 'benchmark'
 require 'rubygems'
 require 'bundler/setup'
+require 'rbconfig'
 
 require 'grim'
 
@@ -28,4 +29,5 @@ end
 
 RSpec.configure do |config|
   config.include(FileHelpers)
+  config.filter_run_excluding :windows => true if RbConfig::CONFIG['host_os'].match(/mswin|mingw|cygwin/) == nil
 end
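The line added to `RSpec.configure` above skips every example tagged `:windows => true` unless `host_os` looks like a Windows environment. The check itself can be sketched in isolation; `windows_host?` is a hypothetical helper name, and only the regular expression and the `RbConfig::CONFIG['host_os']` lookup come from the diff.

```ruby
require 'rbconfig'

# Hypothetical helper wrapping the same regex the spec_helper line uses.
# host_os defaults to the value Ruby was built for, e.g. "linux-gnu" or "mingw32".
def windows_host?(host_os = RbConfig::CONFIG['host_os'])
  !host_os.match(/mswin|mingw|cygwin/).nil?
end

puts windows_host?('mingw32')   # a typical Windows host_os value
puts windows_host?('darwin14')  # a typical macOS host_os value
puts windows_host?              # whatever machine this runs on
```

With a helper like this, the configuration line could read `config.filter_run_excluding :windows => true unless windows_host?`, which is equivalent to the `== nil` comparison in the diff.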
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: grim
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.2.0
 platform: ruby
 authors:
 - Jonathan Hoyt
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2015-02-19 00:00:00.000000000 Z
 dependencies: []
 description: Grim is a simple gem for extracting a page from a pdf and converting
   it to an image as well as extract the text from the page as a string. It basically
@@ -23,7 +23,7 @@ files:
 - ".gitignore"
 - Gemfile
 - LICENSE
-- README.
+- README.md
 - Rakefile
 - grim.gemspec
 - lib/grim.rb
@@ -33,6 +33,7 @@ files:
 - lib/grim/pdf.rb
 - lib/grim/version.rb
 - lib/pdf_info.ps
+- spec/fixtures/remove_alpha.pdf
 - spec/fixtures/smoker.pdf
 - spec/fixtures/unprocessable.pdf
 - spec/lib/grim/image_magick_processor_spec.rb
@@ -65,6 +66,7 @@ signing_key:
 specification_version: 4
 summary: Extract slides and text from a PDF.
 test_files:
+- spec/fixtures/remove_alpha.pdf
 - spec/fixtures/smoker.pdf
 - spec/fixtures/unprocessable.pdf
 - spec/lib/grim/image_magick_processor_spec.rb
data/README.textile
DELETED
@@ -1,81 +0,0 @@
-<pre>
-,____
-|---.\
-___ | `
-/ .-\ ./=)
-| |"|_/\/|
-; |-;| /_|
-/ \_| |/ \ |
-/ \/\( |
-| / |` ) |
-/ \ _/ |
-/--._/ \ |
-`/|) | /
-/ | |
-.' | |
-/ \ |
-(_.-.__.__./ /
-</pre>
-
-h1. Grim
-
-Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
-
-h2. Prerequisites
-
-You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using "Homebrew":http://mxcl.github.com/homebrew/ to get them installed.
-
-<pre><code>
-brew install ghostscript imagemagick poppler
-</code></pre>
-
-h2. Installation
-
-<pre><code>
-gem install grim
-</code></pre>
-
-h2. Usage
-
-<pre><code>
-pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
-count = pdf.count # returns the number of pages in the pdf
-png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
-text = pdf[3].text # returns text as a String
-
-pdf.each do |page|
-  puts page.text
-end
-</pre></code>
-
-We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
-
-<pre><code>
-# specifying one processor with specific ImageMagick and GhostScript paths
-Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
-
-# multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
-Grim.processor = Grim::MultiProcessor.new([
-  Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
-  Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
-])
-
-pdf = Grim.reap('/path/to/pdf)
-</code></pre>
-
-h2. Reference
-
-* "jonmagic.com: Grim":http://jonmagic.com/blog/archives/2011/09/06/grim/
-* "jonmagic.com: Grim MultiProcessor":http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/
-
-h2. Contributors
-
-* "@jonmagic":https://github.com/jonmagic
-* "@jnunemaker":https://github.com/jnunemaker
-* "@bryckbost":https://github.com/bryckbost
-* "@bkeepers":https://github.com/bkeepers
-* "@BobaFaux":https://github.com/BobaFaux
-
-h2. License
-
-See LICENSE for details.