grim 1.1.0 → 1.2.0
- checksums.yaml +4 -4
- data/Gemfile +1 -1
- data/README.md +106 -0
- data/lib/grim/image_magick_processor.rb +25 -13
- data/lib/grim/version.rb +1 -1
- data/spec/fixtures/remove_alpha.pdf +0 -0
- data/spec/lib/grim/image_magick_processor_spec.rb +34 -8
- data/spec/lib/grim/multi_processor_spec.rb +11 -11
- data/spec/lib/grim/page_spec.rb +8 -6
- data/spec/lib/grim/pdf_spec.rb +11 -8
- data/spec/lib/grim_spec.rb +8 -8
- data/spec/spec_helper.rb +2 -0
- metadata +5 -3
- data/README.textile +0 -81
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 75e501e9b8b7daf4549a07bf29f3e34baa5099f1
+data.tar.gz: 1a96dc37be69e9474305329c74b4a556b20d2a1d
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 82ee2f6442b015ccd7716467ac18d8f0081e6741e72a36fb3ee48032287b9915785dbae78ff4e74a182bc0b801b55228f462f2898839a45d9b6f73683b9b8f3a
+data.tar.gz: 54356b62ff9d386dedf2107d70efbba276d4be24c7168c190b8f9bddcf04b0dcc9b36186de49e3804e53ae0fa0832eed26cc4cbe48b103837f50dbed00211ecf
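The digests recorded in checksums.yaml cover the two archive members of the gem, metadata.gz and data.tar.gz. The following is a minimal sketch of how such a file could be checked, assuming the layout shown above (algorithm name, then member name, then hex digest). `checksum_mismatches` and its `contents` argument are hypothetical names used for illustration; they are not part of RubyGems or grim.

```ruby
require 'digest/sha1'
require 'digest/sha2'
require 'yaml'

# Digest classes for the two algorithm names that appear in checksums.yaml.
ALGORITHMS = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }.freeze

# Hypothetical helper: `contents` maps member names ("metadata.gz",
# "data.tar.gz") to their raw bytes. Returns the [algorithm, name] pairs
# whose computed digest differs from the recorded one.
def checksum_mismatches(checksums_yaml, contents)
  expected = YAML.safe_load(checksums_yaml)
  mismatches = []
  expected.each do |algorithm, entries|
    digest_class = ALGORITHMS.fetch(algorithm)
    entries.each do |name, hex|
      actual = digest_class.hexdigest(contents.fetch(name))
      mismatches << [algorithm, name] unless actual == hex.to_s
    end
  end
  mismatches
end
```

An empty result means every recorded digest matched. Reading the members out of the .gem archive is left out of the sketch.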
data/Gemfile
CHANGED
data/README.md
ADDED
@@ -0,0 +1,106 @@
+```
+,____
+|---.\
+___ | `
+/ .-\ ./=)
+| |"|_/\/|
+; |-;| /_|
+/ \_| |/ \ |
+/ \/\( |
+| / |` ) |
+/ \ _/ |
+/--._/ \ |
+`/|) | /
+/ | |
+.' | |
+/ \ |
+(_.-.__.__./ /
+```
+
+# Grim
+
+Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
+
+## Prerequisites
+
+You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using [Homebrew](http://mxcl.github.com/homebrew/) to get them installed.
+
+```bash
+$ brew install ghostscript imagemagick poppler
+```
+
+## Installation
+
+```bash
+$ gem install grim
+```
+
+## Usage
+
+```ruby
+pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
+count = pdf.count # returns the number of pages in the pdf
+png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
+text = pdf[3].text # returns text as a String
+
+pdf.each do |page|
+puts page.text
+end
+```
+
+We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
+
+```ruby
+# specifying one processor with specific ImageMagick and GhostScript paths
+Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
+
+# multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
+Grim.processor = Grim::MultiProcessor.new([
+Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
+Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
+])
+
+pdf = Grim.reap('/path/to/pdf')
+```
+
+You can even specify a Windows executable :zap:
+
+```ruby
+# specifying another ghostscript executable, win64 in this example
+# the ghostscript/bin folder still has to be in the PATH for this to work
+Grim.processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
+
+pdf = Grim.reap('/path/to/pdf')
+```
+
+`Grim::ImageMagickProcessor#save` supports several options as well:
+
+```ruby
+pdf = Grim.reap("/path/to/pdf")
+pdf[0].save('/path/to/image.png', {
+:width => 600, # defaults to 1024
+:density => 72, # defaults to 300
+:quality => 60, # defaults to 90
+:colorspace => "CMYK", # defaults to "RGB"
+:alpha => "Activate" # not used when not set
+})
+```
+
+## Reference
+
+* [jonmagic.com: Grim](http://jonmagic.com/blog/archives/2011/09/06/grim/)
+* [jonmagic.com: Grim MultiProcessor](http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/)
+
+## Contributors
+
+* [@jonmagic](https://github.com/jonmagic)
+* [@jnunemaker](https://github.com/jnunemaker)
+* [@bryckbost](https://github.com/bryckbost)
+* [@bkeepers](https://github.com/bkeepers)
+* [@BobaFaux](https://github.com/BobaFaux)
+* [@Rubikan](https://github.com/Rubikan)
+* [@victormier](https://github.com/victormier)
+
+## License
+
+See [LICENSE](LICENSE) for details.
data/lib/grim/image_magick_processor.rb
CHANGED
@@ -3,32 +3,44 @@ module Grim
 
 # ghostscript prints out a warning, this regex matches it
 WarningRegex = /\*\*\*\*.*\n/
+DefaultImagemagickPath = 'convert'
+DefaultGhostScriptPath = 'gs'
 
 def initialize(options={})
-@imagemagick_path = options[:imagemagick_path] ||
-@ghostscript_path = options[:ghostscript_path]
-@original_path
+@imagemagick_path = options[:imagemagick_path] || DefaultImagemagickPath
+@ghostscript_path = options[:ghostscript_path] || DefaultGhostScriptPath
+@original_path = ENV['PATH']
 end
 
 def count(path)
-command = ["-dNODISPLAY", "-q",
+command = [@ghostscript_path, "-dNODISPLAY", "-q",
 "-sFile=#{Shellwords.shellescape(path)}",
 File.expand_path('../../../lib/pdf_info.ps', __FILE__)]
-@ghostscript_path ? command.unshift(@ghostscript_path) : command.unshift('gs')
 result = `#{command.join(' ')}`
 result.gsub(WarningRegex, '').to_i
 end
 
 def save(pdf, index, path, options)
-width
-density
-quality
+width = options.fetch(:width, Grim::WIDTH)
+density = options.fetch(:density, Grim::DENSITY)
+quality = options.fetch(:quality, Grim::QUALITY)
 colorspace = options.fetch(:colorspace, Grim::COLORSPACE)
-
-
-
-
-command
+alpha = options[:alpha]
+
+command = []
+command << @imagemagick_path
+command << "-resize #{width}"
+command << "-alpha #{alpha}" if alpha
+command << "-antialias"
+command << "-render"
+command << "-quality #{quality}"
+command << "-colorspace #{colorspace}"
+command << "-interlace none"
+command << "-density #{density}"
+command << "#{Shellwords.shellescape(pdf.path)}[#{index}]"
+command << path
+
+command.unshift("PATH=#{File.dirname(@ghostscript_path)}:#{ENV['PATH']}") if @ghostscript_path && @ghostscript_path != DefaultGhostScriptPath
 
 result = `#{command.join(' ')}`
 
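The added lines in `save` assemble the `convert` invocation piece by piece: `-alpha` is appended only when the caller sets it, and a `PATH=` prefix is added only when a non-default Ghostscript path was configured. The sketch below mirrors that logic in a standalone function so the resulting command string can be inspected without running ImageMagick. `build_convert_command` and `DEFAULTS` are hypothetical names, and the default values are taken from the constants asserted in grim_spec.rb (1024, 300, 90, "RGB").

```ruby
require 'shellwords'

# Defaults mirrored from the Grim constants asserted in grim_spec.rb.
DEFAULTS = { :width => 1024, :density => 300, :quality => 90, :colorspace => 'RGB' }.freeze

# Hypothetical helper (not part of grim): returns the command string
# instead of executing it, so the option handling can be inspected.
def build_convert_command(pdf_path, index, out_path, options = {},
                          imagemagick_path = 'convert', ghostscript_path = 'gs')
  width      = options.fetch(:width, DEFAULTS[:width])
  density    = options.fetch(:density, DEFAULTS[:density])
  quality    = options.fetch(:quality, DEFAULTS[:quality])
  colorspace = options.fetch(:colorspace, DEFAULTS[:colorspace])
  alpha      = options[:alpha] # nil unless the caller sets it

  command = [imagemagick_path, "-resize #{width}"]
  command << "-alpha #{alpha}" if alpha
  command += ["-antialias", "-render", "-quality #{quality}",
              "-colorspace #{colorspace}", "-interlace none",
              "-density #{density}",
              "#{Shellwords.shellescape(pdf_path)}[#{index}]", out_path]

  # A custom Ghostscript location is exposed to convert through PATH.
  if ghostscript_path != 'gs'
    command.unshift("PATH=#{File.dirname(ghostscript_path)}:#{ENV['PATH']}")
  end
  command.join(' ')
end

puts build_convert_command('/tmp/in.pdf', 0, '/tmp/out.png', { :alpha => 'Remove' })
```

As in the diff, only the PDF path is shell-escaped; the output path is passed through unchanged.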
data/lib/grim/version.rb
CHANGED
data/spec/fixtures/remove_alpha.pdf
ADDED
Binary file
|
data/spec/lib/grim/image_magick_processor_spec.rb
CHANGED
@@ -16,7 +16,17 @@ describe Grim::ImageMagickProcessor do
 end
 
 it "should return page count" do
-@processor.count(fixture_path("smoker.pdf")).
+expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
+end
+end
+
+describe "#count with windows executable", :windows => true do
+before(:each) do
+@processor = Grim::ImageMagickProcessor.new({:ghostscript_path => "gswin64c.exe"})
+end
+
+it "should return page count" do
+expect(@processor.count(fixture_path("smoker.pdf"))).to eq(25)
 end
 end
 
@@ -30,13 +40,13 @@ describe Grim::ImageMagickProcessor do
 
 it "should create the file" do
 @processor.save(@pdf, 0, @path, {})
-File.exist?(@path).
+expect(File.exist?(@path)).to be(true)
 end
 
 it "should use default width of 1024" do
 @processor.save(@pdf, 0, @path, {})
 width, height = dimensions_for_path(@path)
-width.
+expect(width).to eq(1024)
 end
 end
 
@@ -50,7 +60,7 @@ describe Grim::ImageMagickProcessor do
 
 it "should set width" do
 width, height = dimensions_for_path(@path)
-width.
+expect(width).to eq(20)
 end
 end
 
@@ -67,7 +77,7 @@ describe Grim::ImageMagickProcessor do
 Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:quality => 90})
 higher_size = File.size(@path)
 
-(lower_size < higher_size).
+expect(lower_size < higher_size).to be(true)
 end
 end
 
@@ -81,7 +91,7 @@ describe Grim::ImageMagickProcessor do
 lower_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 72}) }
 higher_time = Benchmark.realtime { Grim::ImageMagickProcessor.new.save(@pdf, 0, @path, {:density => 300}) }
 
-(lower_time < higher_time).
+expect(lower_time < higher_time).to be(true)
 end
 end
 
@@ -99,7 +109,23 @@ describe Grim::ImageMagickProcessor do
 file1_size = File.stat(@path1).size
 file2_size = File.stat(@path2).size
 
-file1_size.
+expect(file1_size).to_not eq(file2_size)
+end
+end
+
+describe "#save with alpha option" do
+before(:each) do
+@path1 = tmp_path("to_png_spec-1.png")
+@path2 = tmp_path("to_png_spec-2.png")
+@pdf = Grim::Pdf.new(fixture_path("remove_alpha.pdf"))
+end
+
+it "should use alpha" do
+Grim::ImageMagickProcessor.new.save(@pdf, 0, @path1, {:alpha => 'Set'})
+Grim::ImageMagickProcessor.new.save(@pdf, 0, @path2, {:alpha => 'Remove'})
+
+expect(`convert #{@path1} -verbose info:`.include?("alpha: 8-bit")).to be(true)
+expect(`convert #{@path2} -verbose info:`.include?("alpha: 1-bit")).to be(true)
 end
 end
-end
+end
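The page-count examples in this spec depend on `count` discarding Ghostscript warning lines before converting the output to an integer (`result.gsub(WarningRegex, '').to_i` in the processor). Below is a small sketch of that step in isolation. `parse_page_count` is a hypothetical helper, and the sample output strings are made up for illustration rather than captured from Ghostscript.

```ruby
# Same pattern as WarningRegex in the processor: any line containing "****".
WARNING_REGEX = /\*\*\*\*.*\n/

# Hypothetical helper: strips warning lines, then parses what is left.
def parse_page_count(raw_output)
  raw_output.gsub(WARNING_REGEX, '').to_i
end

puts parse_page_count("**** Warning: example warning line\n25\n")
```

Because `String#to_i` returns 0 when there is no leading number, empty output yields a count of 0 rather than an error.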
data/spec/lib/grim/multi_processor_spec.rb
CHANGED
@@ -14,9 +14,9 @@ describe Grim::MultiProcessor do
 
 describe "#count" do
 it "should try processors until it succeeds" do
-@failure.
-@success.
-@extra.
+allow(@failure).to receive(:count).and_return("")
+expect(@success).to receive(:count).and_return(30)
+expect(@extra).to_not receive(:count)
 
 @processor.count(@path)
 end
@@ -24,19 +24,19 @@ describe Grim::MultiProcessor do
 
 describe "#save" do
 it "should try processors until it succeeds" do
-@failure.
-@success.
-@extra.
+allow(@failure).to receive(:save).and_return(false)
+expect(@success).to receive(:save).and_return(true)
+expect(@extra).to_not receive(:save)
 
 @processor.save(@pdf, 0, @path, {})
 end
 
 it "should raise error if all processors fail" do
-@failure.
-@success.
-@extra.
+expect(@failure).to receive(:save).and_return(false)
+expect(@success).to receive(:save).and_return(false)
+expect(@extra).to receive(:save).and_return(false)
 
-
+expect { @processor.save(@pdf, 0, @path, {}) }.to raise_error(Grim::UnprocessablePage)
 end
 end
-end
+end
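These examples describe the fallback contract: processors are tried in order, the first success stops the chain, and `save` raises when every processor fails. The MultiProcessor source is not part of this diff, so the following is only a sketch of a class that would satisfy the same expectations. `FallbackProcessor`, `FakeProcessor`, and the local `UnprocessablePage` are hypothetical stand-ins, and treating an empty string as a failed count is an assumption drawn from the stubbed return value above.

```ruby
# Stand-in for Grim::UnprocessablePage so the sketch runs without the gem.
class UnprocessablePage < StandardError; end

# Hypothetical fallback wrapper; the real Grim::MultiProcessor may differ.
class FallbackProcessor
  def initialize(processors)
    @processors = processors
  end

  # Returns the first non-empty page count, trying processors in order.
  def count(path)
    result = ""
    @processors.each do |processor|
      result = processor.count(path)
      break unless result == ""
    end
    result
  end

  # Returns true once a processor saves the page; raises if none does.
  # `any?` stops at the first truthy result, so later processors are skipped.
  def save(pdf, index, path, options)
    saved = @processors.any? { |processor| processor.save(pdf, index, path, options) }
    raise UnprocessablePage unless saved
    true
  end
end

# Minimal fake processor that records how often it was called.
class FakeProcessor
  attr_reader :calls

  def initialize(count_result, save_result)
    @count_result = count_result
    @save_result = save_result
    @calls = 0
  end

  def count(_path)
    @calls += 1
    @count_result
  end

  def save(_pdf, _index, _path, _options)
    @calls += 1
    @save_result
  end
end
```

Ordering processors from most to least preferred keeps the common case to a single external call.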
data/spec/lib/grim/page_spec.rb
CHANGED
@@ -8,7 +8,7 @@ describe Grim::Page do
 end
 
 it "should have number" do
-Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number.
+expect(Grim::Page.new(Grim::Pdf.new(fixture_path("smoker.pdf")), 1).number).to eq(2)
 end
 
 describe "#save" do
@@ -18,7 +18,7 @@ describe Grim::Page do
 end
 
 it "should call Grim.processor.save with pdf, index, path, and options" do
-Grim.processor.
+expect(Grim.processor).to receive(:save).with(@pdf, 0, @path, {})
 @pdf[0].save(@path)
 end
 end
@@ -30,8 +30,8 @@ describe Grim::Page do
 end
 
 it "raises an exception" do
-
-
+expect { @pdf[0].save(nil) }.to raise_error(Grim::PathMissing)
+expect { @pdf[0].save(' ') }.to raise_error(Grim::PathMissing)
 end
 end
 
@@ -47,13 +47,15 @@ describe Grim::Page do
 describe "#text" do
 it "should return the text from the selected page" do
 pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
-pdf[1].text.
+expect(pdf[1].text).to \
+eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
 end
 
 it "works with full path to pdftotext" do
 pdftotext_path = `which pdftotext`.chomp
 pdf = Grim::Pdf.new(fixture_path("smoker.pdf"), pdftotext_path: pdftotext_path)
-pdf[1].text.
+expect(pdf[1].text).to \
+eq("Step 1: get someone to print this curve for you to scale, 72” wide\nStep 2: Get a couple 55 gallon drums\n\n\f")
 end
 end
 end
data/spec/lib/grim/pdf_spec.rb
CHANGED
@@ -4,23 +4,26 @@ require 'spec_helper'
 describe Grim::Pdf do
 
 it "should have a path" do
-Grim::Pdf.new(fixture_path("smoker.pdf"))
+pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
+expect(pdf.path).to eq(fixture_path("smoker.pdf"))
 end
 
 describe "#initialize" do
 it "should raise an error if pdf does not exist" do
-
+expect {
+Grim::Pdf.new(fixture_path("booboo.pdf"))
+}.to raise_error(Grim::PdfNotFound)
 end
 
 it "should set path on pdf" do
 pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
-pdf.path.
+expect(pdf.path).to eq(fixture_path("smoker.pdf"))
 end
 end
 
 describe "#count" do
 it "should call Grim.processor.count with pdf path" do
-Grim.processor.
+expect(Grim.processor).to receive(:count).with(fixture_path("smoker.pdf"))
 pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
 pdf.count
 end
@@ -32,19 +35,19 @@ describe Grim::Pdf do
 end
 
 it "should raise Grim::PageDoesNotExist if page doesn't exist" do
-
+expect { @pdf[25] }.to raise_error(Grim::PageNotFound)
 end
 
 it "should return an instance of Grim::Page if page exists" do
-@pdf[24].class.
+expect(@pdf[24].class).to eq(Grim::Page)
 end
 end
 
 describe "#each" do
 it "should be iterable" do
 pdf = Grim::Pdf.new(fixture_path("smoker.pdf"))
-pdf.map {|p| p.number }.
+expect(pdf.map {|p| p.number }).to eq((1..25).to_a)
 end
 end
 
-end
+end
data/spec/lib/grim_spec.rb
CHANGED
@@ -3,32 +3,32 @@ require 'spec_helper'
 
 describe Grim do
 it "should have a default processor" do
-Grim.processor.class.
+expect(Grim.processor.class).to eq(Grim::ImageMagickProcessor)
 end
 
 it "should have a VERSION constant" do
-Grim.const_defined?('VERSION').
+expect(Grim.const_defined?('VERSION')).to be(true)
 end
 
 it "should have WIDTH constant set to 1024" do
-Grim::WIDTH.
+expect(Grim::WIDTH).to eq(1024)
 end
 
 it "should have QUALITY constant set to 90" do
-Grim::QUALITY.
+expect(Grim::QUALITY).to eq(90)
 end
 
 it "should have DENSITY constant set to 300" do
-Grim::DENSITY.
+expect(Grim::DENSITY).to eq(300)
 end
 
 it "should have COLORSPACE constant set to 'RGB'" do
-Grim::COLORSPACE.
+expect(Grim::COLORSPACE).to eq('RGB')
 end
 
 describe "#reap" do
 it "should return an instance of Grim::Pdf" do
-Grim.reap(fixture_path("smoker.pdf")).class.
+expect(Grim.reap(fixture_path("smoker.pdf")).class).to eq(Grim::Pdf)
 end
 end
-end
+end
data/spec/spec_helper.rb
CHANGED
@@ -2,6 +2,7 @@
 require 'benchmark'
 require 'rubygems'
 require 'bundler/setup'
+require 'rbconfig'
 
 require 'grim'
 
@@ -28,4 +29,5 @@ end
 
 RSpec.configure do |config|
 config.include(FileHelpers)
+config.filter_run_excluding :windows => true if RbConfig::CONFIG['host_os'].match(/mswin|mingw|cygwin/) == nil
 end
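The new `filter_run_excluding` line skips examples tagged `:windows => true` unless the host OS string matches `/mswin|mingw|cygwin/`. Here is a sketch of the same check as a standalone predicate. `windows_host?` and `WINDOWS_HOST_PATTERN` are hypothetical names, and the sample host strings are illustrative values rather than an exhaustive list.

```ruby
require 'rbconfig'

# Same pattern the spec helper uses to recognise a Windows host.
WINDOWS_HOST_PATTERN = /mswin|mingw|cygwin/

# Hypothetical helper: true when the given host_os string looks like Windows.
def windows_host?(host_os = RbConfig::CONFIG['host_os'])
  host_os.match?(WINDOWS_HOST_PATTERN)
end

puts windows_host?
```

The pattern names the Windows toolchains explicitly, so a host string such as "darwin14" does not match even though it contains "win".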
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: grim
 version: !ruby/object:Gem::Version
-version: 1.
+version: 1.2.0
 platform: ruby
 authors:
 - Jonathan Hoyt
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2015-02-19 00:00:00.000000000 Z
 dependencies: []
 description: Grim is a simple gem for extracting a page from a pdf and converting
 it to an image as well as extract the text from the page as a string. It basically
@@ -23,7 +23,7 @@ files:
 - ".gitignore"
 - Gemfile
 - LICENSE
-- README.
+- README.md
 - Rakefile
 - grim.gemspec
 - lib/grim.rb
@@ -33,6 +33,7 @@ files:
 - lib/grim/pdf.rb
 - lib/grim/version.rb
 - lib/pdf_info.ps
+- spec/fixtures/remove_alpha.pdf
 - spec/fixtures/smoker.pdf
 - spec/fixtures/unprocessable.pdf
 - spec/lib/grim/image_magick_processor_spec.rb
@@ -65,6 +66,7 @@ signing_key:
 specification_version: 4
 summary: Extract slides and text from a PDF.
 test_files:
+- spec/fixtures/remove_alpha.pdf
 - spec/fixtures/smoker.pdf
 - spec/fixtures/unprocessable.pdf
 - spec/lib/grim/image_magick_processor_spec.rb
data/README.textile
DELETED
@@ -1,81 +0,0 @@
-<pre>
-,____
-|---.\
-___ | `
-/ .-\ ./=)
-| |"|_/\/|
-; |-;| /_|
-/ \_| |/ \ |
-/ \/\( |
-| / |` ) |
-/ \ _/ |
-/--._/ \ |
-`/|) | /
-/ | |
-.' | |
-/ \ |
-(_.-.__.__./ /
-</pre>
-
-h1. Grim
-
-Grim is a simple gem for extracting (reaping) a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.
-
-h2. Prerequisites
-
-You will need ghostscript, imagemagick, and poppler installed. On the Mac (OSX) I highly recommend using "Homebrew":http://mxcl.github.com/homebrew/ to get them installed.
-
-<pre><code>
-brew install ghostscript imagemagick poppler
-</code></pre>
-
-h2. Installation
-
-<pre><code>
-gem install grim
-</code></pre>
-
-h2. Usage
-
-<pre><code>
-pdf = Grim.reap("/path/to/pdf") # returns Grim::Pdf instance for pdf
-count = pdf.count # returns the number of pages in the pdf
-png = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not
-text = pdf[3].text # returns text as a String
-
-pdf.each do |page|
-puts page.text
-end
-</pre></code>
-
-We also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).
-
-<pre><code>
-# specifying one processor with specific ImageMagick and GhostScript paths
-Grim.processor = Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/convert", :ghostscript_path => "/path/to/gs"})
-
-# multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs
-Grim.processor = Grim::MultiProcessor.new([
-Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.7/convert", :ghostscript_path => "/path/to/9.04/gs"}),
-Grim::ImageMagickProcessor.new({:imagemagick_path => "/path/to/6.6/convert", :ghostscript_path => "/path/to/9.02/gs"})
-])
-
-pdf = Grim.reap('/path/to/pdf)
-</code></pre>
-
-h2. Reference
-
-* "jonmagic.com: Grim":http://jonmagic.com/blog/archives/2011/09/06/grim/
-* "jonmagic.com: Grim MultiProcessor":http://jonmagic.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/
-
-h2. Contributors
-
-* "@jonmagic":https://github.com/jonmagic
-* "@jnunemaker":https://github.com/jnunemaker
-* "@bryckbost":https://github.com/bryckbost
-* "@bkeepers":https://github.com/bkeepers
-* "@BobaFaux":https://github.com/BobaFaux
-
-h2. License
-
-See LICENSE for details.