combine_pdf 0.1.5 → 0.1.6
- checksums.yaml +4 -4
- data/.gitignore +14 -0
- data/CHANGELOG.md +15 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +22 -0
- data/README.md +117 -0
- data/Rakefile +2 -0
- data/combine_pdf.gemspec +25 -0
- data/lib/combine_pdf.rb +3 -3
- data/lib/combine_pdf/combine_pdf_basic_writer.rb +7 -1
- data/lib/combine_pdf/combine_pdf_fonts.rb +7 -1
- data/lib/combine_pdf/combine_pdf_operations.rb +1 -1
- data/lib/combine_pdf/version.rb +3 -0
- metadata +42 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a3cb89ac3e41d8582aeee1a3805ff69081b3c334
+  data.tar.gz: e0d9e8e57a96d2dc02fe671e3b7cffef7880bfed
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: adbe88e6e502ad73499eff03b69dbece9d046d7147fdec3ca25aa161c7e9b18adc66a76db021b3a5e5585957c9541386442f6cc46f2016724598ec40e44fbcff
+  data.tar.gz: 24be2b53ea8582ebb0cba3e48f4dbd350d67992cc4eb81f32ee48e4a1b1bda4e06c8d6d96a7e4e374097f73cb2a19f7a49b834334faa624cfe195517e60d2fae
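The checksums above can be compared against the files inside an unpacked `.gem` archive. A minimal sketch using Ruby's standard `Digest` library follows; the helper name `checksum_matches?` is hypothetical and is not part of RubyGems or combine_pdf.

```ruby
require 'digest'

# Hypothetical helper (an assumption for illustration): returns true when
# the file at `path` hashes to `expected_hex` under the chosen algorithm.
def checksum_matches?(path, expected_hex, algorithm = :sha512)
  digest_class = { sha1: Digest::SHA1, sha512: Digest::SHA512 }.fetch(algorithm)
  # Digest::Class.file streams the file from disk instead of loading it whole.
  digest_class.file(path).hexdigest == expected_hex.strip.downcase
end
```

Assuming the archive has been unpacked so that `data.tar.gz` sits in the current directory, a call such as `checksum_matches?("data.tar.gz", "e0d9e8e57a96d2dc02fe671e3b7cffef7880bfed", :sha1)` would report whether the file matches the SHA1 value listed for 0.1.6.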
data/.gitignore
ADDED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,15 @@
+#Change Log
+
+***
+
+Change log v.0.1.6
+
+**fix**: added a Mutex to the font library (which is shared by all PDFWriter objects) - fonts are now thread safe (PDF objects are NOT thread safe by design).
+
+**fix**: RTL recognition did not reverse brackets; it should now correctly reverse any of the following: (,),[,],{,},<,>.
+
+**update**: updated license to MIT.
+
+**known issues**: encrypted PDF files can sometimes silently fail (producing empty pages) - this happens on an attempted decrypt. More work should be done to support encrypted PDF files. Please feel free to help.
+
+I use this version in production, where I have control over the PDF files I use. It is better than system calls to pdftk (which can cause all threads in Ruby to hold, effectively causing my web app to hang).
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
+Copyright (c) 2014 Myst
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,117 @@
+# CombinePDF - the ruby way for merging PDF files
+CombinePDF is a nifty module, written in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, watermark them or stamp them (all using the PDF file format and pure Ruby code).
+
+# Install
+
+Install with RubyGems:
+```
+gem install combine_pdf
+```
+
+## Combine/Merge PDF files or Pages
+To combine PDF files (or data):
+```ruby
+pdf = CombinePDF.new
+pdf << CombinePDF.new("file1.pdf") # one way to combine, very fast.
+pdf << CombinePDF.new("file2.pdf")
+pdf.save "combined.pdf"
+```
+Or even a one-liner:
+```ruby
+(CombinePDF.new("file1.pdf") << CombinePDF.new("file2.pdf") << CombinePDF.new("file3.pdf")).save("combined.pdf")
+```
+You can also add just odd or even pages:
+```ruby
+pdf = CombinePDF.new
+i = 0
+CombinePDF.new("file.pdf").pages.each do |page|
+  i += 1
+  pdf << page if i.even?
+end
+pdf.save "even_pages.pdf"
+```
+
+Notice that adding all the pages one by one is slower than adding the whole file.
+## Add content to existing pages (Stamp / Watermark)
+
+To add content to existing PDF pages, first import the new content from an existing PDF file. After that, add the content to each of the pages in your existing PDF.
+
+In this example, we will add a company logo to each page:
+```ruby
+company_logo = CombinePDF.new("company_logo.pdf").pages[0]
+pdf = CombinePDF.new "content_file.pdf"
+pdf.pages.each {|page| page << company_logo} # notice the << operator is on a page and not a PDF object.
+pdf.save "content_with_logo.pdf"
+```
+Notice the << operator is on a page and not a PDF object. The << operator acts differently on PDF objects and on Pages.
+
+The << operator defaults to secure injection by renaming references to avoid conflicts. For overlaying pages using compressed data that might not be editable (due to limited filter support), you can use:
+```ruby
+pdf.pages(nil, false).each {|page| page << stamp_page}
+```
+## Page Numbering
+Adding page numbers to a PDF object or file is as simple as can be:
+```ruby
+pdf = CombinePDF.new "file_to_number.pdf"
+pdf.number_pages
+pdf.save "file_with_numbering.pdf"
+```
+Numbering can be done with many different options, with different formatting, with or without a box object, and even with opacity values - see documentation.
+
+## Loading PDF data
+Loading PDF data can be done from the file system or directly from memory.
+
+Loading data from a file is easy:
+```ruby
+pdf = CombinePDF.new("file.pdf")
+```
+You can also parse PDF files from memory:
+```ruby
+pdf_data = IO.read 'file.pdf' # for this demo, load a file to memory
+pdf = CombinePDF.parse(pdf_data)
+```
+Loading from memory is especially effective for importing PDF data received through the internet or from a different authoring library such as Prawn.
+
+Demo
+====
+
+You can see a Demo for a ["Bates stamping web-app"](http://nameless-gorge-3596.herokuapp.com/bates) and read through its [code](http://nameless-gorge-3596.herokuapp.com/code). Good luck :)
+
+Decryption & Filters
+====================
+
+Some PDF files are encrypted and some are compressed (the use of filters)...
+
+There is very little support for encrypted files and only very basic, limited support for compressed files.
+
+I need help with that.
+
+Comments and file structure
+===========================
+
+If you want to help with the code, please be aware:
+
+I'm a self-taught hobbyist at heart. The documentation is lacking and the comments in the code are poor guidelines.
+
+The code itself should be very straightforward, but feel free to ask whatever you want.
+
+Credit
+======
+
+Caige Nichols wrote an amazing RC4 gem which I used in my code.
+
+I wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.
+
+Credit for his wonderful work is given here. Please respect his license and copyright... and mine.
+
+License
+=======
+MIT
+
+
+
+
+
+
+
+
data/Rakefile
ADDED
data/combine_pdf.gemspec
ADDED
@@ -0,0 +1,25 @@
+# coding: utf-8
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'combine_pdf/version'
+
+Gem::Specification.new do |spec|
+  spec.name = "combine_pdf"
+  spec.version = CombinePdf::VERSION
+  spec.authors = ["Boaz Segev"]
+  spec.email = ["We try, we fail, we do, we are"]
+  spec.summary = %q{Combine, stamp and watermark PDF files in pure Ruby.}
+  spec.description = %q{A nifty gem, in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, number the pages, watermark them or stamp them, create tables or basic text objects etc` (all using the PDF file format).}
+  spec.homepage = "https://github.com/boazsegev/combine_pdf"
+  spec.license = "MIT"
+
+  spec.files = `git ls-files -z`.split("\x0")
+  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
+  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
+  spec.require_paths = ["lib"]
+
+  spec.add_runtime_dependency 'ruby-rc4', '>= 0.1.5'
+
+  spec.add_development_dependency "bundler", "~> 1.7"
+  spec.add_development_dependency "rake", "~> 10.0"
+end
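The gemspec mixes a plain lower bound (`>= 0.1.5`) with pessimistic constraints (`~> 1.7`, `~> 10.0`). As a rough illustration of what those strings accept, the sketch below asks RubyGems directly; `satisfies?` is a hypothetical helper name, and the version numbers passed to it are arbitrary examples.

```ruby
require 'rubygems'

# Hypothetical helper (an assumption for illustration): true when `version`
# satisfies the RubyGems constraint string `constraint`.
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts satisfies?('~> 1.7', '1.9.2')    # "~> 1.7" means >= 1.7 and < 2.0
puts satisfies?('~> 1.7', '2.0.0')    # outside the range
puts satisfies?('>= 0.1.5', '0.1.5')  # a plain lower bound includes itself
```

The pessimistic operator lets the last listed segment grow while holding the earlier ones fixed, so `~> 10.0` would accept any 10.x release of rake but not 11.0.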
data/lib/combine_pdf.rb
CHANGED
@@ -1,5 +1,4 @@
 # -*- encoding : utf-8 -*-
-# use under GPLv3 terms only
 
 require 'zlib'
 require 'securerandom'
@@ -16,6 +15,7 @@ load "combine_pdf/combine_pdf_fonts.rb"
 load "combine_pdf/combine_pdf_filter.rb"
 load "combine_pdf/combine_pdf_parser.rb"
 load "combine_pdf/combine_pdf_pdf.rb"
+require "combine_pdf/version"
 
 
 # This is a pure ruby library to combine/merge, stamp/overlay and number PDF files - as well as to create tables (meant for indexing combined files).
@@ -101,7 +101,7 @@ load "combine_pdf/combine_pdf_pdf.rb"
 #
 # == License
 #
-#
+# MIT
 module CombinePDF
   module_function
 
@@ -288,7 +288,7 @@ end
 
 #########################################################
 # this file is part of the CombinePDF library and the code
-# is subject to the same license (
+# is subject to the same license (MIT).
 #########################################################
 # PDF object types cross reference:
 # Indirect objects, references, dictionaries and streams are Hash
data/lib/combine_pdf/combine_pdf_basic_writer.rb
CHANGED
@@ -403,6 +403,11 @@ module CombinePDF
 # ...still, it works (I think).
 def reorder_rtl_content text
   rtl_characters = "\u05d0-\u05ea\u05f0-\u05f4\u0600-\u06ff\u0750-\u077f"
+  rtl_replaces = { '(' => ')', ')' => '(',
+    '[' => ']', ']'=>'[',
+    '{' => '}', '}'=>'{',
+    '<' => '>', '>'=>'<',
+  }
   return text unless text =~ /[#{rtl_characters}]/
 
   out = []
@@ -412,10 +417,11 @@ module CombinePDF
       out.unshift scanner.matched
     elsif scanner.scan /[^#{rtl_characters}]+/
       if out.empty? && scanner.matched.match(/[\s]$/) && !scanner.eos?
-        warn "MOVING SPACE: #{scanner.matched}"
         white_space_to_move = scanner.matched.match(/[\s]+$/).to_s
         out.unshift scanner.matched[0..-1-white_space_to_move.length]
         out.unshift white_space_to_move
+      elsif scanner.matched.match /^[\(\)\[\]\{\}\<\>]$/
+        out.unshift rtl_replaces[scanner.matched]
       else
         out.unshift scanner.matched
       end
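The bracket fix can be illustrated in isolation. The sketch below is not the library's `reorder_rtl_content` method; it is a simplified, hypothetical helper that reverses a list of already-split tokens and mirrors any token that is a single bracket, using the same pairs as the `rtl_replaces` Hash in the hunk.

```ruby
# Mirror table with the same pairs as rtl_replaces in the hunk.
MIRRORED = {
  '(' => ')', ')' => '(',
  '[' => ']', ']' => '[',
  '{' => '}', '}' => '{',
  '<' => '>', '>' => '<'
}.freeze

# Hypothetical helper (not part of CombinePDF): reverses the order of
# `tokens` for right-to-left display and swaps each lone bracket token
# for its mirror image, so pairs still open and close correctly.
def reverse_with_mirroring(tokens)
  tokens.reverse.map { |token| MIRRORED.fetch(token, token) }
end

puts reverse_with_mirroring(['(', 'abc', ')']).join
```

Without the mirroring step, reversing `(`, `abc`, `)` would yield `)abc(`; with it, the brackets still enclose the text after the order is flipped.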
data/lib/combine_pdf/combine_pdf_fonts.rb
CHANGED
@@ -103,7 +103,9 @@ module CombinePDF
   new_font.cmap = font_cmap
   new_font[:is_reference_only] = true
   new_font[:referenced_object] = font_pdf_object
-
+  FONTS_LIBRARY_MUTEX.synchronize do
+    FONTS_LIBRARY[new_font.name] = new_font
+  end
   new_font
 end
 
@@ -342,6 +344,10 @@ module CombinePDF
 # the Hash listing all the fonts.
 FONTS_LIBRARY = {}
 
+# the Mutex for library write access
+
+FONTS_LIBRARY_MUTEX = Mutex.new
+
 
 # this method parses a cmap file using its data stream
 # FIXME:
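The pattern in this hunk, a shared Hash whose writes go through one Mutex, can be tried on its own. The sketch below uses hypothetical stand-in names (`REGISTRY`, `REGISTRY_MUTEX`, `register`) rather than the library's constants, and only shows writes being serialized, which is what the hunk itself wraps.

```ruby
# Hypothetical stand-ins for the shared font Hash and its lock.
REGISTRY = {}
REGISTRY_MUTEX = Mutex.new

# Stores `value` under `name`; the write runs inside the lock so that
# concurrent writers cannot interleave while the Hash is being modified.
def register(name, value)
  REGISTRY_MUTEX.synchronize { REGISTRY[name] = value }
  value
end

# Eight threads each register 100 distinct keys.
threads = 8.times.map do |t|
  Thread.new { 100.times { |i| register("font-#{t}-#{i}", i) } }
end
threads.each(&:join)

puts REGISTRY.size # 800 distinct keys were written
```

Keeping the critical section down to the single Hash assignment, as the hunk does, means threads spend very little time waiting on the lock.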
data/lib/combine_pdf/combine_pdf_operations.rb
CHANGED
@@ -392,7 +392,7 @@ end
 
 #########################################################
 # this file is part of the CombinePDF library and the code
-# is subject to the same license (
+# is subject to the same license (MIT).
 #########################################################
 # PDF object types cross reference:
 # Indirect objects, references, dictionaries and streams are Hash
metadata
CHANGED
@@ -1,15 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: combine_pdf
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.6
 platform: ruby
 authors:
 - Boaz Segev
-- Masters of the open source community
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-
+date: 2014-10-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: ruby-rc4
@@ -25,14 +24,50 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 0.1.5
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.7'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.7'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.0'
 description: A nifty gem, in pure Ruby, to parse PDF files and combine (merge) them
   with other PDF files, number the pages, watermark them or stamp them, create tables
   or basic text objects etc` (all using the PDF file format).
-email:
+email:
+- We try, we fail, we do, we are
 executables: []
 extensions: []
 extra_rdoc_files: []
 files:
+- ".gitignore"
+- CHANGELOG.md
+- Gemfile
+- LICENSE.txt
+- README.md
+- Rakefile
+- combine_pdf.gemspec
 - lib/combine_pdf.rb
 - lib/combine_pdf/combine_pdf_basic_writer.rb
 - lib/combine_pdf/combine_pdf_decrypt.rb
@@ -41,9 +76,10 @@ files:
 - lib/combine_pdf/combine_pdf_operations.rb
 - lib/combine_pdf/combine_pdf_parser.rb
 - lib/combine_pdf/combine_pdf_pdf.rb
+- lib/combine_pdf/version.rb
 homepage: https://github.com/boazsegev/combine_pdf
 licenses:
--
+- MIT
 metadata: {}
 post_install_message:
 rdoc_options: []
@@ -53,7 +89,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version:
+      version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
@@ -66,4 +102,3 @@ signing_key:
 specification_version: 4
 summary: Combine, stamp and watermark PDF files in pure Ruby.
 test_files: []
-has_rdoc: