combine_pdf 0.2.5 → 0.2.37
- checksums.yaml +4 -4
- data/.gitignore +2 -0
- data/CHANGELOG.md +273 -27
- data/LICENSE.txt +2 -1
- data/README.md +69 -4
- data/lib/combine_pdf/api.rb +156 -153
- data/lib/combine_pdf/basic_writer.rb +41 -53
- data/lib/combine_pdf/decrypt.rb +238 -228
- data/lib/combine_pdf/exceptions.rb +4 -0
- data/lib/combine_pdf/filter.rb +79 -85
- data/lib/combine_pdf/fonts.rb +451 -462
- data/lib/combine_pdf/page_methods.rb +891 -946
- data/lib/combine_pdf/parser.rb +663 -531
- data/lib/combine_pdf/pdf_protected.rb +341 -126
- data/lib/combine_pdf/pdf_public.rb +492 -454
- data/lib/combine_pdf/renderer.rb +146 -141
- data/lib/combine_pdf/version.rb +1 -2
- data/lib/combine_pdf.rb +14 -18
- data/test/automated +132 -0
- data/test/console +4 -4
- data/test/named_dest +84 -0
- metadata +8 -5
- data/lib/combine_pdf/operations.rb +0 -416
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: cac27b28f3653156374b1ea4a429676625ba0c9f
+  data.tar.gz: 8ce9f60a9bdcbd763a72461703c51845dbab0f2c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 78aa47281a6f9fa5723a99ed9ce666479999348b69a19778e02eb2144e1c507d4a62ac23a07e4b46079193d14fecf77cbb9dac06591ac2354e92046ba0ba5d20
+  data.tar.gz: 2b92948efba5ab031a46865416b13b1aecb892a1763c0d45190598313e6e0139907ece00c129349060ff030c7b38531d1d848f9696168d5fec499a4c65121db8
data/CHANGELOG.md
CHANGED
@@ -1,8 +1,254 @@
|
|
1
|
-
#Change Log
|
1
|
+
# Change Log
|
2
2
|
|
3
3
|
***
|
4
4
|
|
5
|
-
Change log v.0.2.
|
5
|
+
#### Change log v.0.2.37 (Release Candidate)
|
6
|
+
|
7
|
+
**Fix**: Fixed `Page_Methods#textbox` default `:x`,`:y` to allow for non-zero/cropped page origin. Credit to @donnguyen for exposing the issue.
|
8
|
+
|
9
|
+
**Fix**: Fix typo on Parser error message for general parser error. Credit to @axlekb.
|
10
|
+
|
11
|
+
***
|
12
|
+
|
13
|
+
#### Change log v.0.2.36
|
14
|
+
|
15
|
+
**Fix**: Fix for [issue #104](https://github.com/boazsegev/combine_pdf/issues/104). Credit to @tomascharad for exposing the issue.
|
16
|
+
|
17
|
+
**Release**: This gem had been using a development versioning scheme for far too long. The API is stable enough to switch to a production versioning scheme. This version is expected to be the last 0.x version. Assuming this version will be stable enough, it is expected to be re-released as v.1.0.
|
18
|
+
|
19
|
+
|
20
|
+
***
|
21
|
+
|
22
|
+
#### Change log v.0.2.35
|
23
|
+
|
24
|
+
**Update**: Updated / upgraded our RC4 and AES PDF encryption support (for non-password protected PDFs). Credit to Gyuchang Jun (@gyuchang) for his work on providing CombinePDF with this extra encryption support. I have no idea what magic he used to make this happen, but it's beautiful!
|
25
|
+
|
26
|
+
|
27
|
+
***
|
28
|
+
|
29
|
+
#### Change log v.0.2.34
|
30
|
+
|
31
|
+
**Fix**: [fixed issue #44 for wkhtmltopdf compatibility](https://github.com/boazsegev/combine_pdf/issues/44) and PDF v.1.2 use of named destinations. Credit to Devin Wadsworth (@daymun) for exposing the issue.
|
32
|
+
|
33
|
+
***
|
34
|
+
|
35
|
+
#### Change log v.0.2.33
|
36
|
+
|
37
|
+
**Update**: Fix #97 to allow javascript support for interactive objects. Credit to @joshirashmics for exposing the issue.
|
38
|
+
|
39
|
+
**Update**: Extended "named tree" support now preserves some advanced PDF features that weren't supported before.
|
40
|
+
|
41
|
+
**Deprecation**: Ruby is deprecating `Fixnum`, and so is CombinePDF... Replaced all `Fixnum` occurrences with `Integer`.
|
42
|
+
|
43
|
+
***
|
44
|
+
|
45
|
+
#### Change log v.0.2.32
|
46
|
+
|
47
|
+
**Update**: Better errors when encryption related exceptions occur. Credit to Paul Shumeika ( @pshumeika ).
|
48
|
+
|
49
|
+
**Fix**: Fixed an issue where empty pages with a NULL contents value would cause CombinePDF to raise an exception when rendering. Credit to @holtmaat and Jason DeLeon (@progmem) (both submitted different PRs regarding the issue).
|
50
|
+
|
51
|
+
***
|
52
|
+
|
53
|
+
#### Change log v.0.2.31
|
54
|
+
|
55
|
+
**Broke**: Broke the fix for issue #65 so that Radio buttons data might be lost... working on a fix.
|
56
|
+
|
57
|
+
**Fix**: Fixed issue #82 (reintroduction of issue #19 due to core engine rewrite) related to a workaround for an issue with AcrobatReader. Credit to @gyuchang for testing and helping with the fix.
|
58
|
+
|
59
|
+
**Merge**: Merged pull request #80, fixing an issue with byte decoding. Credit to @gyuchang for the PR.
|
60
|
+
|
61
|
+
**Performance**: Improved performance for the reference and duplicate object resolution. Credit to @gyuchang for pointing out some optimization options.
|
62
|
+
|
63
|
+
***
|
64
|
+
|
65
|
+
#### Change log v.0.2.30
|
66
|
+
|
67
|
+
**Fix**: Fixed an issue where HTTP artifacts before the beginning of a PDF file / string would prevent the PDF from being parsed. This should fix issue #78 reported by @robvitaro.
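The idea behind this fix can be sketched in a few lines of plain Ruby. This is only an illustration, assuming the artifacts are stray bytes (such as response headers) sitting in front of the `%PDF-` header; `strip_leading_artifacts` is a hypothetical helper, not a CombinePDF method, and the parser's real logic may differ.

```ruby
# Hypothetical helper (not part of CombinePDF's API): drop any bytes that
# precede the "%PDF-" header, such as leftover HTTP response headers.
def strip_leading_artifacts(data)
  # Work on a binary copy so non-UTF-8 bytes in the PDF body are harmless.
  data = data.dup.force_encoding(Encoding::BINARY)
  start = data.index('%PDF-')
  raise ArgumentError, 'no PDF header found' if start.nil?

  data[start..-1]
end

raw = "HTTP/1.1 200 OK\r\nContent-Type: application/pdf\r\n\r\n%PDF-1.4\n%%EOF\n"
clean = strip_leading_artifacts(raw)
puts clean.start_with?('%PDF-1.4')
```

Raising when no header is found keeps the failure loud, in line with the "err on the side of caution" approach described in the README.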
|
68
|
+
|
69
|
+
***
|
70
|
+
|
71
|
+
#### Change log v.0.2.29
|
72
|
+
|
73
|
+
**Fix**: Fixed an issue where updating a page's rotation might raise a `NoMethodError` exception. Credit to Danny (@dikond) both for discovering the issue and for PR #77 that fixes this.
|
74
|
+
|
75
|
+
***
|
76
|
+
|
77
|
+
#### Change log v.0.2.28
|
78
|
+
|
79
|
+
**Fix**: Fixed an issue related to page stamping, which was introduced when the Rubocop beautification changed the logic of an `if` statement in the Resource merger. Credit to Leon Miller-Out (@sbleon) for noticing the issue, testing and opening PR #76.
|
80
|
+
|
81
|
+
***
|
82
|
+
|
83
|
+
#### Change log v.0.2.27
|
84
|
+
|
85
|
+
**Fix**: Fixed an issue where a `nil` outline count would cause PDF merger to fail.
|
86
|
+
|
87
|
+
**Fix**: Fixed an issue where `nil` data would cause the named destination rebuilding process to quit early, leaving some of the data unprocessed. Credit to Stefan Leitner (@sLe1tner) for exposing the issue.
|
88
|
+
|
89
|
+
**Feature**: PDF outlines are now merged and named destination links are preserved (both in the outlines and the page content). Credit to Stefan Leitner (@sLe1tner) for this feature.
|
90
|
+
|
91
|
+
***
|
92
|
+
|
93
|
+
#### Change log v.0.2.26
|
94
|
+
|
95
|
+
**Fix**: Merged PR #72, fixing a typo in the parser that caused incorrect byte substitution to corrupt certain PDF data (adversely affecting encrypted PDFs). Credit to Gyuchang Jun (@gyuchang) for the fix.
|
96
|
+
|
97
|
+
***
|
98
|
+
|
99
|
+
#### Change log v.0.2.25
|
100
|
+
|
101
|
+
**Fix**: Fixed issue #71, where merging PDF outlines that exist but have 0 entries would fail and raise an exception. Credit to @Kagetsuki for exposing the issue.
|
102
|
+
|
103
|
+
***
|
104
|
+
|
105
|
+
#### Change log v.0.2.24
|
106
|
+
|
107
|
+
**Fix**: Fixed an issue with PDF Catalog and PDF Page property inheritance that could cause corrupted PDF output (invalid PDF data). Credit to @Kagetsuki for opening an issue that led to this discovery.
|
108
|
+
|
109
|
+
**Fix**: Fixed an issue with the parser where (ignored) empty strings would cause incorrect alignment when converting PDF dictionary objects from an Array to a Hash, mixing up keys and values. Credit to @Kagetsuki for opening an issue that led to this discovery.
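To see why a dropped empty string matters, here is a small stand-alone illustration. It assumes the parser collects a dictionary as a flat key, value, key, value token list; `pair_tokens` and the sample tokens are made up for the example and are not CombinePDF internals.

```ruby
# Illustration only; these names are not CombinePDF's API.
# Pair up a flat token list (key, value, key, value, ...) as a parser might
# collect it for a PDF dictionary such as << /Title () /Author (Jane) >>.
def pair_tokens(tokens)
  tokens.each_slice(2).to_a
end

tokens = [:Title, '', :Author, 'Jane']

# With every token kept, keys and values line up.
p pair_tokens(tokens)

# If the empty string is dropped first, each later token shifts one slot and
# :Author ends up paired as the value of :Title.
p pair_tokens(tokens.reject { |t| t == '' })
```

The second call leaves a dangling one-element pair, which is the kind of misalignment the fix avoids by keeping empty strings in the token list.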
|
110
|
+
|
111
|
+
**Fix**: more fixes and refinements to the PDF Names dictionary with better named destination support and document navigation support.
|
112
|
+
|
113
|
+
***
|
114
|
+
|
115
|
+
#### Change log v.0.2.23
|
116
|
+
|
117
|
+
**Fix**: fixed an issue introduced in v.0.2.22, where name dictionary conflict resolution would result in corrupted PDF files. The issue was caused because the name conflict resolution wasn't updated to handle the changes in the new reference linking algorithm used by the parser. During this fix, the whole name dictionary algorithm was re-written, providing better support for named destinations, links and (future feature) ToCs. Credit to Kevin Shen (@kevshin2) for exposing the issue.
|
118
|
+
|
119
|
+
***
|
120
|
+
|
121
|
+
#### Change log v.0.2.22 (yanked)
|
122
|
+
|
123
|
+
**Fix**: fixed an issue with PDF font importing (registering).
|
124
|
+
|
125
|
+
**Fix**: fixed issue #65 where some form data (radio buttons) could be lost. Credit to @joshirashmics for exposing the issue.
|
126
|
+
|
127
|
+
**Fix**: fixed an issue where empty names would be ignored by the parser (who knew they existed...).
|
128
|
+
|
129
|
+
**Fix**: Possible fix for issue #66 (similar to PR #61)... Credit to Serafeim Maroulis (@Reyko) and Kevin Shen (@kevshin2) for exposing the issue.
|
130
|
+
|
131
|
+
**Update**: Rewrote some internal algorithms, avoiding recursive logic and optimizing against excessive stack stress.
|
132
|
+
|
133
|
+
**Feature**: Credit to Joel Williams (@joelw) for providing `CombinePDF.load` and `CombinePDF.parse` customization, allowing optional content errors to be ignored - taking the risk of a corrupt PDF instead of raising an exception (hey, loading PDF data with optional content sometimes works).
|
134
|
+
|
135
|
+
***
|
136
|
+
|
137
|
+
#### Change log v.0.2.21
|
138
|
+
|
139
|
+
**Fix**: fix for issue #54 and #59 (duplicate), discovered by @iggant (Anton Kolodii), related to name conflict resolution and page resources. The issue would cause an error (exception) to occur when attempting to merge pages with specific resource structures. Credit to @cw6365 (Chris Ward) and @DenKey (Den) as well.
|
140
|
+
|
141
|
+
***
|
142
|
+
|
143
|
+
#### Change log v.0.2.20
|
144
|
+
|
145
|
+
**Fix**: fix for issue #56, discovered by @LeptonHeavy, regarding errors caused by the new PDF form support feature.
|
146
|
+
|
147
|
+
***
|
148
|
+
|
149
|
+
#### Change log v.0.2.19 (yanked)
|
150
|
+
|
151
|
+
**Partial fix**: unconfirmed fix for issue #56, discovered by @LeptonHeavy, regarding errors caused by the new PDF form support feature.
|
152
|
+
|
153
|
+
***
|
154
|
+
|
155
|
+
#### Change log v.0.2.18 (yanked)
|
156
|
+
|
157
|
+
**Feature**: added minor (read: initial and incomplete) PDF forms support, in an attempt to preserve form data when combining PDF files.
|
158
|
+
|
159
|
+
***
|
160
|
+
|
161
|
+
#### Change log v.0.2.17
|
162
|
+
|
163
|
+
**Feature**: added the `page#crop` method to easily crop a PDF file in accordance with the GWG industry association recommendations (updating the `MediaBox` property rather than the `CropBox`). Credit to @wingleungchoi for this feature.
|
164
|
+
|
165
|
+
***
|
166
|
+
|
167
|
+
#### Change log v.0.2.16
|
168
|
+
|
169
|
+
**Fix**: Fix for issue #49 where specific PDF files containing junk data after the %%EOF marker couldn't be opened (as they were invalid files). The issue was fixed by scanning any trailing data before continuing to parse any PDF file beyond the first %%EOF markers (multiple markers are common when using the PDF format). Credit to @wingleungchoi for providing an example for the issue.
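A simplified way to picture the problem: anything after the final `%%EOF` marker is junk, while earlier markers are legitimate. The sketch below simply cuts the data after the last marker. That is a deliberately cruder technique than the trailing-data scan described above, and `trim_after_last_eof` is a hypothetical name, not a CombinePDF method.

```ruby
# Hypothetical helper (not CombinePDF's parser): keep everything up to and
# including the LAST "%%EOF" marker and discard whatever follows it.
EOF_MARKER = '%%EOF'

def trim_after_last_eof(data)
  data = data.dup.force_encoding(Encoding::BINARY)
  last = data.rindex(EOF_MARKER)
  return data if last.nil? # no marker at all: leave the data untouched

  data[0, last + EOF_MARKER.length]
end

# Two markers (as after an incremental update) followed by junk bytes.
sample = "%PDF-1.4\nbody\n%%EOF\nupdate\n%%EOF\r\n<html>junk</html>"
puts trim_after_last_eof(sample).end_with?(EOF_MARKER)
```

Using `rindex` rather than `index` matters here, because stopping at the first marker would throw away any incremental updates stored after it.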
|
170
|
+
|
171
|
+
***
|
172
|
+
|
173
|
+
#### Change log v.0.2.15
|
174
|
+
|
175
|
+
**Fix**: Fix for issue #22 where specific PDF files with nested references could cause page stamping to fail, raising an exception. Credit to @tomascharad for finding the issue.
|
176
|
+
|
177
|
+
***
|
178
|
+
|
179
|
+
#### Change log v.0.2.14
|
180
|
+
|
181
|
+
**Fix**: Fix for issue #39, where certain comments could have caused the object after the comments to be ignored, resulting in parsing errors. Credit to @lgn21st for identifying the issue.
|
182
|
+
|
183
|
+
***
|
184
|
+
|
185
|
+
#### Change log v.0.2.13
|
186
|
+
|
187
|
+
**Fix**: fixed issue #37 reported by @sega (thank you for reporting!), regarding the inability to stamp one PDF page over another when one PDF page used a resource directory populated with data and another page used a resource directory populated with references. This is now resolved by checking for references before merging the data.
|
188
|
+
|
189
|
+
**Compatibility**: fixed an issue where PrimoPDF would omit the required EOL marker before the `endstream` keyword. This would cause malformed PDF files to be written and it is now resolved by allowing the required EOL to be optional.
|
190
|
+
|
191
|
+
**Minor**: a minor improvement on the compatibility fix related to salvaging PDF data that was misplaced within a PDF comment. This improvement relates to the possibility that there might not be an EOL marker after the `obj` keyword (PaperPort does use an EOL after the `obj` keyword, so this isn't critical).
|
192
|
+
|
193
|
+
***
|
194
|
+
|
195
|
+
#### Change log v.0.2.12
|
196
|
+
|
197
|
+
**Compatibility**: fixed issue #36 reported by @vitstradal (thank you for reporting!) regarding PDF files composed by PaperPort. PaperPort (at least version 12) has an issue where PDF data will be placed within a PDF comment. PDF comments start with a "%" sign and end with an EOL marker ("\r" or "\n"). PaperPort omitted the EOL marker, placing critical data within the comment. A work-around was found by parsing the comment's data and attempting to salvage the misplaced data. This workaround assumes that comments would not contain PDF parsable data at the very end of the comment's line... which is an unsafe assumption. Hence, **please let me know if you find _any_ PDF files that worked before the workaround was introduced and no longer do**.
|
198
|
+
|
199
|
+
***
|
200
|
+
|
201
|
+
#### Change log v.0.2.11
|
202
|
+
|
203
|
+
**Fix**: fix for issue #35 , which was caused by the broken fix for issue #34. Credit to Davek Rupinski for pointing out the issue.
|
204
|
+
|
205
|
+
***
|
206
|
+
|
207
|
+
#### Change log v.0.2.10
|
208
|
+
|
209
|
+
**Fix**: fixed page stamping when the page's content was a referenced object instead of a direct array of content references. Credit to vitstradal for discovering the issue.
|
210
|
+
|
211
|
+
***
|
212
|
+
|
213
|
+
#### Change log v.0.2.9
|
214
|
+
|
215
|
+
**Fix** hopefully fixed issue #33 ([NoMethodError undefined method `[]` for nil:NilClass](https://github.com/boazsegev/combine_pdf/issues/33)).
|
216
|
+
|
217
|
+
***
|
218
|
+
|
219
|
+
#### Change log v.0.2.8
|
220
|
+
|
221
|
+
* **Fix/Feature**: (related to [issue #32](https://github.com/boazsegev/combine_pdf/issues/32))
|
222
|
+
|
223
|
+
Experience shows that it's very difficult to know when to use `page.copy` vs. `page.copy(true)` before stamping one PDF page on top of (or under) another... So...
|
224
|
+
|
225
|
+
Now there is no longer any need for the guesswork. The process is automated for you.
|
226
|
+
|
227
|
+
The moment CombinePDF recognizes a resource name conflict between two pages (such as both pages using one font name to reference two different fonts), CombinePDF will intrusively rename the incoming page's resources.
|
228
|
+
|
229
|
+
It is true that the intrusive resource renaming is somewhat risky and might require the inflation of some compressed page data (resulting in bigger file sizes), but this is the only way to try to prevent PDF data corruption.
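The renaming idea can be sketched with plain hashes standing in for page resource dictionaries. This is a rough illustration under that assumption; `merge_resources`, the `_1` suffix scheme and the font names are all hypothetical and do not reflect how CombinePDF names or stores resources internally.

```ruby
# Hypothetical sketch: merge an incoming page's font table into an existing
# one, renaming incoming entries whose name is already used by another font.
# Returns the merged table plus an old-name => new-name map of the renames.
def merge_resources(existing, incoming)
  merged = existing.dup
  renames = {}
  incoming.each do |name, font|
    target = name
    if merged.key?(name) && merged[name] != font
      # Conflict: pick the first free "name_N" slot for the incoming font.
      suffix = 1
      suffix += 1 while merged.key?("#{name}_#{suffix}")
      target = "#{name}_#{suffix}"
      renames[name] = target
    end
    merged[target] = font
  end
  [merged, renames]
end

base  = { 'F1' => 'Helvetica' }
stamp = { 'F1' => 'Times-Roman', 'F2' => 'Courier' }
merged, renames = merge_resources(base, stamp)
p merged
p renames
```

In a real PDF the rename map would also have to be applied to the incoming page's content streams, which is presumably why compressed page data may need to be inflated first.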
|
230
|
+
|
231
|
+
***
|
232
|
+
|
233
|
+
#### Change log v.0.2.7
|
234
|
+
|
235
|
+
**Fix**: Fixed an issue where a malformed PDF String could cause the parser to hang.
|
236
|
+
|
237
|
+
**Update**: Inner PDF links (links to pages within the PDF file) will now be preserved when importing a whole PDF (although Outlines, for now, are discarded and their related links will be discarded as well). If the same destination page is inserted more than once, the first version will be preferred.
|
238
|
+
|
239
|
+
**Deprecation Warning**: the `Page_Methods#secure_injection`, `Page_Methods#make_unsecure` and `Page_Methods#make_secure` methods are deprecated. Use `Page_Methods#copy(true)` for safeguarding against font/resource conflicts when "stamping" one PDF page over another.
|
240
|
+
|
241
|
+
***
|
242
|
+
|
243
|
+
#### Change log v.0.2.6
|
244
|
+
|
245
|
+
**fixed**: Hasan Iskandar fixed issue #30 - Output file cannot be saved from Adobe Reader with "Save As optimizes for Fast Web View" preference enabled. Thank you Hasan.
|
246
|
+
|
247
|
+
**update**: More parsing error detection; Updated the endstream EOL marker identifier for safer identification.
|
248
|
+
|
249
|
+
***
|
250
|
+
|
251
|
+
#### Change log v.0.2.5
|
6
252
|
|
7
253
|
**feature**: circumvents an issue with 'wkhtmltopdf', where sometimes the `endobj` keyword would be missing, causing malformed PDF data. The parser will now attempt to auto-fix any `endobj` missing keywords.
|
8
254
|
|
@@ -10,7 +256,7 @@ Change log v.0.2.5
|
|
10
256
|
|
11
257
|
***
|
12
258
|
|
13
|
-
Change log v.0.2.4
|
259
|
+
#### Change log v.0.2.4
|
14
260
|
|
15
261
|
**fixed**: Fixed the default page sizes which weren't as described in the documentation and now default to US Letter. The documentation was also fixed. No major version bump is declared since the defaults were faulty and weren't as described (fixed a bug, not changed the API).
|
16
262
|
|
@@ -18,7 +264,7 @@ Change log v.0.2.4
|
|
18
264
|
|
19
265
|
***
|
20
266
|
|
21
|
-
Change log v.0.2.3
|
267
|
+
#### Change log v.0.2.3
|
22
268
|
|
23
269
|
**update**: a better general error message for CombinePDF.new
|
24
270
|
|
@@ -30,7 +276,7 @@ Change log v.0.2.3
|
|
30
276
|
|
31
277
|
***
|
32
278
|
|
33
|
-
Change log v.0.2.2
|
279
|
+
#### Change log v.0.2.2
|
34
280
|
|
35
281
|
**fix**: fixed the default value for the :location attribute of PDF#stamp_pages(String, options). Now, instead of the default stamp being written at [:top, :bottom], its default location will be set to [:center].
|
36
282
|
|
@@ -38,7 +284,7 @@ Change log v.0.2.2
|
|
38
284
|
|
39
285
|
***
|
40
286
|
|
41
|
-
Change log v.0.2.1
|
287
|
+
#### Change log v.0.2.1
|
42
288
|
|
43
289
|
**fix**: better page stamping... or, at least more secure (we hope).
|
44
290
|
|
@@ -50,7 +296,7 @@ Change log v.0.2.1
|
|
50
296
|
|
51
297
|
***
|
52
298
|
|
53
|
-
Change log v.0.2.0
|
299
|
+
#### Change log v.0.2.0
|
54
300
|
|
55
301
|
Refactoring of code and API overhaul.
|
56
302
|
|
@@ -62,31 +308,31 @@ Any code relying on inner/advanced API calls might be broken.
|
|
62
308
|
|
63
309
|
***
|
64
310
|
|
65
|
-
Change log v.0.1.23
|
311
|
+
#### Change log v.0.1.23
|
66
312
|
|
67
313
|
**fix**: @kruszczynski fixed an issue with CombinePDF::PDF#number_pages where the page numbering margins were ignored and only the default values were used. Thank you @kruszczynski .
|
68
314
|
|
69
315
|
***
|
70
316
|
|
71
|
-
Change log v.0.1.22
|
317
|
+
#### Change log v.0.1.22
|
72
318
|
|
73
319
|
**fix**: a tested fix for issue #19, where Acrobat Reader would raise an error if page objects in the Catalog were copied by reference instead of copied in full and each assigned a different, unique object id (possibly an Acrobat Reader issue workaround). The issue was resolved by exempting page objects from the duplication reduction algorithm, and in this way, forcing duplicates to be copied rather than referenced in the Catalog object.
|
74
320
|
|
75
321
|
***
|
76
322
|
|
77
|
-
Change log v.0.1.21
|
323
|
+
#### Change log v.0.1.21
|
78
324
|
|
79
325
|
**fix**: an attempted fix for issue #19, where the xref table wasn't read on Acrobat Reader, probably due to a double EOL marker at the end of each entry.
|
80
326
|
|
81
327
|
***
|
82
328
|
|
83
|
-
Change log v.0.1.20
|
329
|
+
#### Change log v.0.1.20
|
84
330
|
|
85
331
|
**fix**: due to some PDF files not conforming to the required EOL marker in the endstream object specifications, the parser is now back to a non-strict parsing mode for PDF Stream Objects. Conforming files weren't found to be affected and although it is unlikely, it is possible that they might be affected if the stream object contained the 'endstream' keyword without the required EOL marker and without intending to end the stream object.
|
86
332
|
|
87
333
|
***
|
88
334
|
|
89
|
-
Change log v.0.1.19
|
335
|
+
#### Change log v.0.1.19
|
90
336
|
|
91
337
|
**fix**: merged @espinosa's fix for issue #16 which affected windows machines.
|
92
338
|
|
@@ -96,13 +342,13 @@ Change log v.0.1.19
|
|
96
342
|
|
97
343
|
***
|
98
344
|
|
99
|
-
Change log v.0.1.18
|
345
|
+
#### Change log v.0.1.18
|
100
346
|
|
101
|
-
**fix**: Thanks to Stefan, who reported issue #15 , we discovered that in some cases PDF files presented the wrong PDF standard version, causing an error while attempting to parse their data. The issue has been fixed by allowing the parser to search for PDF Object Streams even when the PDF file claims a PDF version below 1.5.
|
347
|
+
**fix**: Thanks to Stefan, who reported issue #15 , we discovered that in some cases PDF files presented the wrong PDF standard version, causing an error while attempting to parse their data. The issue has been fixed by allowing the parser to search for PDF Object Streams even when the PDF file claims a PDF version below 1.5.
|
102
348
|
|
103
349
|
***
|
104
350
|
|
105
|
-
Change log v.0.1.17
|
351
|
+
#### Change log v.0.1.17
|
106
352
|
|
107
353
|
**feature**: Although it was possible to create and add empty PDF pages (at any location), it is now even easier with one method call to add empty pages at the end of a PDF object. It's also possible to add text to these empty pages or stamp them with different content.
|
108
354
|
|
@@ -112,19 +358,19 @@ Change log v.0.1.17
|
|
112
358
|
|
113
359
|
***
|
114
360
|
|
115
|
-
Change log v.0.1.16
|
361
|
+
#### Change log v.0.1.16
|
116
362
|
|
117
363
|
**fix?**: Compatibility reports came in showing that some email servers convert new-line (\n) characters to CRLF (\r\n) - corrupting the binary code in the PDF files. This version attempts to fix this by adding more binary characters to the first comment line of the PDF file (right after the header). Most email programs and Antivirus programs should preserve the original EOL character once they recognize the file as binary.
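For readers unfamiliar with the trick: the PDF specification recommends that the header line be followed by a comment line containing at least four bytes with values of 128 or more, so that transfer tools treat the file as binary. Here is a minimal sketch of such a header, where `pdf_header` and the specific marker bytes are assumptions for illustration and not what CombinePDF actually writes.

```ruby
# Hypothetical helper, not CombinePDF's actual writer code.
# Builds a PDF header followed by a comment line of high-bit bytes, which
# hints to mail servers and similar tools that the file is binary.
def pdf_header(version = '1.5')
  # Five bytes with the high bit set; the exact byte values are an assumption.
  marker = [0xE2, 0xE3, 0xCF, 0xD3, 0xFF].pack('C*')
  "%PDF-#{version}\n%".b + marker + "\n".b
end

header = pdf_header
puts header.bytes.count { |byte| byte >= 128 }
```

Adding more high-bit bytes than the minimum is harmless to PDF readers, since the whole line is a comment.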
|
118
364
|
|
119
365
|
***
|
120
366
|
|
121
|
-
Change log v.0.1.15
|
367
|
+
#### Change log v.0.1.15
|
122
368
|
|
123
369
|
**features**: added new PDF#Page API to deal with page rotation and orientation. see the docs for more info.
|
124
370
|
|
125
371
|
***
|
126
372
|
|
127
|
-
Change log v.0.1.14
|
373
|
+
#### Change log v.0.1.14
|
128
374
|
|
129
375
|
**changes**: changed the way the PDF Page objects are 'injected' with their methods, so that the PDF#pages method is faster and more methods can be injected into the Hash object. For instance, textbox can now be called on an existing page without creating a PDFWriter object and 'stamping' the new data.
|
130
376
|
|
@@ -132,20 +378,20 @@ Change log v.0.1.14
|
|
132
378
|
|
133
379
|
***
|
134
380
|
|
135
|
-
Change log v.0.1.13
|
381
|
+
#### Change log v.0.1.13
|
136
382
|
|
137
383
|
**fix**: fix for Acrobat Reader compatibility (by removing color-space declarations). Should solve issue #13 , reported originally by Imanol and Diyei Gomi.
|
138
384
|
|
139
385
|
***
|
140
386
|
|
141
|
-
Change log v.0.1.12
|
387
|
+
#### Change log v.0.1.12
|
142
388
|
|
143
389
|
**fix**: fix for page rotation inheritance.
|
144
390
|
|
145
391
|
**fix**: fix for an issue that was discovered while observing issue #13, reported originally by Imanol and Diyei Gomi. The issue was probably caused by parsing errors introduced while parsing hex strings (a case sensitive method was used by mistake and this is now corrected).
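PDF hex strings such as `<4A6f>` may mix upper- and lower-case digits, so a decoder has to accept both. The sketch below shows one case-insensitive way to decode them; `decode_hex_string` is a hypothetical helper written for this note and is not the parser's actual code.

```ruby
# Hypothetical helper (not CombinePDF's parser): decode a PDF hex string
# token such as "<4A6f>" into raw bytes, accepting either letter case.
def decode_hex_string(token)
  # Strip the angle brackets and any whitespace between the digits.
  hex = token.delete('<>').gsub(/\s+/, '')
  # \h matches 0-9, a-f and A-F, so both cases pass the check.
  raise ArgumentError, 'invalid hex string' unless hex.match?(/\A\h*\z/)

  # Per the PDF specification, an odd digit count implies a trailing 0.
  hex += '0' if hex.length.odd?
  [hex].pack('H*')
end

puts decode_hex_string('<4a6F>') == decode_hex_string('<4A6f>')
```

`Array#pack` with the `H*` directive treats upper- and lower-case digits the same way, which is what makes the decoding case-insensitive.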
|
146
392
|
***
|
147
393
|
|
148
|
-
Change log v.0.1.11
|
394
|
+
#### Change log v.0.1.11
|
149
395
|
|
150
396
|
**fix**: fixed a bug where Page Resources and ColorSpace data wouldn't be inherited correctly from the Catalog and Pages parent objects. This issue could cause pages to render without all their content intact. This issue is now fixed (although more testing should be done for multiple inheritance).
|
151
397
|
|
@@ -153,31 +399,31 @@ Change log v.0.1.11
|
|
153
399
|
|
154
400
|
***
|
155
401
|
|
156
|
-
Change log v.0.1.10
|
402
|
+
#### Change log v.0.1.10
|
157
403
|
|
158
404
|
**fix**: fixed a typo that prevented access to the CombinePDF::VERSION constant.
|
159
405
|
|
160
406
|
***
|
161
407
|
|
162
|
-
Change log v.0.1.9
|
408
|
+
#### Change log v.0.1.9
|
163
409
|
|
164
410
|
**fix**: possible fix for bug reported by lamphuongha, regarding PDF 1.5 streams. I await confirmation that the fix actually works, as I cannot seem to reproduce the whole spectrum of the bug on my system...
|
165
411
|
|
166
412
|
***
|
167
413
|
|
168
|
-
Change log v.0.1.8
|
414
|
+
#### Change log v.0.1.8
|
169
415
|
|
170
416
|
**fix**: Fixed an [issue reported by Saba](https://github.com/boazsegev/combine_pdf/issues/8), where PDF files that were written using bad practices (namely, without wrapping their content streams correctly) would not be stamped correctly due to changes in the space matrix (CTM). Fixed by wrapping all existing streams before stamping.
|
171
417
|
|
172
418
|
***
|
173
419
|
|
174
|
-
Change log v.0.1.7
|
420
|
+
#### Change log v.0.1.7
|
175
421
|
|
176
422
|
**fix**: PDF `insert` had a typo in the code that would cause failure when unsupported object insertion was attempted - fixed by Nathan Keyes (nkeyes).
|
177
423
|
|
178
424
|
***
|
179
425
|
|
180
|
-
Change log v.0.1.6
|
426
|
+
#### Change log v.0.1.6
|
181
427
|
|
182
428
|
**fix**: added Mutex to font library (which was shared by all PDFWriter objects) - now fonts are thread safe (PDF objects are NOT thread safe by design).
|
183
429
|
|
@@ -187,4 +433,4 @@ Change log v.0.1.6
|
|
187
433
|
|
188
434
|
**known issues**: encrypted PDF files can sometimes silently fail (producing empty pages) - this happens on an attempted decrypt. More work should be done to support encrypted PDF files. Please feel free to help.
|
189
435
|
|
190
|
-
I use this version on production, where I have control over the PDF files I use. It is better than system calls to pdftk (which can cause all threads in ruby to hold, effectively causing my web app to hang).
|
436
|
+
I use this version on production, where I have control over the PDF files I use. It is better than system calls to pdftk (which can cause all threads in ruby to hold, effectively causing my web app to hang).
|
data/LICENSE.txt
CHANGED
data/README.md
CHANGED
@@ -1,9 +1,10 @@
|
|
1
1
|
# CombinePDF - the ruby way for merging PDF files
|
2
2
|
[![Gem Version](https://badge.fury.io/rb/combine_pdf.svg)](http://badge.fury.io/rb/combine_pdf)
|
3
|
+
[![GitHub](https://img.shields.io/badge/GitHub-Open%20Source-blue.svg)](https://github.com/boazsegev/combine_pdf)
|
3
4
|
|
4
5
|
CombinePDF is a nifty model, written in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, watermark them or stamp them (all using the PDF file format and pure Ruby code).
|
5
6
|
|
6
|
-
|
7
|
+
## Install
|
7
8
|
|
8
9
|
Install with ruby gems:
|
9
10
|
|
@@ -11,6 +12,34 @@ Install with ruby gems:
|
|
11
12
|
gem install combine_pdf
|
12
13
|
```
|
13
14
|
|
15
|
+
## Known Limitations
|
16
|
+
|
17
|
+
Quick rundown:
|
18
|
+
|
19
|
+
* When reading PDF Forms, some form data might be lost. I tried fixing this to the best of my ability, but I'm not sure it all works just yet.
|
20
|
+
|
21
|
+
* When combining PDF Forms, form data might be unified. I couldn't fix this because this is how PDF forms work (filling a field fills in the data in any field with the same name), but frankly, I kinda liked the issue... it's almost a feature.
|
22
|
+
|
23
|
+
* When unifying the same TOC data more than once, one of the references will be unified with the other (meaning that if the pages look the same, both references will link to the same page instead of linking to two different pages). You can fix this by adding content to the pages before merging the PDF files (i.e. add empty text boxes to all the pages).
|
24
|
+
|
25
|
+
* Some links and data (URL links and PDF "Named Destinations") are stored at the root of a PDF and they aren't linked back to from the page. Keeping this information requires merging the PDF objects rather than their pages.
|
26
|
+
|
27
|
+
Some links will be lost when ripping pages out of PDF files and merging them with another PDF.
|
28
|
+
|
29
|
+
* Some encrypted PDF files (usually the ones you can't view without a password) will fail quietly instead of noisily.
|
30
|
+
|
31
|
+
* Sometimes CombinePDF will raise an exception even if the PDF could be parsed (i.e., when PDF optional content exists)... I find it better to err on the side of caution, although for optional content PDFs an exception is avoidable using `CombinePDF.load(pdf_file, allow_optional_content: true)`.
|
32
|
+
|
33
|
+
CombinePDF is written natively in Ruby and should (presumably) work on all Ruby platforms that follow Ruby 2.0 compatibility.
|
34
|
+
|
35
|
+
However, PDF files are quite complex creatures and no guarantee is provided.
|
36
|
+
|
37
|
+
For example, PDF Forms are known to have issues and form data might be lost when attempting to combine PDFs with filled form data (also, forms are global objects, not page specific, so one should combine the whole of the PDF for any data to have any chance of being preserved).
|
38
|
+
|
39
|
+
The same applies to PDF links and the table of contents, which all have global attributes and could be corrupted or lost when combining PDF data.
|
40
|
+
|
41
|
+
If this library causes loss of data or burns down your house, I'm not to blame - as pointed to by the MIT license. That being said, I'm using the library happily after testing against different solutions.
|
42
|
+
|
14
43
|
## Combine/Merge PDF files or Pages
|
15
44
|
|
16
45
|
To combine PDF files (or data):
|
@@ -74,7 +103,7 @@ pdf.save "file_with_numbering.pdf"
|
|
74
103
|
|
75
104
|
Numbering can be done with many different options, with different formatting, with or without a box object, and even with opacity values - see documentation.
|
76
105
|
|
77
|
-
## Loading and
|
106
|
+
## Loading and Parsing PDF data
|
78
107
|
|
79
108
|
Loading PDF data can be done from file system or directly from the memory.
|
80
109
|
|
@@ -91,14 +120,28 @@ pdf_data = prawn_pdf_document.render # Import PDF data from Prawn
|
|
91
120
|
pdf = CombinePDF.parse(pdf_data)
|
92
121
|
```
|
93
122
|
|
94
|
-
|
123
|
+
Using `parse` is also effective when loading data from a remote location, circumventing the need for unnecessary temporary files. For example:
|
124
|
+
|
125
|
+
```ruby
|
126
|
+
require 'combine_pdf'
|
127
|
+
require 'net/http'
|
128
|
+
|
129
|
+
url = "https://example.com/my.pdf"
|
130
|
+
pdf = CombinePDF.parse Net::HTTP.get_response(URI.parse(url)).body
|
131
|
+
```
|
132
|
+
|
133
|
+
## Rendering PDF data
|
134
|
+
|
135
|
+
Similarly to loading and parsing, rendering can also be performed either to the memory or to a file.
|
136
|
+
|
137
|
+
You can output a string of PDF data using `.to_pdf`. For example, to let a user download the PDF from either a [Rails application](http://rubyonrails.org) or a [Plezi application](http://www.plezi.io):
|
95
138
|
|
96
139
|
```ruby
|
97
140
|
# in a controller action
|
98
141
|
send_data combined_file.to_pdf, filename: "combined.pdf", type: "application/pdf"
|
99
142
|
```
|
100
143
|
|
101
|
-
|
144
|
+
In [Sinatra](http://www.sinatrarb.com):
|
102
145
|
|
103
146
|
```ruby
|
104
147
|
# in your path's block
|
@@ -107,8 +150,19 @@ body combined_file.to_pdf
|
|
107
150
|
headers 'content-type' => "application/pdf"
|
108
151
|
```
|
109
152
|
|
153
|
+
|
110
154
|
If you prefer to save the PDF data to a file, you can always use the `save` method as we did in our earlier examples.
|
111
155
|
|
156
|
+
Some PDF files contain optional content sections which cannot always be merged reliably. By default, an exception is
|
157
|
+
raised if one of these files is detected. You can optionally pass an `allow_optional_content` parameter to the
|
158
|
+
`PDFParser.new`, `CombinePDF.load` and `CombinePDF.parse` methods:
|
159
|
+
|
160
|
+
```ruby
|
161
|
+
new_pdf = CombinePDF.new
|
162
|
+
new_pdf << CombinePDF.load(pdf_file, allow_optional_content: true)
|
163
|
+
attachments.each { |att| new_pdf << CombinePDF.load(att, allow_optional_content: true) }
|
164
|
+
```
|
165
|
+
|
112
166
|
Demo
|
113
167
|
====
|
114
168
|
|
@@ -135,6 +189,8 @@ The code itself should be very straight forward, but feel free to ask whatever y
|
|
135
189
|
Credit
|
136
190
|
======
|
137
191
|
|
192
|
+
Stefan Leitner (@sLe1tner) wrote the outline merging code supporting PDFs which contain a ToC.
|
193
|
+
|
138
194
|
Caige Nichols wrote an amazing RC4 gem which I used in my code.
|
139
195
|
|
140
196
|
I wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.
|
@@ -144,3 +200,12 @@ Credit to his wonderful is given here. Please respect his license and copyright.
|
|
144
200
|
License
|
145
201
|
=======
|
146
202
|
MIT
|
203
|
+
|
204
|
+
Contributions
|
205
|
+
=======
|
206
|
+
|
207
|
+
You can look at the [GitHub Issues Page](https://github.com/boazsegev/combine_pdf/issues) and see the ["help wanted"](https://github.com/boazsegev/combine_pdf/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) tags.
|
208
|
+
|
209
|
+
If you're thinking of donations or sending me money - no need. This project can sustain itself without your money.
|
210
|
+
|
211
|
+
What this project needs is the time given by caring developers who keep it up to date and fix any documentation errors or issues they notice ... having said that, gifts (such as free coffee or iTunes gift cards) are always fun. But I think there are those in real need that will benefit more from your generosity.
|