combine_pdf 0.2.5 → 0.2.37

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 61c4d75ddd1975e567b4b96359f852a4eb13fdf7
- data.tar.gz: 304f0b46cf41a96eddc46c728a20caf9ff77912c
+ metadata.gz: cac27b28f3653156374b1ea4a429676625ba0c9f
+ data.tar.gz: 8ce9f60a9bdcbd763a72461703c51845dbab0f2c
  SHA512:
- metadata.gz: 256814f346635778e23265a652ce800e3c951e21e647943f752b2d231ddc68c90cc6ffeb7cfacaadd34b42faff97233637752cbd68329ff064728d673c989fdf
- data.tar.gz: 29c90166699f5baf305398a1382e2f0db4ceb00e35391955cb5f6a30012a445fd48f3ce5e2d5154851ad1b9ae8d580f286de71ffe8b881e580b051faa1e1ec13
+ metadata.gz: 78aa47281a6f9fa5723a99ed9ce666479999348b69a19778e02eb2144e1c507d4a62ac23a07e4b46079193d14fecf77cbb9dac06591ac2354e92046ba0ba5d20
+ data.tar.gz: 2b92948efba5ab031a46865416b13b1aecb892a1763c0d45190598313e6e0139907ece00c129349060ff030c7b38531d1d848f9696168d5fec499a4c65121db8
data/.gitignore CHANGED
@@ -12,3 +12,5 @@
  *.o
  *.a
  mkmf.log
+
+ .DS_Store
data/CHANGELOG.md CHANGED
@@ -1,8 +1,254 @@
1
- #Change Log
1
+ # Change Log
2
2
 
3
3
  ***
4
4
 
5
- Change log v.0.2.5
5
+ #### Change log v.0.2.37 (Release Candidate)
6
+
7
+ **Fix**: Fixed `Page_Methods#textbox` default `:x`,`:y` to allow for non-zero/cropped page origin. Credit to @donnguyen for exposing the issue.
8
+
9
+ **Fix**: Fix typo on Parser error message for general parser error. Credit to @axlekb.
10
+
11
+ ***
12
+
13
+ #### Change log v.0.2.36
14
+
15
+ **Fix**: Fix for [issue #104](https://github.com/boazsegev/combine_pdf/issues/104). Credit to @tomascharad for exposing the issue.
16
+
17
+ **Release**: This gem had been using a development versioning scheme for far too long. The API is stable enough to switch to a production versioning scheme. This version is expected to be the last 0.x version. Assuming this version will be stable enough, it is expected to be re-released as v.1.0.
18
+
19
+
20
+ ***
21
+
22
+ #### Change log v.0.2.35
23
+
24
+ **Update**: Updated / upgraded our RC4 and AES PDF encryption support (for non-password protected PDFs). Credit to Gyuchang Jun (@gyuchang) for his work on providing CombinePDF with this extra encryption support. I have no idea what magic he used to make this happen, but it's beautiful!
25
+
26
+
27
+ ***
28
+
29
+ #### Change log v.0.2.34
30
+
31
+ **Fix**: [fixed issue #44 for wkhtmltopdf compatibility](https://github.com/boazsegev/combine_pdf/issues/44) and PDF v.1.2 use of named destinations. Credit to Devin Wadsworth (@daymun) for exposing the issue.
32
+
33
+ ***
34
+
35
+ #### Change log v.0.2.33
36
+
37
+ **Update**: Fix #97 to allow javascript support for interactive objects. Credit to @joshirashmics for exposing the issue.
38
+
39
+ **Update**: Extended "named tree" support now preserves some advanced PDF features that weren't supported before.
40
+
41
+ **Deprecation**: Ruby is deprecating `Fixnum`, and so is CombinePDF... replaced all `Fixnum` occurrences with `Integer`.
42
+
43
+ ***
44
+
45
+ #### Change log v.0.2.32
46
+
47
+ **Update**: Better errors when encryption related exceptions occur. Credit to Paul Shumeika ( @pshumeika ).
48
+
49
+ **Fix**: Fixed an issue where empty pages with NULL contents value would cause CombinePDF to raise an exception when rendering. Credit to @holtmaat and Jason DeLeon (@progmem) (both submitted different PRs regarding the issue).
50
+
51
+ ***
52
+
53
+ #### Change log v.0.2.31
54
+
55
+ **Broke**: Broke the fix for issue #65 so that Radio buttons data might be lost... working on a fix.
56
+
57
+ **Fix**: Fixed issue #82 (reintroduction of issue #19 due to core engine rewrite) related to a workaround for an issue with AcrobatReader. Credit to @gyuchang for testing and helping with the fix.
58
+
59
+ **Merge**: Merged pull request #80, fixing an issue with byte decoding. Credit to @gyuchang for the PR.
60
+
61
+ **Performance**: Improved performance for the reference and duplicate object resolution. Credit to @gyuchang for pointing out some optimization options.
62
+
63
+ ***
64
+
65
+ #### Change log v.0.2.30
66
+
67
+ **Fix**: Fixed an issue where HTTP artifacts before the beginning of a PDF file / string would prevent the PDF from being parsed. This should fix issue #78 reported by @robvitaro.
68
+
69
+ ***
70
+
71
+ #### Change log v.0.2.29
72
+
73
+ **Fix**: Fixed an issue where updating a page's rotation might raise a `NoMethodError` exception. Credit to Danny (@dikond) both for discovering the issue and for PR #77 that fixes this.
74
+
75
+ ***
76
+
77
+ #### Change log v.0.2.28
78
+
79
+ **Fix**: Fixed an issue related to page stamping, which was introduced when the Rubocop beautification changed the logic of an `if` statement in the Resource merger. Credit to Leon Miller-Out (@sbleon) for noticing the issue, testing and opening PR #76.
80
+
81
+ ***
82
+
83
+ #### Change log v.0.2.27
84
+
85
+ **Fix**: Fixed an issue where a `nil` outline count would cause PDF merger to fail.
86
+
87
+ **Fix**: Fixed an issue where `nil` data would cause the named destination rebuilding process to quit early, leaving some of the data unprocessed. Credit to Stefan Leitner (@sLe1tner) for exposing the issue.
88
+
89
+ **Feature**: PDF outlines are now merged and named destination links are preserved (both in the outlines and the page content). Credit to Stefan Leitner (@sLe1tner) for this feature.
90
+
91
+ ***
92
+
93
+ #### Change log v.0.2.26
94
+
95
+ **Fix**: Merged PR #72, fixing a typo in the parser that caused incorrect byte substitution to corrupt certain PDF data (adversely affecting encrypted PDFs). Credit to Gyuchang Jun (@gyuchang) for the fix.
96
+
97
+ ***
98
+
99
+ #### Change log v.0.2.25
100
+
101
+ **Fix**: Fixed issue #71, where merging PDF outlines that exist but have 0 entries would fail and raise an exception. Credit to @Kagetsuki for exposing the issue.
102
+
103
+ ***
104
+
105
+ #### Change log v.0.2.24
106
+
107
+ **Fix**: Fixed an issue with PDF Catalog and PDF Page property inheritance that could cause corrupted PDF output (invalid PDF data). Credit to @Kagetsuki for opening an issue that led to this discovery.
108
+
109
+ **Fix**: Fixed an issue with the parser where (ignored) empty strings would cause incorrect alignment when converting PDF dictionary objects from an Array to a Hash, mixing up keys and values. Credit to @Kagetsuki for opening an issue that led to this discovery.
110
+
111
+ **Fix**: more fixes and refinements to the PDF Names dictionary with better named destination support and document navigation support.
112
+
113
+ ***
114
+
115
+ #### Change log v.0.2.23
116
+
117
+ **Fix**: fixed an issue introduced in v.0.2.22, where name dictionary conflict resolution would result in corrupted PDF files. The issue was caused because the name conflict resolution wasn't updated to handle the changes in the new reference linking algorithm used by the parser. During this fix, the whole name dictionary algorithm was re-written, providing better support for named destinations, links and (future feature) ToCs. Credit to Kevin Shen (@kevshin2) for exposing the issue.
118
+
119
+ ***
120
+
121
+ #### Change log v.0.2.22 (yanked)
122
+
123
+ **Fix**: fixed an issue with PDF font importing (registering).
124
+
125
+ **Fix**: fixed issue #65 where some form data (radio buttons) could be lost. Credit to @joshirashmics for exposing the issue.
126
+
127
+ **Fix**: fixed an issue where empty names would be ignored by the parser (who knew they existed...).
128
+
129
+ **Fix**: Possible fix for issue #66 (similar to PR #61)... Credit to Serafeim Maroulis (@Reyko) and Kevin Shen (@kevshin2) for exposing the issue.
130
+
131
+ **Update**: Rewrote some internal algorithms, avoiding recursive logic and optimizing against excessive stack stress.
132
+
133
+ **Feature**: Credit to Joel Williams (@joelw) for providing `CombinePDF.load` and `CombinePDF.parse` customization, allowing optional content errors to be ignored - taking the risk of a corrupt PDF instead of raising an exception (hey, loading PDF data with optional content sometimes works).
134
+
135
+ ***
136
+
137
+ #### Change log v.0.2.21
138
+
139
+ **Fix**: fix for issue #54 and #59 (duplicate), discovered by @iggant (Anton Kolodii), related to name conflict resolution and page resources. The issue would cause an error (exception) to occur when attempting to merge pages with specific resource structures. Credit to @cw6365 (Chris Ward) and @DenKey (Den) as well.
140
+
141
+ ***
142
+
143
+ #### Change log v.0.2.20
144
+
145
+ **Fix**: fix for issue #56, discovered by @LeptonHeavy, regarding errors caused by the new PDF form support feature.
146
+
147
+ ***
148
+
149
+ #### Change log v.0.2.19 (yanked)
150
+
151
+ **Partial fix**: unconfirmed fix for issue #56, discovered by @LeptonHeavy, regarding errors caused by the new PDF form support feature.
152
+
153
+ ***
154
+
155
+ #### Change log v.0.2.18 (yanked)
156
+
157
+ **Feature**: added minor (read: initial and incomplete) PDF forms support, in an attempt to preserve form data when combining PDF files.
158
+
159
+ ***
160
+
161
+ #### Change log v.0.2.17
162
+
163
+ **Feature**: added the `page#crop` method to easily crop a PDF file in accordance with the GWG industry association recommendations (updating the `MediaBox` property rather than the `CropBox`). Credit to @wingleungchoi for this feature.
164
+
165
+ ***
166
+
167
+ #### Change log v.0.2.16
168
+
169
+ **Fix**: Fix for issue #49 where specific PDF files containing junk data after the %%EOF marker couldn't be opened (as they were invalid files). The issue was fixed by scanning any trailing data before continuing to parse any PDF file beyond the first %%EOF markers (multiple markers are common when using the PDF format). Credit to @wingleungchoi for providing an example for the issue.
170
+
171
+ ***
172
+
173
+ #### Change log v.0.2.15
174
+
175
+ **Fix**: Fix for issue #22 where specific PDF files with nested references could cause page stamping to fail, raising an exception. Credit to @tomascharad for finding the issue.
176
+
177
+ ***
178
+
179
+ #### Change log v.0.2.14
180
+
181
+ **Fix**: Fix for issue #39, where certain comments could have caused the object after the comments to be ignored, resulting in parsing errors. Credit to @lgn21st for identifying the issue.
182
+
183
+ ***
184
+
185
+ #### Change log v.0.2.13
186
+
187
+ **Fix**: fixed issue #37 reported by @sega (thank you for reporting!), regarding the inability to stamp one PDF page over another when one PDF page used a resource directory populated with data and another page used a resource directory populated with references. This is now resolved by checking for references before merging the data.
188
+
189
+ **Compatibility**: fixed an issue where PrimoPDF would omit the required EOL marker before the `endstream`. This would cause malformed PDF files to be written and it is now resolved by allowing the required EOL to be optional.
190
+
191
+ **Minor**: a minor improvement on the compatibility fix related to salvaging PDF data that was misplaced within a PDF comment. This improvement relates to the possibility that there might not be an EOL marker after the `obj` keyword (PaperPort does use an EOL after the `obj` keyword, so this isn't critical).
192
+
193
+ ***
194
+
195
+ #### Change log v.0.2.12
196
+
197
+ **Compatibility**: fixed issue #36 reported by @vitstradal (thank you for reporting!) regarding PDF files composed by PaperPort. PaperPort (at least version 12) has an issue where PDF data will be placed within a PDF comment. PDF comments start with a "%" sign and end with an EOL marker ("\r" or "\n"). PaperPort omitted the EOL marker, placing critical data within the comment. A work-around was found by parsing the comment's data and attempting to salvage the misplaced data. This workaround assumes that comments would not contain PDF parsable data at the very end of the comment's line... which is an unsafe assumption. Hence, **please let me know if you find _any_ PDF files that worked before the workaround was introduced**.
198
+
199
+ ***
200
+
201
+ #### Change log v.0.2.11
202
+
203
+ **Fix**: fix for issue #35 , which was caused by the broken fix for issue #34. Credit to Davek Rupinski for pointing out the issue.
204
+
205
+ ***
206
+
207
+ #### Change log v.0.2.10
208
+
209
+ **Fix**: fixed page stamping when the page's content was a referenced object instead of a direct array of content references. Credit to vitstradal for discovering the issue.
210
+
211
+ ***
212
+
213
+ #### Change log v.0.2.9
214
+
215
+ **Fix** hopefully fixed issue #33 ([NoMethodError undefined method `[]` for nil:NilClass](https://github.com/boazsegev/combine_pdf/issues/33)).
216
+
217
+ ***
218
+
219
+ #### Change log v.0.2.8
220
+
221
+ * **Fix/Feature**: (related to [issue #32](https://github.com/boazsegev/combine_pdf/issues/32))
222
+
223
+ Experience shows that it's very difficult to know when to use `page.copy` v.s. `page.copy(true)` before stamping one pdf pages on top (or under) another... So...
224
+
225
+ Now there is no longer any need for the guesswork. The process is automated for you.
226
+
227
+ The moment CombinePDF recognizes a resource name conflict between two pages (such as both pages using one font name to reference two different fonts), CombinePDF will intrusively rename the incoming page's resources.
228
+
229
+ It is true that the intrusive resource renaming is somewhat risky and might require the inflation of some compressed page data (resulting in bigger file sizes), but this is the only way to attempt to prevent PDF data corruption.
230
+
231
+ ***
232
+
233
+ #### Change log v.0.2.7
234
+
235
+ **Fix**: Fixed an issue where a malformed PDF String could cause the parser to hang.
236
+
237
+ **Update**: Inner PDF links (links to pages within the PDF file) will now be preserved when importing a whole PDF (although Outlines, for now, are discarded and their related links will be discarded as well). If the same destination page is inserted more than once, the first version will be preferred.
238
+
239
+ **Deprecation Warning**: the `Page_Methods#secure_injection`, `Page_Methods#make_unsecure` and `Page_Methods#make_secure` methods are deprecated. Use `Page_Methods#copy(true)` for safeguarding against font/resource conflicts when "stamping" one PDF page over another.
240
+
241
+ ***
242
+
243
+ #### Change log v.0.2.6
244
+
245
+ **fixed**: Hasan Iskandar fixed issue #30 - Output file cannot be saved from Adobe Reader with "Save As optimizes for Fast Web View" preference enabled. Thank you Hasan.
246
+
247
+ **update**: More parsing error detection; Updated the endstream EOL marker identifier for safer identification.
248
+
249
+ ***
250
+
251
+ #### Change log v.0.2.5
6
252
 
7
253
  **feature**: circumvents an issue with 'wkhtmltopdf', where sometimes the `endobj` keyword would be missing, causing malformed PDF data. The parser will now attempt to auto-fix any `endobj` missing keywords.
8
254
 
@@ -10,7 +256,7 @@ Change log v.0.2.5
10
256
 
11
257
  ***
12
258
 
13
- Change log v.0.2.4
259
+ #### Change log v.0.2.4
14
260
 
15
261
  **fixed**: Fixed the default page sizes which weren't as described in the documentation and now default to US Letter. The documentation was also fixed. No major version bump is declered since the defaults were faulty and weren't as described (fixed a bug, not changed the API).
16
262
 
@@ -18,7 +264,7 @@ Change log v.0.2.4
18
264
 
19
265
  ***
20
266
 
21
- Change log v.0.2.3
267
+ #### Change log v.0.2.3
22
268
 
23
269
  **update**: a better general error message for CombinePDF.new
24
270
 
@@ -30,7 +276,7 @@ Change log v.0.2.3
30
276
 
31
277
  ***
32
278
 
33
- Change log v.0.2.2
279
+ #### Change log v.0.2.2
34
280
 
35
281
  **fix**: fixed the default value for the :location attribute of PDF#stamp_pages(String, options). Now, instead of the default stamp being written at [:top, :bottom], it's default location will be set to [:center].
36
282
 
@@ -38,7 +284,7 @@ Change log v.0.2.2
38
284
 
39
285
  ***
40
286
 
41
- Change log v.0.2.1
287
+ #### Change log v.0.2.1
42
288
 
43
289
  **fix**: better page stamping... or, at least more secure (we hope).
44
290
 
@@ -50,7 +296,7 @@ Change log v.0.2.1
50
296
 
51
297
  ***
52
298
 
53
- Change log v.0.2.0
299
+ #### Change log v.0.2.0
54
300
 
55
301
  Refractoring of code and API overhall.
56
302
 
@@ -62,31 +308,31 @@ Any code relying on inner/advanced API calls might be broken.
62
308
 
63
309
  ***
64
310
 
65
- Change log v.0.1.23
311
+ #### Change log v.0.1.23
66
312
 
67
313
  **fix**: @kruszczynski fixed an issue with CombinePDF::PDF#number_pages where the page numbering margines were ignored and only the default values were used. Thank you @kruszczynski .
68
314
 
69
315
  ***
70
316
 
71
- Change log v.0.1.22
317
+ #### Change log v.0.1.22
72
318
 
73
319
  **fix**: a tested fix for issue #19, where Acrobat Reader would raise an error if page objects in the Catalog were copied by reference instead of copied in full and each was assigned different a unique object id. (possibly an Acrobat Reader Issue workaround) The issue was resolved by exempting page objects from the duplication reduction algorithm, and in this way, forcing duplicates to be copied rather then referenced in the Catalog object.
74
320
 
75
321
  ***
76
322
 
77
- Change log v.0.1.21
323
+ #### Change log v.0.1.21
78
324
 
79
325
  **fix**: an attempted fix for issue #19, where the xref table wasn't read on Acrobat Reader, probably due to a double EOL marker at the end of each entry.
80
326
 
81
327
  ***
82
328
 
83
- Change log v.0.1.20
329
+ #### Change log v.0.1.20
84
330
 
85
331
  **fix**: due to some PDF files not conforming to the required EOL marker in the endstream object specifications, the parser is now back to a non-strict parsing mode for PDF Stream Objects. Conforming files weren't found to be effected and although it is unlikely, it is possible that they might be effected if the stream object would contain the 'endstream' keyword without the required EOL marker and without intending to end the stream object.
86
332
 
87
333
  ***
88
334
 
89
- Change log v.0.1.19
335
+ #### Change log v.0.1.19
90
336
 
91
337
  **fix**: merged @espinosa's fix for issue #16 which affected windows machines.
92
338
 
@@ -96,13 +342,13 @@ Change log v.0.1.19
96
342
 
97
343
  ***
98
344
 
99
- Change log v.0.1.18
345
+ #### Change log v.0.1.18
100
346
 
101
- **fix**: Thank to Stefan, who reported issue #15 , we discovered that in some cases PDF files presented the wrong PDF standard version, causing an error while attempting to parse their data. The issue has been fixed by allowing the parser to search for PDF Object Streams even when the PDF file claims a PDF version below 1.5.
347
+ **fix**: Thank to Stefan, who reported issue #15 , we discovered that in some cases PDF files presented the wrong PDF standard version, causing an error while attempting to parse their data. The issue has been fixed by allowing the parser to search for PDF Object Streams even when the PDF file claims a PDF version below 1.5.
102
348
 
103
349
  ***
104
350
 
105
- Change log v.0.1.17
351
+ #### Change log v.0.1.17
106
352
 
107
353
  **feature**: Although it was possible to create and add empty PDF pages (at any location), it is now even easier with one method call to add empty pages at the end of a PDF object. It's also possible to add text to these empty pages or stamp them with different content.
108
354
 
@@ -112,19 +358,19 @@ Change log v.0.1.17
112
358
 
113
359
  ***
114
360
 
115
- Change log v.0.1.16
361
+ #### Change log v.0.1.16
116
362
 
117
363
  **fix?**: Compatibility reports came in showing that some email servers convert new-line (\n) characters to CRLF (\r\n) - corrupting the binary code in the PDF files. This version attempts to fix this by adding more binary characters to the first comment line of the PDF file (right after the header). Most email programs and Antivirus programs should preserve the original EOL character once they recognize the file as binary.
118
364
 
119
365
  ***
120
366
 
121
- Change log v.0.1.15
367
+ #### Change log v.0.1.15
122
368
 
123
369
  **features**: added new PDF#Page API to deal with page rotation and orientation. see the docs for more info.
124
370
 
125
371
  ***
126
372
 
127
- Change log v.0.1.14
373
+ #### Change log v.0.1.14
128
374
 
129
375
  **changes**: changed the way the PDF Page objects are 'injected' with their methods, so that the PDF#pages method is faster and more methods can be injected into the Hash object. For instance, textbox can now be called on an existing page without creating a PDFWriter object and 'stumping' the new data.
130
376
 
@@ -132,20 +378,20 @@ Change log v.0.1.14
132
378
 
133
379
  ***
134
380
 
135
- Change log v.0.1.13
381
+ #### Change log v.0.1.13
136
382
 
137
383
  **fix**: fix for Acrobat Reader compatablity (by removing color-space declarations). Should solve issue #13 , reported originaly by Imanol and Diyei Gomi.
138
384
 
139
385
  ***
140
386
 
141
- Change log v.0.1.12
387
+ #### Change log v.0.1.12
142
388
 
143
389
  **fix**: fix for page rotation inheritance.
144
390
 
145
391
  **fix**: fix for the issue was discovered while observing issue #13, reported originaly by Imanol and Diyei Gomi. The issue was probably caused by parsing errors introduced while parsing hex strings (a case sensitive method was used by mistake and this is now corrected).
146
392
  ***
147
393
 
148
- Change log v.0.1.11
394
+ #### Change log v.0.1.11
149
395
 
150
396
  **fix**: fixed a bug where Page Resources and ColorSpace data wouldn't be inherited correctly from the Catalog and Pages parent objects. This issue could cause pages to render without all their content intact. This issue is now fixed (although more testing should be done for multiple inheritance).
151
397
 
@@ -153,31 +399,31 @@ Change log v.0.1.11
153
399
 
154
400
  ***
155
401
 
156
- Change log v.0.1.10
402
+ #### Change log v.0.1.10
157
403
 
158
404
  **fix**: fixed a typo that prevented access to the CombinePDF::VERSION constant.
159
405
 
160
406
  ***
161
407
 
162
- Change log v.0.1.9
408
+ #### Change log v.0.1.9
163
409
 
164
410
  **fix**: possible fix for bug reported by lamphuongha, regarding PDF 1.5 streams. I await confirmation that the fix actually works, as I cannot seem to reproduce the whole spectrum of the bug on my system...
165
411
 
166
412
  ***
167
413
 
168
- Change log v.0.1.8
414
+ #### Change log v.0.1.8
169
415
 
170
416
  **fix**: Fixed an [issue reported by Saba](https://github.com/boazsegev/combine_pdf/issues/8), where PDF files that were written using bad practices (namely, without wrapping their content streams correctly) would not be stamped correctly due to changes in the space matrix (CTM). Fixed by wrapping all existing streams before stamping.
171
417
 
172
418
  ***
173
419
 
174
- Change log v.0.1.7
420
+ #### Change log v.0.1.7
175
421
 
176
422
  **fix**: PDF `insert` had a typo in the code that would cause failure when unsupported object insertion was attempted - fixed by Nathan Keyes (nkeyes).
177
423
 
178
424
  ***
179
425
 
180
- Change log v.0.1.6
426
+ #### Change log v.0.1.6
181
427
 
182
428
  **fix**: added Mutex to font library (which was shared by all PDFWriter objects) - now fonts are thread safe (PDF objects are NOT thread safe by design).
183
429
 
@@ -187,4 +433,4 @@ Change log v.0.1.6
187
433
 
188
434
  **known issues**: encrypted PDF files can sometimes silently fail (producing empty pages) - this is because of an attempted decrypt. More work should be done to support encrypted PDF files. Please feel free to help.
189
435
 
190
- I use this version on production, where I have control over the PDF files I use. It is beter then system calls to pdftk (which can cause all threads in ruby to hold, effectively causing my web app to hang).
436
+ I use this version on production, where I have control over the PDF files I use. It is beter then system calls to pdftk (which can cause all threads in ruby to hold, effectively causing my web app to hang).
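Note on the v.0.1.16 entry above: the idea of marking a PDF as binary can be sketched in a few lines of Ruby. This is only an illustration of the technique the entry describes, not CombinePDF's actual writer code; the helper name `pdf_header` and the choice of four `0xFF` bytes are assumptions made for the example.

```ruby
# Hypothetical helper (not part of CombinePDF): builds a PDF header whose
# second line is a comment made of bytes above 127. Tools that sniff the
# start of a file tend to treat such data as binary and leave EOLs alone.
def pdf_header(version = '1.5')
  marker = [0xFF, 0xFF, 0xFF, 0xFF].pack('C*') # four high bytes, binary encoding
  header = "%PDF-#{version}\n%".b              # header line plus comment opener
  header << marker << "\n".b                   # close the comment line with an EOL
  header
end

header = pdf_header
second_line = header.lines[1]
puts second_line.bytes[1, 4].all? { |byte| byte > 127 }
```

The comment line is harmless to PDF readers because everything between `%` and the EOL marker is ignored when parsing.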
data/LICENSE.txt CHANGED
@@ -1,4 +1,5 @@
- Copyright (c) 2014 Myst
+ Copyright (c) 2014,2015,2016 Boaz Segev (Myst - @boazsegev)
+ Copyright (c) 2016 Stefan Leitner (@sLe1tner) (credit for outline merging code)
 
  MIT License
 
data/README.md CHANGED
@@ -1,9 +1,10 @@
1
1
  # CombinePDF - the ruby way for merging PDF files
2
2
  [![Gem Version](https://badge.fury.io/rb/combine_pdf.svg)](http://badge.fury.io/rb/combine_pdf)
3
+ [![GitHub](https://img.shields.io/badge/GitHub-Open%20Source-blue.svg)](https://github.com/boazsegev/combine_pdf)
3
4
 
4
5
  CombinePDF is a nifty model, written in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, watermark them or stamp them (all using the PDF file format and pure Ruby code).
5
6
 
6
- # Install
7
+ ## Install
7
8
 
8
9
  Install with ruby gems:
9
10
 
@@ -11,6 +12,34 @@ Install with ruby gems:
11
12
  gem install combine_pdf
12
13
  ```
13
14
 
15
+ ## Known Limitations
16
+
17
+ Quick rundown:
18
+
19
+ * When reading PDF Forms, some form data might be lost. I tried fixing this to the best of my ability, but I'm not sure it all works just yet.
20
+
21
+ * When combining PDF Forms, form data might be unified. I couldn't fix this because this is how PDF forms work (filling a field fills in the data in any field with the same name), but frankly, I kinda liked the issue... it's almost a feature.
22
+
23
+ * When unifying the same TOC data more than once, one of the references will be unified with the other (meaning that if the pages look the same, both references will link to the same page instead of linking to two different pages). You can fix this by adding content to the pages before merging the PDF files (i.e. add empty text boxes to all the pages).
24
+
25
+ * Some links and data (URL links and PDF "Named Destinations") are stored at the root of a PDF and they aren't linked back to from the page. Keeping this information requires merging the PDF objects rather than their pages.
26
+
27
+ Some links will be lost when ripping pages out of PDF files and merging them with another PDF.
28
+
29
+ * Some encrypted PDF files (usually the ones you can't view without a password) will fail quietly instead of noisily.
30
+
31
+ * Sometimes CombinePDF will raise an exception even if the PDF could be parsed (i.e., when PDF optional content exists)... I find it better to err on the side of caution, although for optional content PDFs an exception is avoidable using `CombinePDF.load(pdf_file, allow_optional_content: true)`.
32
+
33
+ CombinePDF is written natively in Ruby and should (presumably) work on all Ruby platforms that follow Ruby 2.0 compatibility.
34
+
35
+ However, PDF files are quite complex creatures and no guaranty is provided.
36
+
37
+ For example, PDF Forms are known to have issues and form data might be lost when attempting to combine PDFs with filled form data (also, forms are global objects, not page specific, so one should combine the whole of the PDF for any data to have any chance of being preserved).
38
+
39
+ The same applies to PDF links and the table of contents, which all have global attributes and could be corrupted or lost when combining PDF data.
40
+
41
+ If this library causes loss of data or burns down your house, I'm not to blame - as pointed to by the MIT license. That being said, I'm using the library happily after testing against different solutions.
42
+
14
43
  ## Combine/Merge PDF files or Pages
15
44
 
16
45
  To combine PDF files (or data):
@@ -74,7 +103,7 @@ pdf.save "file_with_numbering.pdf"
74
103
 
75
104
  Numbering can be done with many different options, with different formating, with or without a box object, and even with opacity values - see documentation.
76
105
 
77
- ## Loading and Rendering PDF data
106
+ ## Loading and Parsing PDF data
78
107
 
79
108
  Loading PDF data can be done from file system or directly from the memory.
80
109
 
@@ -91,14 +120,28 @@ pdf_data = prawn_pdf_document.render # Import PDF data from Prawn
91
120
  pdf = CombinePDF.parse(pdf_data)
92
121
  ```
93
122
 
94
- Similarly, you can output a string of PDF data using `.to_pdf`. For example, to let a user download the PDF from a [Rails](http://rubyonrails.org) or [Plezi](https://github.com/boazsegev/plezi) app:
123
+ Using `parse` is also effective when loading data from a remote location, circumventing the need for unnecessary temporary files. For example:
124
+
125
+ ```ruby
126
+ require 'combine_pdf'
127
+ require 'net/http'
128
+
129
+ url = "https://example.com/my.pdf"
130
+ pdf = CombinePDF.parse Net::HTTP.get_response(URI.parse(url)).body
131
+ ```
132
+
133
+ ## Rendering PDF data
134
+
135
+ Similarly to loading and parsing, rendering can also be performed either to the memory or to a file.
136
+
137
+ You can output a string of PDF data using `.to_pdf`. For example, to let a user download the PDF from either a [Rails application](http://rubyonrails.org) or a [Plezi application](http://www.plezi.io):
95
138
 
96
139
  ```ruby
97
140
  # in a controller action
98
141
  send_data combined_file.to_pdf, filename: "combined.pdf", type: "application/pdf"
99
142
  ```
100
143
 
101
- Or in [Sinatra](http://www.sinatrarb.com):
144
+ In [Sinatra](http://www.sinatrarb.com):
102
145
 
103
146
  ```ruby
104
147
  # in your path's block
@@ -107,8 +150,19 @@ body combined_file.to_pdf
107
150
  headers 'content-type' => "application/pdf"
108
151
  ```
109
152
 
153
+
110
154
  If you prefer to save the PDF data to a file, you can always use the `save` method as we did in our earlier examples.
111
155
 
156
+ Some PDF files contain optional content sections which cannot always be merged reliably. By default, an exception is
157
+ raised if one of these files is detected. You can optionally pass an `allow_optional_content` parameter to the
158
+ `PDFParser.new`, `CombinePDF.load` and `CombinePDF.parse` methods:
159
+
160
+ ```ruby
161
+ new_pdf = CombinePDF.new
162
+ new_pdf << CombinePDF.load(pdf_file, allow_optional_content: true)
163
+ attachments.each { |att| new_pdf << CombinePDF.load(att, allow_optional_content: true) }
164
+ ```
165
+
112
166
  Demo
113
167
  ====
114
168
 
@@ -135,6 +189,8 @@ The code itself should be very straight forward, but feel free to ask whatever y
135
189
  Credit
136
190
  ======
137
191
 
192
+ Stefan Leitner (@sLe1tner) wrote the outline merging code supporting PDFs which contain a ToC.
193
+
138
194
  Caige Nichols wrote an amazing RC4 gem which I used in my code.
139
195
 
140
196
  I wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.
@@ -144,3 +200,12 @@ Credit to his wonderful is given here. Please respect his license and copyright.
144
200
  License
145
201
  =======
146
202
  MIT
203
+
204
+ Contributions
205
+ =======
206
+
207
+ You can look at the [GitHub Issues Page](https://github.com/boazsegev/combine_pdf/issues) and see the ["help wanted"](https://github.com/boazsegev/combine_pdf/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) tags.
208
+
209
+ If you're thinking of donations or sending me money - no need. This project can sustain itself without your money.
210
+
211
+ What this project needs is the time given by caring developers who keep it up to date and fix any documentation errors or issues they notice ... having said that, gifts (such as free coffee or iTunes gift cards) are always fun. But I think there are those in real need that will benefit more from your generosity.
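Note on remote loading: the README addition above parses a response body fetched with `Net::HTTP`, and the v.0.2.30 changelog entry mentions HTTP artifacts that may precede the PDF data. A minimal sketch of trimming such leading bytes by hand follows, assuming the data contains a `%PDF-` header somewhere. The helper `trim_to_pdf_header` is hypothetical and is not part of the CombinePDF API; according to the changelog, the parser copes with this case itself from v.0.2.30 onward.

```ruby
# Hypothetical helper (not part of CombinePDF): drops anything that precedes
# the "%PDF-" header, e.g. stray HTTP headers captured with the body.
def trim_to_pdf_header(data)
  bytes = data.b                        # work on a binary copy of the input
  start = bytes.index('%PDF-'.b)        # byte offset of the PDF header, if any
  raise ArgumentError, 'no %PDF- header found' if start.nil?

  bytes[start..-1]
end

raw = "HTTP/1.1 200 OK\r\nContent-Type: application/pdf\r\n\r\n%PDF-1.4\n%%EOF\n"
clean = trim_to_pdf_header(raw)
puts clean.start_with?('%PDF-')
```

The trimmed string could then be handed to `CombinePDF.parse` in the same way as the `Net::HTTP` body in the README example.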