combine_pdf 0.2.4 → 0.2.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: bfa5b9792e629049a4ebf6a379b1e834f6f76eb7
4
- data.tar.gz: 0ea35cdbccc1fa1480370cccfa8e11351f6973d5
3
+ metadata.gz: 61c4d75ddd1975e567b4b96359f852a4eb13fdf7
4
+ data.tar.gz: 304f0b46cf41a96eddc46c728a20caf9ff77912c
5
5
  SHA512:
6
- metadata.gz: a5baf7554a0757a4666e22464ef0d6f515153a8ea7c6b1058edb92090f4f683808996b259539b5e2553ced8e74c1c6a78ef4bf947442ee131bb5be31c43d34f0
7
- data.tar.gz: 171cf6606fa1a4d91b9652f4d166bfa198ed6acd049631f0f50137362cbcf10a7c5d15b6191434a9bb9bc52b7eab9a00cafe07fb2edeb656bfba2f06eb71eb92
6
+ metadata.gz: 256814f346635778e23265a652ce800e3c951e21e647943f752b2d231ddc68c90cc6ffeb7cfacaadd34b42faff97233637752cbd68329ff064728d673c989fdf
7
+ data.tar.gz: 29c90166699f5baf305398a1382e2f0db4ceb00e35391955cb5f6a30012a445fd48f3ce5e2d5154851ad1b9ae8d580f286de71ffe8b881e580b051faa1e1ec13
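The digests recorded above can be checked against a downloaded archive. The following is a minimal sketch, assuming the archive bytes are already in memory; `digests_match?` is a hypothetical helper written for illustration and is not part of RubyGems.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical helper: compares raw bytes against the hex digests listed
# under the SHA1 / SHA512 keys of a checksums.yaml entry.
def digests_match?(bytes, expected)
  algorithms = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }
  expected.all? do |name, hex|
    algorithm = algorithms.fetch(name) # raises KeyError for an unknown name
    algorithm.hexdigest(bytes) == hex.downcase
  end
end

# Possible usage (the file path is an assumption):
#   digests_match?(File.binread('data.tar.gz'),
#                  'SHA1' => '304f0b46cf41a96eddc46c728a20caf9ff77912c')
```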
@@ -2,6 +2,14 @@
2
2
 
3
3
  ***
4
4
 
5
+ Change log v.0.2.5
6
+
7
+ **feature**: works around an issue with 'wkhtmltopdf', which sometimes omits the `endobj` keyword and so produces malformed PDF data. The parser now attempts to auto-fix any missing `endobj` keywords.
8
+
9
+ **semi-fix**: make sure decryption is attempted using actual values (vs. references). The code was updated to produce the result that should have been achieved before.
10
+
11
+ ***
12
+
5
13
  Change log v.0.2.4
6
14
 
7
15
  **fixed**: Fixed the default page sizes, which weren't as described in the documentation and now default to US Letter. The documentation was also fixed. No major version bump is declared since the defaults were faulty and weren't as described (fixed a bug, not changed the API).
data/README.md CHANGED
@@ -1,4 +1,6 @@
1
1
  # CombinePDF - the ruby way for merging PDF files
2
+ [![Gem Version](https://badge.fury.io/rb/combine_pdf.svg)](http://badge.fury.io/rb/combine_pdf)
3
+
2
4
  CombinePDF is a nifty model, written in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, watermark them or stamp them (all using the PDF file format and pure Ruby code).
3
5
 
4
6
  # Install
@@ -72,7 +74,7 @@ pdf.save "file_with_numbering.pdf"
72
74
 
73
75
  Numbering can be done with many different options, with different formatting, with or without a box object, and even with opacity values - see documentation.
74
76
 
75
- ## Loading PDF data
77
+ ## Loading and Rendering PDF data
76
78
 
77
79
  Loading PDF data can be done from the file system or directly from memory.
78
80
 
@@ -82,19 +84,35 @@ Loading data from a file is easy:
82
84
  pdf = CombinePDF.load("file.pdf")
83
85
  ```
84
86
 
85
- you can also parse PDF files from memory:
87
+ You can also parse PDF files from memory. Loading from memory is especially effective for importing PDF data received over the internet or from a different authoring library such as Prawn:
86
88
 
87
89
  ```ruby
88
- pdf_data = IO.read 'file.pdf' # for this demo, load a file to memory
90
+ pdf_data = prawn_pdf_document.render # Import PDF data from Prawn
89
91
  pdf = CombinePDF.parse(pdf_data)
90
92
  ```
91
93
 
92
- Loading from the memory is especially effective for importing PDF data received through the internet or from a different authoring library such as Prawn.
94
+ Similarly, you can output a string of PDF data using `.to_pdf`. For example, to let a user download the PDF from a [Rails](http://rubyonrails.org) or [Plezi](https://github.com/boazsegev/plezi) app:
95
+
96
+ ```ruby
97
+ # in a controller action
98
+ send_data combined_file.to_pdf, filename: "combined.pdf", type: "application/pdf"
99
+ ```
100
+
101
+ Or in [Sinatra](http://www.sinatrarb.com):
102
+
103
+ ```ruby
104
+ # in your path's block
105
+ status 200
106
+ body combined_file.to_pdf
107
+ headers 'content-type' => "application/pdf"
108
+ ```
109
+
110
+ If you prefer to save the PDF data to a file, you can always use the `save` method as we did in our earlier examples.
93
111
 
94
112
  Demo
95
113
  ====
96
114
 
97
- You can see a Demo for a ["Bates stamping web-app"](http://combine-pdf-demo.herokuapp.com/bates) and read through its [code](http://combine-pdf-demo.herokuapp.com/code). Good luck :)
115
+ You can see a Demo for a ["Bates stamping web-app"](http://combine-pdf-demo.herokuapp.com/bates) and read through its [code](https://github.com/boazsegev/combine_pdf_demo/blob/c9914588e4116dcfdaa37f85727f442b064e2b04/pdf_controller.rb). Good luck :)
98
116
 
99
117
  Decryption & Filters
100
118
  ====================
@@ -17,6 +17,7 @@ module CombinePDF
17
17
 
18
18
  # This is an internal class. You don't need it.
19
19
  class PDFDecrypt
20
+ include CombinePDF::Renderer
20
21
 
21
22
  # @!visibility private
22
23
 
@@ -25,9 +26,9 @@ module CombinePDF
25
26
  # root_dictionary:: the root PDF dictionary, containing the Encrypt dictionary.
26
27
  def initialize objects=[], root_dictionary = {}
27
28
  @objects = objects
28
- @encryption_dictionary = root_dictionary[:Encrypt]
29
+ @encryption_dictionary = actual_object(root_dictionary[:Encrypt])
29
30
  raise "Cannot decrypt an encrypted file without an encryption dictionary!" unless @encryption_dictionary
30
- @root_dictionary = root_dictionary
31
+ @root_dictionary = actual_object(root_dictionary)
31
32
  @padding_key = [ 0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
32
33
  0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
33
34
  0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
@@ -41,7 +42,7 @@ module CombinePDF
41
42
  def decrypt
42
43
  raise_encrypted_error @encryption_dictionary unless @encryption_dictionary[:Filter] == :Standard
43
44
  @key = set_general_key
44
- case @encryption_dictionary[:V]
45
+ case actual_object(@encryption_dictionary[:V])
45
46
  when 1,2
46
47
  # raise_encrypted_error
47
48
  _perform_decrypt_proc_ @objects, self.method(:decrypt_RC4)
@@ -49,10 +50,10 @@ module CombinePDF
49
50
  # raise unsupported error for now
50
51
  raise_encrypted_error
51
52
  # make sure CF is a Hash (as required by the PDF standard for this type of encryption).
52
- raise_encrypted_error unless @encryption_dictionary[:CF].is_a?(Hash)
53
+ raise_encrypted_error unless actual_object(@encryption_dictionary[:CF]).is_a?(Hash)
53
54
 
54
55
  # do nothing if there is no data to decrypt except embedded files...?
55
- return true unless (@encryption_dictionary[:CF].values.select { |v| !v[:AuthEvent] || v[:AuthEvent] == :DocOpen } ).empty?
56
+ return true unless (actual_object(@encryption_dictionary[:CF]).values.select { |v| !v[:AuthEvent] || v[:AuthEvent] == :DocOpen } ).empty?
56
57
 
57
58
  # attempt to decrypt all strings?
58
59
  # attempt to decrypt all streams
@@ -63,6 +64,8 @@ module CombinePDF
63
64
  end
64
65
  #rebuild stream lengths?
65
66
  @objects
67
+ rescue => e
68
+ raise_encrypted_error
66
69
  end
67
70
 
68
71
  protected
@@ -71,17 +74,17 @@ module CombinePDF
71
74
  # 1) make sure the initial key is 32 byte long (if no password, uses padding).
72
75
  key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').force_encoding(Encoding::ASCII_8BIT)
73
76
  # 2) add the value of the encryption dictionary’s O entry
74
- key << @encryption_dictionary[:O].to_s
77
+ key << actual_object(@encryption_dictionary[:O]).to_s
75
78
  # 3) Convert the integer value of the P entry to a 32-bit unsigned binary number
76
79
  # and pass these bytes low-order byte first
77
- key << [@encryption_dictionary[:P]].pack('i')
80
+ key << [actual_object(@encryption_dictionary[:P])].pack('i')
78
81
  # 4) Pass the first element of the file’s file identifier array
79
82
  # (the value of the ID entry in the document’s trailer dictionary
80
- key << @root_dictionary[:ID][0]
83
+ key << actual_object(@root_dictionary[:ID])[0]
81
84
  # # 4(a) (Security handlers of revision 4 or greater)
82
85
  # # if document metadata is not being encrypted, add 4 bytes with the value 0xFFFFFFFF.
83
- if @encryption_dictionary[:R] >= 4
84
- unless @encryption_dictionary[:EncryptMetadata] == false #default is true and nil != false
86
+ if actual_object(@encryption_dictionary[:R]) >= 4
87
+ unless actual_object(@encryption_dictionary)[:EncryptMetadata] == false #default is true and nil != false
85
88
  key << "\x00\x00\x00\x00"
86
89
  else
87
90
  key << "\xFF\xFF\xFF\xFF"
@@ -94,17 +97,17 @@ module CombinePDF
94
97
  # pass the first n bytes of the output as input into a new MD5 hash,
95
98
  # where n is the number of bytes of the encryption key as defined by the value of
96
99
  # the encryption dictionary’s Length entry.
97
- if @encryption_dictionary[:R] >= 3
100
+ if actual_object(@encryption_dictionary[:R]) >= 3
98
101
  50.times do|i|
99
- key = Digest::MD5.digest(key[0...@encryption_dictionary[:Length]])
102
+ key = Digest::MD5.digest(key[0...actual_object(@encryption_dictionary[:Length])])
100
103
  end
101
104
  end
102
105
  # 6) Set the encryption key to the first n bytes of the output from the final MD5 hash,
103
106
  # where n shall always be 5 for security handlers of revision 2 but,
104
107
  # for security handlers of revision 3 or greater,
105
108
  # shall depend on the value of the encryption dictionary’s Length entry.
106
- if @encryption_dictionary[:R] >= 3
107
- @key = key[0..(@encryption_dictionary[:Length]/8)]
109
+ if actual_object(@encryption_dictionary[:R]) >= 3
110
+ @key = key[0..(actual_object(@encryption_dictionary[:Length])/8)]
108
111
  else
109
112
  @key = key[0..4]
110
113
  end
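The numbered comments in the hunk above describe the key computation of the PDF standard security handler. Below is a self-contained sketch of those steps, written from the comments rather than copied from the gem. `derive_key` and its keyword arguments are hypothetical names, the sketch assumes the `Length` entry is expressed in bits, and the 32-byte padding constant is the standard one whose first bytes appear in `@padding_key`.

```ruby
require 'digest/md5'

# Standard 32-byte password padding string (binary).
PADDING = [0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
           0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
           0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
           0x2F, 0x0C, 0xA9, 0xFE, 0x64, 0x53, 0x69, 0x7A].pack('C*').freeze

# Hypothetical helper; all parameters are plain values (already resolved,
# not indirect references).
def derive_key(password:, owner_entry:, permissions:, file_id:,
               revision:, length_bits: 40, encrypt_metadata: true)
  # 1) pad or truncate the password to exactly 32 bytes
  data = (password.b + PADDING)[0, 32]
  # 2) append the O entry
  data << owner_entry.b
  # 3) append P as a 32-bit integer, low-order byte first
  data << [permissions].pack('l<')
  # 4) append the first element of the file identifier array
  data << file_id.b
  # 4a) revision 4+: if metadata is NOT encrypted, append 0xFFFFFFFF
  data << "\xFF\xFF\xFF\xFF".b if revision >= 4 && !encrypt_metadata

  # key size in bytes: always 5 for revision 2, Length / 8 otherwise
  key_bytes = revision >= 3 ? length_bits / 8 : 5
  digest = Digest::MD5.digest(data)
  # 5) revision 3+: re-hash the first n bytes of the output 50 times
  50.times { digest = Digest::MD5.digest(digest[0, key_bytes]) } if revision >= 3
  # 6) the key is the first n bytes of the final hash
  digest[0, key_bytes]
end
```

The sketch slices with `[0, n]` so that exactly `n` bytes are taken at each step; whether that matches the gem's own slicing is not something this diff settles.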
@@ -150,14 +153,11 @@ module CombinePDF
150
153
  when object.is_a?(Array)
151
154
  object.map! { |item| _perform_decrypt_proc_(item, decrypt_proc, encrypted_id, encrypted_generation, encrypted_filter) }
152
155
  when object.is_a?(Hash)
153
- encrypted_id ||= object[:indirect_reference_id]
154
- encrypted_generation ||= object[:indirect_generation_number]
155
- encrypted_filter ||= object[:Filter]
156
+ encrypted_id ||= actual_object(object[:indirect_reference_id])
157
+ encrypted_generation ||= actual_object(object[:indirect_generation_number])
158
+ encrypted_filter ||= actual_object(object[:Filter])
156
159
  if object[:raw_stream_content]
157
- stream_length = object[:Length]
158
- if stream_length.is_a?(Hash) && stream_length[:is_reference_only]
159
- stream_length = get_refernced_object(stream_length)[:indirect_without_dictionary]
160
- end
160
+ stream_length = actual_object(object[:Length])
161
161
  actual_length = object[:raw_stream_content].length
162
162
  warn "Stream registered length was #{object[:Length].to_s} and the actual length was #{actual_length}." if actual_length < stream_length
163
163
  length = [ stream_length, actual_length].min
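The removed lines in this hunk show what resolving a reference involves: a value may be a Hash flagged with `:is_reference_only`, and a bare value such as a stream length is stored under `:indirect_without_dictionary`. The sketch below illustrates that idea only. `resolve` and the `[id, generation]` lookup table are assumptions made for the example and are not the gem's `actual_object` implementation.

```ruby
# Hypothetical resolver: returns the value itself unless it is a reference,
# in which case the referenced object is looked up and unwrapped.
def resolve(value, objects_by_id)
  return value unless value.is_a?(Hash) && value[:is_reference_only]

  key = [value[:indirect_reference_id], value[:indirect_generation_number]]
  target = objects_by_id[key]
  return nil if target.nil?

  # Bare values (numbers, strings) are wrapped in a Hash; unwrap them.
  target.key?(:indirect_without_dictionary) ? target[:indirect_without_dictionary] : target
end

objects = {
  [7, 0] => { indirect_reference_id: 7, indirect_generation_number: 0,
              indirect_without_dictionary: 1024 }
}
length_ref = { is_reference_only: true, indirect_reference_id: 7,
               indirect_generation_number: 0 }

resolve(length_ref, objects) # => 1024
resolve(512, objects)        # => 512 (plain values pass through)
```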
@@ -331,10 +331,26 @@ module CombinePDF
331
331
  @scanner.skip_until(/\%\%EOF/)
332
332
  end
333
333
 
334
- when @scanner.scan(/[\s]+/) , @scanner.scan(/obj[\s]*/)
335
- # do nothing
336
- # warn "White Space, do nothing"
334
+ when @scanner.scan(/[\s]+/)
335
+ # Generally, do nothing
337
336
  nil
337
+ when @scanner.scan(/obj[\s]*/)
338
+ # Fix wkhtmltopdf PDF authoring issue - missing 'endobj' keywords
339
+ unless out[-4].nil? || out[-4].is_a?(Hash)
340
+ keep = []
341
+ keep << out.pop
342
+ keep << out.pop
343
+
344
+ if out.last.is_a? Hash
345
+ out << out.pop.merge({indirect_generation_number: out.pop, indirect_reference_id: out.pop})
346
+ else
347
+ out << {indirect_without_dictionary: out.pop, indirect_generation_number: out.pop, indirect_reference_id: out.pop}
348
+ end
349
+ warn "'endobj' keyword was missing for Object ID: #{out.last[:indirect_reference_id]}, trying to auto-fix issue, but might fail."
350
+
351
+ out << keep.pop
352
+ out << keep.pop
353
+ end
338
354
  else
339
355
  # always advance
340
356
  # warn "Advancing for unknown reason..."
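The `endobj` repair in this hunk can be restated on a plain token stack. When `obj` is scanned, the top two tokens are the new object's id and generation; if the slot four positions back is neither `nil` nor a finished object (a Hash), the previous object was never closed. The sketch below is a simplified restatement under that reading. `close_dangling_object!` is a hypothetical name, and the real parser also emits a warning.

```ruby
# Simplified restatement of the repair, operating on an Array of tokens.
def close_dangling_object!(out)
  # Healthy stack: nothing, or a finished object, four slots back.
  return out if out[-4].nil? || out[-4].is_a?(Hash)

  generation = out.pop      # header of the object that is just starting
  id = out.pop
  body = out.pop            # content of the object that never got 'endobj'
  prev_generation = out.pop # header of that unclosed object
  prev_id = out.pop

  ids = { indirect_generation_number: prev_generation,
          indirect_reference_id: prev_id }
  closed = body.is_a?(Hash) ? body.merge(ids) : { indirect_without_dictionary: body }.merge(ids)

  # Put the repaired object back, followed by the new object's header.
  out << closed << id << generation
end

close_dangling_object!([4, 0, { Type: :Page }, 5, 0])
# => [{ Type: :Page, indirect_generation_number: 0, indirect_reference_id: 4 }, 5, 0]
```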
@@ -454,6 +470,11 @@ module CombinePDF
454
470
  obj.delete(:indirect_reference_id); obj.delete(:indirect_generation_number)
455
471
  end
456
472
  self
473
+ # rescue => e
474
+ # puts (@parsed.select {|o| !o.is_a?(Hash)})
475
+ # puts (@parsed)
476
+ # puts (@references)
477
+ # raise e
457
478
  end
458
479
 
459
480
  # @private
@@ -1,4 +1,4 @@
1
1
  module CombinePDF
2
- VERSION = "0.2.4"
2
+ VERSION = "0.2.5"
3
3
  end
4
4
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: combine_pdf
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.4
4
+ version: 0.2.5
5
5
  platform: ruby
6
6
  authors:
7
7
  - Boaz Segev
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-08-09 00:00:00.000000000 Z
11
+ date: 2015-09-08 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: ruby-rc4
@@ -102,9 +102,10 @@ required_rubygems_version: !ruby/object:Gem::Requirement
102
102
  version: '0'
103
103
  requirements: []
104
104
  rubyforge_project:
105
- rubygems_version: 2.4.7
105
+ rubygems_version: 2.4.5
106
106
  signing_key:
107
107
  specification_version: 4
108
108
  summary: Combine, stamp and watermark PDF files in pure Ruby.
109
109
  test_files:
110
110
  - test/console
111
+ has_rdoc: