combine_pdf 0.0.2

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 797001336a5b4f1598ae399bd69161f9f2b4b4ef
4
+ data.tar.gz: a684409037ef5205aff23512a8fe9a04ec53d828
5
+ SHA512:
6
+ metadata.gz: f74a67926f556606587211b4885e736cd9a8690c85aa62b56eb9deda4497a928ebb55a37ae4156f4ff748903c23b77e1aedb3f686a9c0a002201c6c6abe64afc
7
+ data.tar.gz: 3e73fd966be0fa30d5626d2216a5bc92bb133c26abd7648374991fadc8438f7cf13d6e549a264af6a8533ad110f76d9f244aa8443282440cd950d214e098d40d
@@ -0,0 +1,467 @@
1
+ # -*- encoding : utf-8 -*-
2
+ ########################################################
3
+ ## Thoughts from reading the ISO 32000-1:2008
4
+ ## this file is part of the CombinePDF library and the code
5
+ ## is subject to the same license (GPLv3).
6
+ ##
7
+ ##
8
+ ## === Merge PDFs!
9
+ ## This is a pure ruby library to merge PDF files.
10
+ ## Stamping and watermarking PDFs is planned as a full feature (it already works, but with some issues).
11
+ ##
12
+ ## I started the project as a model within a RoR (Ruby on Rails) application, and as it grew I moved it to a local gem.
13
+ ## I fell in love with the project, even if it is still young and in the raw.
14
+ ## It is very simple to parse pdfs - from files:
15
+ ## >> pdf = CombinePDF.new "file_name.pdf"
16
+ ## or from data:
17
+ ## >> pdf = CombinePDF.parse "%PDF-1.4 .... [data]"
18
+ ## It's also easy to start an empty pdf:
19
+ ## >> pdf = CombinePDF.new
20
+ ## Merging is a breeze:
21
+ ## >> pdf << CombinePDF.new("another_file_name.pdf")
22
+ ## and saving the final PDF is a one-liner:
23
+ ## >> pdf.save "output_file_name.pdf"
24
+ ## Also, as a side effect, we can get all sorts of info about our pdf... such as the page count:
25
+ ## >> pdf.version # will tell you the PDF version (if discovered). you can also reset this manually.
26
+ ## >> pdf.pages.length # will tell you how many pages are actually displayed
27
+ ## >> pdf.all_pages.length # will tell you how many page objects actually exist (can be more or less than the pages displayed)
28
+ ## >> pdf.info # a hash with the Info dictionary from the PDF file (if discovered).
29
+ ## === Stamp PDF files
30
+ ## <b>has issues with specific PDF files - please see the issues</b>: https://github.com/boazsegev/combine_pdf/issues/2
31
+ ## You can use PDF files as stamps.
32
+ ## For instance, let's say you have this wonderful PDF (maybe one you created with prawn), and you want to stamp the company header and footer on every page.
33
+ ## So you created your Prawn PDF file (amazing library and hard work there, I highly recommend having a look at https://github.com/prawnpdf/prawn ):
34
+ ## >> prawn_pdf = Prawn::Document.new
35
+ ## >> ...(fill your new PDF with goodies)...
36
+ ## Stamping every page is a breeze.
37
+ ## We start by moving the PDF created by prawn into a CombinePDF object.
38
+ ## >> pdf = CombinePDF.parse prawn_pdf.render
39
+ ## Next we extract the stamp from our stamp pdf template:
40
+ ## >> pdf_stamp = CombinePDF.new "stamp_file_name.pdf"
41
+ ## >> stamp_page = pdf_stamp.pages[0]
42
+ ## And off we stamp each page:
43
+ ## >> pdf.pages.each {|page| page << stamp_page}
44
+ ## Of course, we can save the stamped output:
45
+ ## >> pdf.save "output_file_name.pdf"
46
+ ## === Decryption & Filters
47
+ ## Some PDF files are encrypted and some are compressed (the use of filters)...
48
+ ## There is very little support for encrypted files and only very basic, limited support for compressed files.
49
+ ## I need help with that.
50
+ ## === Comments and file structure
51
+ ## If you want to help with the code, please be aware:
52
+ ## I'm a self-taught hobbyist at heart. The documentation is lacking and the comments in the code are poor guidelines.
53
+ ## The code itself should be very straightforward, but feel free to ask whatever you want.
54
+ ## === Credit
55
+ ## Caige Nichols wrote an amazing RC4 gem which I used in my code.
56
+ ## I wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.
57
+ ## Credit for his wonderful work is given here. Please respect his license and copyright... and mine.
58
+ ## === License
59
+ ## GPLv3
60
+ ########################################################
61
+ require 'zlib'
62
+ require 'strscan'
+ require 'securerandom' # needed for SecureRandom.urlsafe_base64 in copy_and_secure_for_injection
63
+ require 'combine_pdf/combine_pdf_pdf'
64
+ require 'combine_pdf/combine_pdf_decrypt'
65
+ require 'combine_pdf/combine_pdf_filter'
66
+ require 'combine_pdf/combine_pdf_parser'
67
+ module CombinePDF
68
+ module_function
69
+ ################################################################
70
+ ## These are the "gateway" functions for the model.
71
+ ## These functions are open to the public.
72
+ ################################################################
73
+ # PDF object types cross reference:
74
+ # Indirect objects, references, dictionaries and streams are Hash
75
+ # arrays are Array
76
+ # strings are String
77
+ # names are Symbols (String.to_sym)
78
+ # numbers are Integer or Float
79
+ # boolean are TrueClass or FalseClass
80
+
81
+ def new(file_name = "")
82
+ raise TypeError, "couldn't parse data, expecting type String" unless file_name.is_a? String
83
+ return PDF.new() if file_name == ''
84
+ PDF.new( PDFParser.new( IO.binread(file_name).force_encoding(Encoding::ASCII_8BIT) ) ) # binread avoids newline translation on platforms with a text mode
85
+ end
86
+ def parse(data)
87
+ raise TypeError, "couldn't parse data, expecting type String" unless data.is_a? String
88
+ PDF.new( PDFParser.new(data) )
89
+ end
90
+ end
91
+
92
+ module CombinePDF
93
+ ################################################################
94
+ ## These are common functions, used within the different classes
95
+ ## These functions aren't open to the public.
96
+ ################################################################
97
+ PRIVATE_HASH_KEYS = [:indirect_reference_id, :indirect_generation_number, :raw_stream_content, :is_reference_only, :referenced_object, :indirect_without_dictionary]
98
+ LITERAL_STRING_REPLACEMENT_HASH = {
99
+ 110 => 10, # "\\n".bytes = [92, 110] "\n".ord = 10
100
+ 114 => 13, #r
101
+ 116 => 9, #t
102
+ 98 => 8, #b
103
+ 102 => 12, #f (FORM FEED is 0Ch)
104
+ 40 => 40, #(
105
+ 41 => 41, #)
106
+ 92 => 92 #\
107
+ }
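The table above maps the byte that follows a backslash to the byte it stands for. A minimal sketch of how such a table could be applied when decoding a literal string follows. `ESCAPES` and `unescape_literal` are hypothetical names, not part of CombinePDF, and octal escapes (`\ddd`) and line continuations are deliberately left out.

```ruby
# Hypothetical helper, not part of CombinePDF: decodes the named backslash
# escapes of a PDF literal string body (the bytes between the parentheses).
ESCAPES = {
  110 => 10, # \n -> LINE FEED
  114 => 13, # \r -> CARRIAGE RETURN
  116 => 9,  # \t -> HORIZONTAL TAB
  98  => 8,  # \b -> BACKSPACE
  102 => 12, # \f -> FORM FEED
  40  => 40, # \( -> (
  41  => 41, # \) -> )
  92  => 92  # \\ -> \
}.freeze

def unescape_literal(body)
  out = []
  bytes = body.bytes
  i = 0
  while i < bytes.length
    byte = bytes[i]
    if byte == 92 && i + 1 < bytes.length # a backslash starts an escape
      i += 1
      # Unknown escapes drop the backslash and keep the following byte.
      out << ESCAPES.fetch(bytes[i], bytes[i])
    else
      out << byte
    end
    i += 1
  end
  out.pack('C*') # binary (ASCII-8BIT) string
end

decoded = unescape_literal('a\\nb\\(c\\)')
```

Working on bytes rather than characters keeps the sketch independent of the string's encoding, which matters because PDF strings are binary data.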
108
+ module PDFOperations
109
+ module_function
110
+ def inject_to_page page = {Type: :Page, MediaBox: [0,0,612.0,792.0], Resources: {}, Contents: []}, stream = nil, top = true
111
+ # make sure both the page receiving the new data and the injected page are of the correct data type.
112
+ return false unless page.is_a?(Hash) && stream.is_a?(Hash)
113
+
114
+ # following the reference chain and assigning a pointer to the correct Resources object.
115
+ # (assignments of Strings, Arrays and Hashes are pointers in Ruby, unless the .dup method is called)
116
+ original_resources = page[:Resources]
117
+ if original_resources[:is_reference_only]
118
+ original_resources = original_resources[:referenced_object]
119
+ raise "Couldn't tap into resources dictionary, as it is a reference and isn't linked." unless original_resources
120
+ end
121
+ original_contents = page[:Contents]
122
+ original_contents = [original_contents] unless original_contents.is_a? Array
123
+
124
+ stream_resources = stream[:Resources]
125
+ if stream_resources[:is_reference_only]
126
+ stream_resources = stream_resources[:referenced_object]
127
+ raise "Couldn't tap into resources dictionary, as it is a reference and isn't linked." unless stream_resources
128
+ end
129
+ stream_contents = stream[:Contents]
130
+ stream_contents = [stream_contents] unless stream_contents.is_a? Array
131
+
132
+ # collect keys as objects - this is to make sure that
133
+ # we are working on the actual resource data, rather than references
134
+ flatten_resources_dictionaries stream_resources
135
+ flatten_resources_dictionaries original_resources
136
+
137
+ # injecting each of the values in the injected Page
138
+ stream_resources.each do |key, new_val|
139
+ unless PRIVATE_HASH_KEYS.include? key # keep CombinePDF structural data intact.
140
+ if original_resources[key].nil?
141
+ original_resources[key] = new_val
142
+ elsif original_resources[key].is_a?(Hash) && new_val.is_a?(Hash)
143
+ new_val.update original_resources[key] # make sure the old values are respected
144
+ original_resources[key].update new_val # transfer old and new values to the injected page
145
+ end # do nothing if it is an Array - it is the ProcSet array, which is handled separately
146
+ end
147
+ end
148
+ original_resources[:ProcSet] = [:PDF, :Text, :ImageB, :ImageC, :ImageI] # as recommended by ISO 32000-1:2008
149
+
150
+ if top # if this is a stamp (overlay)
151
+ page[:Contents] = original_contents
152
+ page[:Contents].push *stream_contents
153
+ else # if this is a watermark (underlay); it may be hidden if the page was scanned, as the white background might not be transparent
154
+ page[:Contents] = stream_contents
155
+ page[:Contents].push *original_contents
156
+ end
157
+
158
+ page
159
+ end
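The merge and ordering rules used by `inject_to_page` can be sketched in isolation. This is a simplified illustration under stated assumptions: plain hashes stand in for parsed PDF objects, and `merge_resources` and `order_contents` are hypothetical names rather than library methods.

```ruby
# Hypothetical stand-alone sketch; plain hashes stand in for parsed PDF objects.
def merge_resources(original, incoming)
  incoming.each do |key, new_val|
    if original[key].nil?
      original[key] = new_val
    elsif original[key].is_a?(Hash) && new_val.is_a?(Hash)
      # On a name clash the page's existing entry wins.
      original[key] = new_val.merge(original[key])
    end
  end
  original
end

def order_contents(page_contents, stamp_contents, top = true)
  # Content streams are painted in array order, so later entries end up on top.
  top ? page_contents + stamp_contents : stamp_contents + page_contents
end

page_resources  = { Font: { F1: 'page font' } }
stamp_resources = { Font: { F1: 'stamp font', F2: 'logo font' }, XObject: { Im1: 'logo' } }
merged = merge_resources(page_resources, stamp_resources)
```

Note that the stamp's own `F1` entry is lost in the clash. A merge like this is only safe when the injected page's resource names have been made unique beforehand.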
160
+ # copy_and_secure_for_injection(page)
161
+ # - page is a page in the pages array, i.e. pdf.pages[0]
162
+ # takes a page object and:
163
+ # makes a deep copy of the page (Ruby defaults to pointers, so this will copy the memory).
164
+ # then it will rewrite the content stream with renamed resources, so as to avoid name conflicts.
165
+ def copy_and_secure_for_injection(page)
166
+ # copy page
167
+ new_page = create_deep_copy page
168
+
169
+ # initiate dictionary from old names to new names
170
+ names_dictionary = {}
171
+
172
+ # iterate through all keys that are name objects and give them new names (add to names_dictionary)
173
+ # this should be done for every dictionary in :Resources
174
+ # this takes a few steps:
175
+
176
+ # 1. get resources object
177
+ resources = new_page[:Resources]
178
+ if resources[:is_reference_only]
179
+ resources = resources[:referenced_object]
180
+ raise "Couldn't tap into resources dictionary, as it is a reference and isn't linked." unless resources
181
+ end
182
+
183
+ # 2. establish direct access to dictionaries and remove reference values
184
+ flatten_resources_dictionaries resources
185
+
186
+ # 3. travel every dictionary to pick up names (keys), change them and add them to the dictionary
187
+ resources.each do |k,v|
188
+ if v.is_a?(Hash)
189
+ new_dictionary = {}
190
+ v.each do |old_key, value|
191
+ new_key = ("CombinePDF" + SecureRandom.urlsafe_base64(9)).to_sym
192
+ names_dictionary[old_key] = new_key
193
+ new_dictionary[new_key] = value
194
+ end
195
+ resources[k] = new_dictionary
196
+ end
197
+ end
198
+
199
+ # now that we have replaced the names in the resources dictionaries,
200
+ # it is time to replace the names inside the stream
201
+ # we will need to make sure we have access to the stream injected
202
+ # we will use PDFFilter.inflate_object
203
+ (new_page[:Contents].is_a?(Array) ? new_page[:Contents] : [new_page[:Contents] ]).each do |c|
204
+ stream = c[:referenced_object]
205
+ PDFFilter.inflate_object stream
206
+ names_dictionary.each do |old_key, new_key|
207
+ stream[:raw_stream_content].gsub! _object_to_pdf(old_key), _object_to_pdf(new_key) ##### PRAY(!) that the parsed data will be correctly reproduced!
208
+ end
209
+ end
210
+
211
+ new_page
212
+ end
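The rename-and-rewrite step can be tried on a toy content stream. The sketch below is an assumption-laden illustration, not the library's method: `rename_resources` is a hypothetical helper, it uses `SecureRandom.hex` instead of `urlsafe_base64`, and it adds a lookahead so that replacing `/F1` does not also touch `/F10`, a risk that a plain substring replacement would carry.

```ruby
require 'securerandom'

# Hypothetical sketch: give every resource name a fresh unique name and
# patch the content stream (a String, modified in place) to match.
def rename_resources(resources, stream)
  renamed = {}
  resources.each do |category, dictionary|
    next unless dictionary.is_a?(Hash)
    fresh = {}
    dictionary.each do |old_name, value|
      new_name = "CombinePDF#{SecureRandom.hex(6)}".to_sym
      renamed[old_name] = new_name
      fresh[new_name] = value
    end
    resources[category] = fresh # replacing an existing key is safe while iterating
  end
  renamed.each do |old_name, new_name|
    # Match "/F1" only when it is not the prefix of a longer name such as "/F10".
    pattern = %r{/#{Regexp.escape(old_name.to_s)}(?![A-Za-z0-9_.\-+])}
    stream.gsub!(pattern) { "/#{new_name}" }
  end
  renamed
end

resources = { Font: { F1: 'Helvetica', F10: 'Courier' } }
stream = +'BT /F1 12 Tf (a) Tj /F10 9 Tf (b) Tj ET'
mapping = rename_resources(resources, stream)
```

The lookahead only covers a simplified set of name characters. A robust version would tokenize the content stream instead of pattern matching on raw bytes.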
213
+ def flatten_resources_dictionaries(resources)
214
+ resources.each do |k,v|
215
+ if v.is_a?(Hash) && v[:is_reference_only]
216
+ if v[:referenced_object]
217
+ resources[k] = resources[k][:referenced_object].dup
218
+ resources[k].delete(:indirect_reference_id)
219
+ resources[k].delete(:indirect_generation_number)
220
+ elsif v[:indirect_without_dictionary]
221
+ resources[k] = resources[k][:indirect_without_dictionary]
222
+ end
223
+ end
224
+ end
225
+ end
226
+
227
+
228
+ # Ruby normally assigns pointers (references).
229
+ # normally:
230
+ # a = [1,2,3] # => [1,2,3]
231
+ # b = a # => [1,2,3]
232
+ # a << 4 # => [1,2,3,4]
233
+ # b # => [1,2,3,4]
234
+ # This method makes sure that the memory is copied instead of a pointer assigned.
235
+ # this works using recursion, so that arrays and hashes within arrays and hashes are also copied and not pointed to.
236
+ # One needs to be careful of infinite loops using this function.
237
+ def create_deep_copy object
238
+ if object.is_a?(Array)
239
+ return object.map { |e| create_deep_copy e }
240
+ elsif object.is_a?(Hash)
241
+ return {}.tap {|out| object.each {|k,v| out[create_deep_copy(k)] = create_deep_copy(v) unless k == :Parent} }
242
+ elsif object.is_a?(String)
243
+ return object.dup
244
+ else
245
+ return object # objects that aren't Strings, Arrays or Hashes (such as Symbols and numbers) are immutable, so sharing them is safe.
246
+ end
247
+ end
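The difference between a shared reference and a deep copy, and the reason `:Parent` is skipped, can be seen in a small stand-alone sketch. `deep_copy` and its `skip_keys` parameter are hypothetical and only mirror the idea of the method above.

```ruby
# Hypothetical sketch of a recursive deep copy that skips back-references.
def deep_copy(object, skip_keys = [:Parent])
  case object
  when Array
    object.map { |item| deep_copy(item, skip_keys) }
  when Hash
    object.each_with_object({}) do |(key, value), out|
      next if skip_keys.include?(key) # back-references would recurse forever
      out[deep_copy(key, skip_keys)] = deep_copy(value, skip_keys)
    end
  when String
    object.dup
  else
    object # Symbols, numbers, true, false and nil are immutable
  end
end

pages = { Type: :Pages, Kids: [] }
page  = { Type: :Page, MediaBox: [0, 0, 612.0, 792.0], Parent: pages }
pages[:Kids] << page # page -> Parent -> Kids -> page is a cycle

copy = deep_copy(page)
page[:MediaBox] << 99 # changes the original only
```

Without the skipped key, copying `page` would follow `Parent`, then `Kids`, then `page` again, and never terminate.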
248
+ def get_refernced_object(objects_array = [], reference_hash = {})
249
+ objects_array.each do |stored_object|
250
+ return stored_object if ( stored_object.is_a?(Hash) &&
251
+ reference_hash[:indirect_reference_id] == stored_object[:indirect_reference_id] &&
252
+ reference_hash[:indirect_generation_number] == stored_object[:indirect_generation_number] )
253
+ end
254
+ warn "didn't find reference #{reference_hash}"
255
+ nil
256
+ end
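`get_refernced_object` scans the whole objects array for every lookup. The benchmarking notes at the end of this file hint at the alternative of building a hash keyed by id and generation once. A minimal sketch of that idea, with hypothetical helper names:

```ruby
# Hypothetical sketch: index indirect objects by [id, generation] once,
# then resolve references with a single hash lookup.
def build_reference_index(objects)
  objects.each_with_object({}) do |object, index|
    next unless object.is_a?(Hash) && object[:indirect_reference_id]
    key = [object[:indirect_reference_id], object[:indirect_generation_number]]
    index[key] = object
  end
end

def lookup(index, reference)
  index[[reference[:indirect_reference_id], reference[:indirect_generation_number]]]
end

objects = [
  { indirect_reference_id: 1, indirect_generation_number: 0, Type: :Catalog },
  { indirect_reference_id: 2, indirect_generation_number: 0, Type: :Pages },
  'not a hash'
]
index = build_reference_index(objects)
found = lookup(index, { indirect_reference_id: 2, indirect_generation_number: 0, is_reference_only: true })
```

The index has to be rebuilt or updated whenever objects are added or renumbered, which is the price of the faster lookup.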
257
+ def change_references_to_actual_values(objects_array = [], hash_with_references = {})
258
+ hash_with_references.each do |k,v|
259
+ if v.is_a?(Hash) && v[:is_reference_only]
260
+ hash_with_references[k] = PDFOperations.get_refernced_object( objects_array, v)
261
+ hash_with_references[k] = hash_with_references[k][:indirect_without_dictionary] if hash_with_references[k].is_a?(Hash) && hash_with_references[k][:indirect_without_dictionary]
262
+ warn "Couldn't connect all values from references - didn't find reference #{hash_with_references}!!!" if hash_with_references[k] == nil
263
+ hash_with_references[k] = v unless hash_with_references[k]
264
+ end
265
+ end
266
+ hash_with_references
267
+ end
268
+ def change_connected_references_to_actual_values(hash_with_references = {})
269
+ if hash_with_references.is_a?(Hash)
270
+ hash_with_references.each do |k,v|
271
+ if v.is_a?(Hash) && v[:is_reference_only]
272
+ if v[:indirect_without_dictionary]
273
+ hash_with_references[k] = v[:indirect_without_dictionary]
274
+ elsif v[:referenced_object]
275
+ hash_with_references[k] = v[:referenced_object]
276
+ else
277
+ raise "Cannot change references to values, as they are disconnected!"
278
+ end
279
+ end
280
+ end
281
+ hash_with_references.each {|k, v| change_connected_references_to_actual_values(v) if v.is_a?(Hash) || v.is_a?(Array)}
282
+ elsif hash_with_references.is_a?(Array)
283
+ hash_with_references.each {|item| change_connected_references_to_actual_values(item) if item.is_a?(Hash) || item.is_a?(Array)}
284
+ end
285
+ hash_with_references
286
+ end
287
+ def connect_references_and_actual_values(objects_array = [], hash_with_references = {})
288
+ ret = true
289
+ hash_with_references.each do |k,v|
290
+ if v.is_a?(Hash) && v[:is_reference_only]
291
+ ref_obj = PDFOperations.get_refernced_object( objects_array, v)
292
+ hash_with_references[k] = ref_obj[:indirect_without_dictionary] if ref_obj.is_a?(Hash) && ref_obj[:indirect_without_dictionary]
293
+ ret = false
294
+ end
295
+ end
296
+ ret
297
+ end
298
+
299
+
300
+ def _each_object(object, limit_references = true, first_call = true, &block)
301
+ # #####################
302
+ # ## v.1.2 needs optimization
303
+ # case
304
+ # when object.is_a?(Array)
305
+ # object.each {|obj| _each_object(obj, limit_references, &block)}
306
+ # when object.is_a?(Hash)
307
+ # yield(object)
308
+ # object.each do |k,v|
309
+ # unless (limit_references && k == :referenced_object)
310
+ # unless k == :Parent
311
+ # _each_object(v, limit_references, &block)
312
+ # end
313
+ # end
314
+ # end
315
+ # end
316
+ #####################
317
+ ## v.2.1 needs optimization
318
+ ## version 2.1 is slightly faster than v.1.2
319
+ @already_visited = [] if first_call
320
+ unless limit_references
321
+ @already_visited << object.object_id
322
+ end
323
+ case
324
+ when object.is_a?(Array)
325
+ object.each {|obj| _each_object(obj, limit_references, false, &block)}
326
+ when object.is_a?(Hash)
327
+ yield(object)
328
+ unless limit_references && object[:is_reference_only]
329
+ object.each do |k,v|
330
+ _each_object(v, limit_references, false, &block) unless @already_visited.include? v.object_id
331
+ end
332
+ end
333
+ end
334
+ end
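The traversal above records visited `object_id`s in an array, which makes each membership test a linear scan. A possible alternative, sketched here under the assumption that every Hash should be yielded exactly once, is an identity-keyed hash passed along as an argument. `each_hash` is a hypothetical name.

```ruby
# Hypothetical sketch: walk nested Arrays and Hashes, yielding each Hash once,
# using an identity-based lookup table to survive cycles.
def each_hash(object, visited = {}.compare_by_identity, &block)
  return if visited.key?(object)
  case object
  when Array
    visited[object] = true
    object.each { |item| each_hash(item, visited, &block) }
  when Hash
    visited[object] = true
    yield object
    object.each_value { |value| each_hash(value, visited, &block) }
  end
end

root  = { Type: :Pages, Kids: [] }
child = { Type: :Page, Parent: root }
root[:Kids] << child # cyclic structure

seen = []
each_hash(root) { |hash| seen << hash[:Type] }
```

Passing the table as an argument also avoids keeping traversal state in an instance variable shared between calls.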
335
+
336
+
337
+
338
+ def _object_to_pdf object
339
+ case
340
+ when object.nil?
341
+ return "null"
342
+ when object.is_a?(String)
343
+ return _format_string_to_pdf object
344
+ when object.is_a?(Symbol)
345
+ return _format_name_to_pdf object
346
+ when object.is_a?(Array)
347
+ return _format_array_to_pdf object
348
+ when object.is_a?(Integer), object.is_a?(Float), object.is_a?(TrueClass), object.is_a?(FalseClass)
349
+ return object.to_s + " "
350
+ when object.is_a?(Hash)
351
+ return _format_hash_to_pdf object
352
+ else
353
+ return ''
354
+ end
355
+ end
356
+
357
+ def _format_string_to_pdf(object)
358
+ if @string_output == :literal #if format is set to Literal
359
+ #### can be better...
360
+ replacement_hash = {
361
+ "\x0A" => "\\n",
362
+ "\x0D" => "\\r",
363
+ "\x09" => "\\t",
364
+ "\x08" => "\\b",
365
+ "\x0C" => "\\f",
366
+ "\x28" => "\\(",
367
+ "\x29" => "\\)",
368
+ "\x5C" => "\\\\"
369
+ }
370
+ 32.times {|i| replacement_hash[i.chr] ||= "\\#{i.to_s(8).rjust(3, '0')}"} # \ddd escapes are octal
371
+ (127..255).each {|i| replacement_hash[i.chr] ||= "\\#{i.to_s(8)}"} # \ddd escapes are octal
372
+ ("(" + ([].tap {|out| object.bytes.each {|byte| replacement_hash[ byte.chr ] ? (replacement_hash[ byte.chr ].bytes.each {|b| out << b}) : out << byte } }).pack('C*') + ")").force_encoding(Encoding::ASCII_8BIT)
373
+ else
374
+ # A hexadecimal string shall be written as a sequence of hexadecimal digits (0–9 and either A–F or a–f)
375
+ # encoded as ASCII characters and enclosed within angle brackets (using LESS-THAN SIGN (3Ch) and GREATER-THAN SIGN (3Eh)).
376
+ ("<" + object.unpack('H*')[0] + ">").force_encoding(Encoding::ASCII_8BIT)
377
+ end
378
+ end
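The two string forms can be compared side by side. The sketch below uses hypothetical helpers and writes every non-printable byte as a three-digit octal escape, which is how ISO 32000-1 defines `\ddd`. It illustrates the two formats and is not the library's implementation.

```ruby
# Hypothetical sketch: the two ways of writing a PDF string object.
def to_hex_string(data)
  "<#{data.unpack('H*')[0]}>"
end

def to_literal_string(data)
  named = { 10 => '\\n', 13 => '\\r', 9 => '\\t', 8 => '\\b', 12 => '\\f',
            40 => '\\(', 41 => '\\)', 92 => '\\\\' }
  body = data.bytes.map do |byte|
    if named.key?(byte)
      named[byte]
    elsif byte < 32 || byte > 126
      format('\\%03o', byte) # octal escape, always three digits
    else
      byte.chr
    end
  end
  "(#{body.join})"
end

hex     = to_hex_string('AB')
literal = to_literal_string("a(b)\n")
binary  = to_literal_string([255, 1].pack('C*'))
```

The hexadecimal form doubles the size of the data but needs no escaping at all, which may be why it is the default branch in the method above.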
379
+ def _format_name_to_pdf(object)
380
+ # a name object is an atomic symbol uniquely defined by a sequence of ANY characters (8-bit values) except null (character code 0).
381
+ # print name as a simple string. all characters between ~ and ! (except #) can be raw
382
+ # the rest will have a number sign and their HEX equivalent
383
+ # from the standard:
384
+ # When writing a name in a PDF file, a SOLIDUS (2Fh) (/) shall be used to introduce a name. The SOLIDUS is not part of the name but is a prefix indicating that what follows is a sequence of characters representing the name in the PDF file and shall follow these rules:
385
+ # a) A NUMBER SIGN (23h) (#) in a name shall be written by using its 2-digit hexadecimal code (23), preceded by the NUMBER SIGN.
386
+ # b) Any character in a name that is a regular character (other than NUMBER SIGN) shall be written as itself or by using its 2-digit hexadecimal code, preceded by the NUMBER SIGN.
387
+ # c) Any character that is not a regular character shall be written using its 2-digit hexadecimal code, preceded by the NUMBER SIGN only.
388
+ # [0x00, 0x09, 0x0a, 0x0c, 0x0d, 0x20, 0x28, 0x29, 0x3c, 0x3e, 0x5b, 0x5d, 0x7b, 0x7d, 0x2f, 0x25]
389
+ out = object.to_s.bytes.map do |b|
390
+ case b
391
+ when 0..15
392
+ '#0' + b.to_s(16)
393
+ when 16..32, 35, 37, 40, 41, 47, 60, 62, 91, 93, 123, 125, 127..255
394
+ '#' + b.to_s(16)
395
+ else
396
+ b.chr
397
+ end
398
+ end
399
+ "/" + out.join()
400
+ end
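The quoted rules for writing names can be condensed into a short sketch: regular characters (printable, non-delimiter, not `#`) are written as themselves, and everything else becomes `#` followed by two hexadecimal digits. `DELIMITERS` and `to_pdf_name` are hypothetical names used for illustration only.

```ruby
# Hypothetical sketch of name escaping following the rules quoted above.
DELIMITERS = '()<>[]{}/%'.bytes.freeze

def to_pdf_name(symbol)
  escaped = symbol.to_s.bytes.map do |byte|
    # 35 is the NUMBER SIGN, which must always be escaped.
    regular = byte.between?(33, 126) && !DELIMITERS.include?(byte) && byte != 35
    regular ? byte.chr : format('#%02x', byte)
  end
  "/#{escaped.join}"
end

plain   = to_pdf_name(:Name1)
spaced  = to_pdf_name('A B'.to_sym)
sharped = to_pdf_name('The_Key_of_F#_Minor'.to_sym)
```

Using a fixed two-digit format removes the need for a separate branch for bytes below 16.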
401
+ def _format_array_to_pdf(object)
402
+ # An array shall be written as a sequence of objects enclosed in SQUARE BRACKETS (using LEFT SQUARE BRACKET (5Bh) and RIGHT SQUARE BRACKET (5Dh)).
403
+ # EXAMPLE [549 3.14 false (Ralph) /SomeName]
404
+ ("[" + (object.collect {|item| _object_to_pdf(item)}).join(' ') + "]").force_encoding(Encoding::ASCII_8BIT)
405
+
406
+ end
407
+
408
+ def _format_hash_to_pdf(object)
409
+ # if the object is only a reference:
410
+ # special conditions apply, and there is only the setting of the reference (if needed) and output
411
+ if object[:is_reference_only]
412
+ #
413
+ if object[:referenced_object] && object[:referenced_object].is_a?(Hash)
414
+ object[:indirect_reference_id] = object[:referenced_object][:indirect_reference_id]
415
+ object[:indirect_generation_number] = object[:referenced_object][:indirect_generation_number]
416
+ end
417
+ object[:indirect_reference_id] ||= 0
418
+ object[:indirect_generation_number] ||= 0
419
+ return "#{object[:indirect_reference_id].to_s} #{object[:indirect_generation_number].to_s} R".force_encoding(Encoding::ASCII_8BIT)
420
+ end
421
+
422
+ # if the object is indirect...
423
+ out = []
424
+ if object[:indirect_reference_id]
425
+ object[:indirect_reference_id] ||= 0
426
+ object[:indirect_generation_number] ||= 0
427
+ out << "#{object[:indirect_reference_id].to_s} #{object[:indirect_generation_number].to_s} obj\n".force_encoding(Encoding::ASCII_8BIT)
428
+ if object[:indirect_without_dictionary]
429
+ out << _object_to_pdf(object[:indirect_without_dictionary])
430
+ out << "\nendobj\n"
431
+ return out.join().force_encoding(Encoding::ASCII_8BIT)
432
+ end
433
+ end
434
+ # correct stream length, if the object is a stream.
435
+ object[:Length] = object[:raw_stream_content].bytesize if object[:raw_stream_content]
436
+
437
+ # if the object is not a simple object, it is a dictionary
438
+ # A dictionary shall be written as a sequence of key-value pairs enclosed in double angle brackets (<<...>>)
439
+ # (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)).
440
+ out << "<<\n".force_encoding(Encoding::ASCII_8BIT)
441
+ object.each do |key, value|
442
+ out << "#{_object_to_pdf key} #{_object_to_pdf value}\n".force_encoding(Encoding::ASCII_8BIT) unless PRIVATE_HASH_KEYS.include? key
443
+ end
444
+ out << ">>".force_encoding(Encoding::ASCII_8BIT)
445
+ out << "\nstream\n#{object[:raw_stream_content]}\nendstream".force_encoding(Encoding::ASCII_8BIT) if object[:raw_stream_content]
446
+ out << "\nendobj\n" if object[:indirect_reference_id]
447
+ out.join().force_encoding(Encoding::ASCII_8BIT)
448
+ end
449
+
450
+
451
+
452
+ end
453
+ end
454
+
455
+
456
+ ## You can test performance with:
457
+ ## puts Benchmark.measure { pdf = CombinePDF.new(file_name); pdf.save "test.pdf" } # PDFEditor.new_pdf
458
+ ## demo: file_name = "/Users/2Be/Ruby/pdfs/encrypted.pdf"; pdf=0; puts Benchmark.measure { pdf = CombinePDF.new(file_name); pdf.save "test.pdf" }
459
+ ## at the moment... my code is terribly slow for larger files... :(
460
+ ## The file saving is solved (I hope)... but file loading is an issue.
461
+ ## pdf.each_object {|obj| puts "Stream length: #{obj[:raw_stream_content].length} was registered as #{obj[:Length].is_a?(Hash)? obj[:Length][:referenced_object][:indirect_without_dictionary] : obj[:Length]}" if obj[:raw_stream_content] }
462
+ ## pdf.objects.each {|obj| puts "#{obj.class.name}: #{obj[:indirect_reference_id]}, #{obj[:indirect_generation_number]} is: #{obj[:Type] || obj[:indirect_without_dictionary]}" }
463
+ ## puts Benchmark.measure { 1000.times { (CombinePDF::PDFOperations.get_refernced_object pdf.objects, {indirect_reference_id: 100, indirect_generation_number:0}).object_id } }
464
+ ## puts Benchmark.measure { 1000.times { (pdf.objects.select {|o| o[:indirect_reference_id]== 100 && o[:indirect_generation_number] == 0})[0].object_id } }
465
+ ## puts Benchmark.measure { {}.tap {|out| pdf.objects.each {|o| out[ [o[:indirect_reference_id], o[:indirect_generation_number] ] ] = o }} }
466
+
467
+