combine_pdf 1.0.16 → 1.0.22

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d071703a8903bc5a11d7beb5a0bcb284a8bced5763705de3a7dd9ae6be35b64d
- data.tar.gz: 898069ad7ec79ad4fadea5383a721f28c5dfc3632fc20aaa40624954bfe70170
+ metadata.gz: 96825d0aa74bd673883c4d7dbf3884459ff27c2a3d7bd0c60875e0499c7b9aeb
+ data.tar.gz: 985c39883f343bb5182344ccc31353103fbac89494000362973f08cdd379d2ac
  SHA512:
- metadata.gz: d52868ff7a021801207ff17a1e87aca8fa1bd82e6fd0dacf91d39d57695b07b87238898238fa90755bfaaec65cb4df384b6db97f3f4e3bf70da7d3d808e55daa
- data.tar.gz: f68133d14d5eb5f0428097b4421976da49471eed61f4ac0e574c4718b588458f6971fd107ccc397be004997f3c31ac6bfc122675992412fd521a7910d6f2abd9
+ metadata.gz: 8575b612e1eb31775833faba8f310d84680d6ce27512d6a9c182e7598a743956da34e0321f0280d032e3e46b861dd2abdd88b297e65a652ec8e3e416ed9fb0a0
+ data.tar.gz: 2026d924120f1798681842fee7a2eb0de78be6ac493dcbd4ffbb934c1c0135161ccbf29283fb0eec42b4ebab66f84b7fa3ac354a970fad9bc0ad302f64da7c7a
data/CHANGELOG.md CHANGED
@@ -2,6 +2,36 @@
2
2
 
3
3
  ***
4
4
 
5
+ #### Change log v.1.0.22
6
+
7
+ **Fix**: fix `fonts` dereferencing issue (#203), credit to @MarcWeber (Marc Weber) for identifying the issue.
8
+
9
+ **Fix**: fix `matrix` dependency, credit to @casperisfine (Jean byroot Boussier) for PR #195.
10
+
11
+ #### Change log v.1.0.21
12
+
13
+ **Fix**: possible fix for issue #184, where nested PDF files within an object stream could break the parser. Credit to Greg Sparrow (@hazelsparrow) for exposing the issue.
14
+
15
+ #### Change log v.1.0.20
16
+
17
+ **Fix**: merges PR #180, `TypeError: can't dup NilClass`. Credit to Adam Trepanier (@adam-e-trepanier) for the merge.
18
+
19
+ #### Change log v.1.0.19
20
+
21
+ **Fix**: fixes font height and width detection issue. Issue #179. Credit to @5anchezzz for opening the issue.
22
+
23
+ **Fix**: fixes an indentation warning. Issue #173. Credit to @rubyFeedback for exposing this issue.
24
+
25
+ #### Change log v.1.0.18
26
+
27
+ **Fix**: fixed issue with the 1.0.17 release where `ProcSet` PDF Arrays should have been expected but were ignored and a PDF Object was assumed instead (issue #171) - credit to @chuchiperriman (Jesús Barbero Rodríguez).
28
+
29
+ #### Change log v.1.0.17
30
+
31
+ NB: yanked from RubyGems.org.
32
+
33
+ **Fix**: fixed issue where nested structure equality tests might provide false positives, resulting in lost data (issue #166) - credit to @cschilbe (Conrad Schilbe).
34
+
5
35
  #### Change log v.1.0.16
6
36
 
7
37
  **Fix**: some documentation typos were fixed (PR #147) - credit to @djhopper01 (Derek Hopper).
data/README.md CHANGED
@@ -1,8 +1,10 @@
1
1
  # CombinePDF - the ruby way for merging PDF files
2
2
  [![Gem Version](https://badge.fury.io/rb/combine_pdf.svg)](http://badge.fury.io/rb/combine_pdf)
3
3
  [![GitHub](https://img.shields.io/badge/GitHub-Open%20Source-blue.svg)](https://github.com/boazsegev/combine_pdf)
4
+ [![Documentation](http://inch-ci.org/github/boazsegev/combine_pdf.svg?branch=master)](https://www.rubydoc.info/github/boazsegev/combine_pdf)
4
5
  [![Maintainers Wanted](https://img.shields.io/badge/maintainers-wanted-red.svg)](https://github.com/pickhardt/maintainers-wanted)
5
6
 
7
+
6
8
  CombinePDF is a nifty model, written in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, watermark them or stamp them (all using the PDF file format and pure Ruby code).
7
9
 
8
10
  ## Install
@@ -41,6 +43,8 @@ Quick rundown:
41
43
 
42
44
  * Sometimes the CombinePDF will raise an exception even if the PDF could be parsed (i.e., when PDF optional content exists)... I find it better to err on the side of caution, although for optional content PDFs an exception is avoidable using `CombinePDF.load(pdf_file, allow_optional_content: true)`.
43
45
 
46
+ * The CombinePDF gem runs recursive code to both parse and format the PDF files. Hence, PDF files that have heavily nested objects, as well as those that were combined in a way that results in cyclic nesting, might overflow the stack, resulting in an exception or program failure.
47
+
44
48
  CombinePDF is written natively in Ruby and should (presumably) work on all Ruby platforms that follow Ruby 2.0 compatibility.
45
49
 
46
50
  However, PDF files are quite complex creatures and no guaranty is provided.
@@ -112,7 +116,42 @@ pdf.number_pages
112
116
  pdf.save "file_with_numbering.pdf"
113
117
  ```
114
118
 
115
- Numbering can be done with many different options, with different formating, with or without a box object, and even with opacity values - see documentation.
119
+ Numbering can be done with many different options, with different formatting, with or without a box object, and even with opacity values - [see documentation](https://www.rubydoc.info/github/boazsegev/combine_pdf/CombinePDF/PDF#number_pages-instance_method).
120
+
121
+ For example, should you prefer to place the page number on the bottom right side of all PDF pages, do:
122
+
123
+ ```ruby
124
+ pdf.number_pages(location: [:bottom_right])
125
+ ```
126
+
127
+ As another example, the dashes around the number are removed and a box is placed around it. The numbering is semi-transparent and the first 3 pages are numbered using letters (a,b,c) rather than numbers:
128
+
129
+
130
+ ```ruby
131
+ # number first 3 pages as "a", "b", "c"
132
+ pdf.number_pages(number_format: " %s ",
133
+ location: [:top, :bottom, :top_left, :top_right, :bottom_left, :bottom_right],
134
+ start_at: "a",
135
+ page_range: (0..2),
136
+ box_color: [0.8,0.8,0.8],
137
+ border_color: [0.4, 0.4, 0.4],
138
+ border_width: 1,
139
+ box_radius: 6,
140
+ opacity: 0.75)
141
+ # number the rest of the pages as 4, 5, ... etc'
142
+ pdf.number_pages(number_format: " %s ",
143
+ location: [:top, :bottom, :top_left, :top_right, :bottom_left, :bottom_right],
144
+ start_at: 4,
145
+ page_range: (3..-1),
146
+ box_color: [0.8,0.8,0.8],
147
+ border_color: [0.4, 0.4, 0.4],
148
+ border_width: 1,
149
+ box_radius: 6,
150
+ opacity: 0.75)
151
+ ```
152
+
153
+ pdf.number_pages(number_format: " %s ", location: :bottom_right, font_size: 44)
154
+
116
155
 
117
156
  ## Loading and Parsing PDF data
118
157
 
data/combine_pdf.gemspec CHANGED
@@ -19,8 +19,9 @@ Gem::Specification.new do |spec|
    spec.require_paths = ["lib"]

    spec.add_runtime_dependency 'ruby-rc4', '>= 0.1.5'
+   spec.add_runtime_dependency 'matrix'

    # spec.add_development_dependency "bundler", ">= 1.7"
-   spec.add_development_dependency "rake", "~> 10.0"
+   spec.add_development_dependency "rake", ">= 12.3.3"
    spec.add_development_dependency "minitest"
  end
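
The rake constraint above moves from a pessimistic `~> 10.0` to an open-ended `>= 12.3.3`. As a rough illustration of what that means for version resolution, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes; the constant names and the sample version numbers are chosen here for illustration only and do not come from the gem.

```ruby
require 'rubygems'

# The two development-dependency constraints from the gemspec hunk above.
OLD_RAKE_REQ = Gem::Requirement.new('~> 10.0')   # means >= 10.0 and < 11.0
NEW_RAKE_REQ = Gem::Requirement.new('>= 12.3.3') # no upper bound

# Sample versions picked for illustration.
%w[10.5.0 12.3.3 13.0.6].each do |v|
  version = Gem::Version.new(v)
  puts "rake #{v}: old=#{OLD_RAKE_REQ.satisfied_by?(version)} " \
       "new=#{NEW_RAKE_REQ.satisfied_by?(version)}"
end
```

The runtime dependency on `matrix` carries no version constraint, which RubyGems records as `>= 0` (visible in the metadata hunk further down).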
@@ -24,11 +24,11 @@ module CombinePDF
24
24
  raise TypeError, "couldn't create PDF object, expecting type String" unless string.is_a?(String) || string.is_a?(Pathname)
25
25
  begin
26
26
  (begin
27
- File.file? string
28
- rescue
29
- false
30
- end) ? load(string) : parse(string)
31
- rescue => _e
27
+ File.file? string
28
+ rescue
29
+ false
30
+ end) ? load(string) : parse(string)
31
+ rescue => _e
32
32
  raise 'General PDF error - Use CombinePDF.load or CombinePDF.parse for a non-general error message (the requested file was not found OR the string received is not a valid PDF stream OR the file was found but not valid).'
33
33
  end
34
34
  end
@@ -138,12 +138,21 @@ module CombinePDF
138
138
  text.each_char do |c|
139
139
  metrics_array << (merged_metrics[c] || { wx: 0, boundingbox: [0, 0, 0, 0] })
140
140
  end
141
- height = metrics_array.map { |m| m ? m[:boundingbox][3] : 0 } .max
142
- height -= (metrics_array.map { |m| m ? m[:boundingbox][1] : 0 }).min
141
+ metrics_array_mapped_top = [].dup
142
+ metrics_array_mapped_bottom = [].dup
143
143
  width = 0.0
144
144
  metrics_array.each do |m|
145
- width += (m[:wx] || m[:wy])
145
+ if (m && m[:boundingbox])
146
+ metrics_array_mapped_top << m[:boundingbox][3]
147
+ metrics_array_mapped_bottom << m[:boundingbox][1]
148
+ else
149
+ metrics_array_mapped_top << 0
150
+ metrics_array_mapped_bottom << 0
151
+ end
152
+ width += (m[:wx] || m[:wy] || 0) if m
146
153
  end
154
+ height = metrics_array_mapped_top.max
155
+ height -=metrics_array_mapped_bottom.min
147
156
  return [height.to_f / 1000 * size, width.to_f / 1000 * size] if metrics_array[0][:wy]
148
157
  [width.to_f / 1000 * size, height.to_f / 1000 * size]
149
158
  end
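
To make the effect of the hunk above easier to follow, here is a minimal standalone sketch of the same idea: glyphs without a `:boundingbox` contribute zeros to the height calculation instead of raising, and a missing advance width counts as 0. The method name `text_dimensions` and the `SAMPLE_METRICS` table are hypothetical stand-ins, not part of the gem, and the sketch leaves out the swap the gem performs for vertical (`:wy`) fonts.

```ruby
# Hypothetical metrics table: glyph => { wx: advance width, boundingbox: [x0, y0, x1, y1] }.
SAMPLE_METRICS = {
  'a' => { wx: 500, boundingbox: [0, -10, 480, 450] },
  'b' => { wx: 550, boundingbox: [0, 0, 500, 700] },
  'x' => { wx: 400 } # deliberately lacks :boundingbox
}.freeze

# Returns [width, height] in the units of `size`, tolerating incomplete metrics.
def text_dimensions(text, metrics, size)
  tops = []
  bottoms = []
  width = 0.0
  text.each_char do |c|
    m = metrics[c] || { wx: 0, boundingbox: [0, 0, 0, 0] }
    box = m[:boundingbox] || [0, 0, 0, 0] # missing box counts as zero height
    tops << box[3]
    bottoms << box[1]
    width += (m[:wx] || m[:wy] || 0)
  end
  return [0.0, 0.0] if tops.empty? # empty text has no extent

  height = tops.max - bottoms.min
  [width / 1000 * size, height.to_f / 1000 * size]
end

p text_dimensions('abx', SAMPLE_METRICS, 10)
```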
@@ -94,7 +94,7 @@ module CombinePDF
94
94
  # end
95
95
 
96
96
  # set ProcSet to recommended value
97
- resources[:ProcSet] = [:PDF, :Text, :ImageB, :ImageC, :ImageI] # this was recommended by the ISO. 32000-1:2008
97
+ resources[:ProcSet] ||= [:PDF, :Text, :ImageB, :ImageC, :ImageI] # this was recommended by the ISO. 32000-1:2008
98
98
 
99
99
  if top # if this is a stamp (overlay)
100
100
  insert_content CONTENT_CONTAINER_START, 0
@@ -147,7 +147,7 @@ module CombinePDF
147
147
 
148
148
  # This method adds a simple text box to the Page represented by the PDFWriter class.
149
149
  # This function takes two values:
150
- # text:: the text to potin the box.
150
+ # text:: the text to write in the box.
151
151
  # properties:: a Hash of box properties.
152
152
  # the symbols and values in the properties Hash could be any or all of the following:
153
153
  # x:: the left position of the box.
@@ -233,16 +233,18 @@ module CombinePDF
233
233
  # all characters that aren't white space or special: /[^\x00\x09\x0a\x0c\x0d\x20\x28\x29\x3c\x3e\x5b\x5d\x7b\x7d\x2f\x25]+
234
234
  elsif str = @scanner.scan(/\/[^\x00\x09\x0a\x0c\x0d\x20\x28\x29\x3c\x3e\x5b\x5d\x7b\x7d\x2f\x25]*/)
235
235
  out << (str[1..-1].gsub(/\#[0-9a-fA-F]{2}/) { |a| a[1..2].hex.chr }).to_sym
236
+ # warn "CombinePDF detected name: #{out.last.to_s}"
236
237
  ##########################################
237
238
  ## Parse a Number
238
239
  ##########################################
239
240
  elsif str = @scanner.scan(/[\+\-\.\d]+/)
240
241
  str =~ /\./ ? (out << str.to_f) : (out << str.to_i)
242
+ # warn "CombinePDF detected number: #{out.last.to_s}"
241
243
  ##########################################
242
244
  ## parse a Hex String
243
245
  ##########################################
244
246
  elsif str = @scanner.scan(/\<[0-9a-fA-F]*\>/)
245
- # warn "Found a hex string"
247
+ # warn "Found a hex string #{str}"
246
248
  str = str.slice(1..-2).force_encoding(Encoding::ASCII_8BIT)
247
249
  # str = "0#{str}" if str.length.odd?
248
250
  out << unify_string([str].pack('H*').force_encoding(Encoding::ASCII_8BIT))
@@ -336,6 +338,7 @@ module CombinePDF
336
338
  end
337
339
  end
338
340
  out << unify_string(str.pack('C*').force_encoding(Encoding::ASCII_8BIT))
341
+ # warn "Found Literal String: #{out.last}"
339
342
  ##########################################
340
343
  ## parse a Dictionary
341
344
  ##########################################
@@ -348,29 +351,42 @@ module CombinePDF
348
351
  ## return content of array or dictionary
349
352
  ##########################################
350
353
  elsif @scanner.scan(/\]/) || @scanner.scan(/>>/)
354
+ # warn "Dictionary / Array ended with #{@scanner.peek(5)}"
351
355
  return out
352
356
  ##########################################
353
357
  ## parse a Stream
354
358
  ##########################################
355
359
  elsif @scanner.scan(/stream[ \t]*[\r\n]/)
356
360
  @scanner.pos += 1 if @scanner.peek(1) == "\n".freeze && @scanner.matched[-1] != "\n".freeze
361
+ # advance by the published stream length (if any)
362
+ old_pos = @scanner.pos
363
+ if(out.last.is_a?(Hash) && out.last[:Length].is_a?(Integer) && out.last[:Length] > 2)
364
+ @scanner.pos += out.last[:Length] - 2
365
+ end
366
+
357
367
  # the following was dicarded because some PDF files didn't have an EOL marker as required
358
368
  # str = @scanner.scan_until(/(\r\n|\r|\n)endstream/)
359
369
  # instead, a non-strict RegExp is used:
360
- str = @scanner.scan_until(/endstream/)
370
+
361
371
 
362
372
  # raise error if the stream doesn't end.
363
- unless str
373
+ unless @scanner.skip_until(/endstream/)
364
374
  raise ParsingError, "Parsing Error: PDF file error - a stream object wasn't properly closed using 'endstream'!"
365
375
  end
376
+ length = @scanner.pos - (old_pos + 9)
377
+ length = 0 if(length < 0)
378
+ length -= 1 if(@scanner.string[old_pos + length - 1] == "\n")
379
+ length -= 1 if(@scanner.string[old_pos + length - 1] == "\r")
380
+ str = (length > 0) ? @scanner.string.slice(old_pos, length) : ''
381
+
382
+ # warn "CombinePDF parser: detected Stream #{str.length} bytes long #{str[0..3]}...#{str[-4..-1]}"
366
383
 
367
384
  # need to remove end of stream
368
385
  if out.last.is_a? Hash
369
- # out.last[:raw_stream_content] = str[0...-10] #cuts only one EON char (\n or \r)
370
- out.last[:raw_stream_content] = unify_string str.sub(/(\r\n|\n|\r)?endstream\z/, '').force_encoding(Encoding::ASCII_8BIT)
386
+ out.last[:raw_stream_content] = unify_string str.force_encoding(Encoding::ASCII_8BIT)
371
387
  else
372
388
  warn 'Stream not attached to dictionary!'
373
- out << str.sub(/(\r\n|\n|\r)?endstream\z/, '').force_encoding(Encoding::ASCII_8BIT)
389
+ out << str.force_encoding(Encoding::ASCII_8BIT)
374
390
  end
375
391
  ##########################################
376
392
  ## parse an Object after finished
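
The stream hunk above changes how the parser finds the end of a stream: it first jumps ahead by the dictionary's declared `Length` (minus a small margin) and only then searches for `endstream`, so an `endstream` byte sequence inside the payload is not mistaken for the terminator. The following is a simplified, self-contained sketch of that approach using Ruby's `StringScanner`; `read_stream_body` is a hypothetical helper written for this note, it bounds-checks the jump, and it is not the gem's parser.

```ruby
require 'strscan'

# Extracts the bytes between "stream<EOL>" and "endstream".
# declared_length plays the role of the stream dictionary's :Length entry.
def read_stream_body(data, declared_length = nil)
  bin = data.b # work on raw bytes
  scanner = StringScanner.new(bin)
  return nil unless scanner.scan(/stream[ \t]*(\r\n|\n|\r)/)

  start = scanner.pos
  # Skip most of the payload so an embedded "endstream" is not matched.
  if declared_length.is_a?(Integer) && declared_length > 2 &&
     start + declared_length - 2 <= bin.bytesize
    scanner.pos += declared_length - 2
  end
  return nil unless scanner.skip_until(/endstream/)

  length = scanner.pos - start - 'endstream'.bytesize
  length = 0 if length < 0
  # Trim one optional EOL ("\n", "\r" or "\r\n") before the keyword.
  length -= 1 if length > 0 && bin.getbyte(start + length - 1) == 10
  length -= 1 if length > 0 && bin.getbyte(start + length - 1) == 13
  bin.byteslice(start, length)
end

payload = 'abc endstream xyz'
data = "stream\n#{payload}\nendstream"
p read_stream_body(data, payload.bytesize) # uses the length hint
p read_stream_body(data)                   # no hint: stops at the embedded keyword
```

Without the hint the sketch truncates the payload at the embedded keyword, which is the kind of failure the changelog entry for issue #184 describes for nested PDF data.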
@@ -528,6 +544,14 @@ module CombinePDF
528
544
  inheritance_hash[:Resources] ||= { referenced_object: {}, is_reference_only: true }.dup
529
545
  (inheritance_hash[:Resources][:referenced_object] || inheritance_hash[:Resources]).update((catalogs[:Resources][:referenced_object] || catalogs[:Resources]), &HASH_UPDATE_PROC_FOR_OLD)
530
546
  end
547
+ if catalogs[:ProcSet].is_a?(Array)
548
+ if(inheritance_hash[:ProcSet])
549
+ inheritance_hash[:ProcSet][:referenced_object].concat(catalogs[:ProcSet])
550
+ inheritance_hash[:ProcSet][:referenced_object].uniq!
551
+ else
552
+ inheritance_hash[:ProcSet] ||= { referenced_object: catalogs[:ProcSet], is_reference_only: true }.dup
553
+ end
554
+ end
531
555
  if catalogs[:ColorSpace]
532
556
  inheritance_hash[:ColorSpace] ||= { referenced_object: {}, is_reference_only: true }.dup
533
557
  (inheritance_hash[:ColorSpace][:referenced_object] || inheritance_hash[:ColorSpace]).update((catalogs[:ColorSpace][:referenced_object] || catalogs[:ColorSpace]), &HASH_UPDATE_PROC_FOR_OLD)
@@ -556,6 +580,18 @@ module CombinePDF
556
580
  catalogs[:ColorSpace] = { referenced_object: catalogs[:ColorSpace], is_reference_only: true } unless catalogs[:ColorSpace][:referenced_object]
557
581
  catalogs[:ColorSpace][:referenced_object].update((inheritance_hash[:ColorSpace][:referenced_object] || inheritance_hash[:ColorSpace]), &HASH_UPDATE_PROC_FOR_OLD)
558
582
  end
583
+ if inheritance_hash[:ProcSet]
584
+ if(catalogs[:ProcSet])
585
+ if catalogs[:ProcSet].is_a?(Array)
586
+ catalogs[:ProcSet] = { referenced_object: catalogs[:ProcSet], is_reference_only: true }
587
+ end
588
+ catalogs[:ProcSet][:referenced_object].concat(inheritance_hash[:ProcSet][:referenced_object])
589
+ catalogs[:ProcSet][:referenced_object].uniq!
590
+ else
591
+ catalogs[:ProcSet] = { is_reference_only: true }.dup
592
+ catalogs[:ProcSet][:referenced_object] = []
593
+ end
594
+ end
559
595
  # (catalogs[:ColorSpace] ||= {}).update(inheritance_hash[:ColorSpace], &HASH_UPDATE_PROC_FOR_OLD) if inheritance_hash[:ColorSpace]
560
596
  # catalogs[:Order] ||= inheritance_hash[:Order] if inheritance_hash[:Order]
561
597
  # catalogs[:AS] ||= inheritance_hash[:AS] if inheritance_hash[:AS]
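
The two `ProcSet` hunks above combine an inherited `ProcSet` array with a page's own array and drop duplicates. The core of that merge can be sketched in isolation as follows; `merge_proc_sets` is a hypothetical helper, and unlike the gem code (which mutates the arrays with `concat` and `uniq!`) it returns a new array.

```ruby
# Union of two ProcSet arrays, preserving first-seen order; nil counts as empty.
def merge_proc_sets(inherited, own)
  ((inherited || []) + (own || [])).uniq
end

p merge_proc_sets([:PDF, :Text], [:Text, :ImageC]) # prints [:PDF, :Text, :ImageC]
```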
@@ -373,16 +373,19 @@ module CombinePDF
  private

  def equal_layers obj1, obj2, layer = CombinePDF.eq_depth_limit
-   return true if(layer == 0)
    return true if obj1.object_id == obj2.object_id
    if obj1.is_a? Hash
      return false unless obj2.is_a? Hash
+     return false unless obj1.length == obj2.length
      keys = obj1.keys;
-     return false if (keys - obj2.keys).any?
+     keys2 = obj2.keys;
+     return false if (keys - keys2).any? || (keys2 - keys).any?
+     return (warn("CombinePDF nesting limit reached") || true) if(layer == 0)
      keys.each {|k| return false unless equal_layers( obj1[k], obj2[k], layer-1) }
    elsif obj1.is_a? Array
      return false unless obj2.is_a? Array
-     (obj1-obj2).any?
+     return false unless obj1.length == obj2.length
+     (obj1-obj2).any? || (obj2-obj1).any?
    else
      obj1 == obj2
    end
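
The hunk above tightens a depth-limited structural comparison: sizes and key sets are now checked in both directions before the nesting limit is consulted. As an illustration of the general technique only, here is an independent sketch of a depth-limited deep equality check. `deep_equal?` is a hypothetical function; it compares arrays element by element, which is a different choice from the set-difference test in the gem code, and it should not be read as the gem's implementation.

```ruby
# Depth-limited structural equality. Past the limit it assumes equality,
# which keeps cyclic structures from recursing forever but can yield
# false positives for deeply nested data.
def deep_equal?(a, b, depth = 5)
  return true if a.equal?(b) # same object
  case a
  when Hash
    return false unless b.is_a?(Hash) && a.length == b.length
    return false unless (a.keys - b.keys).empty? # same size + subset => same keys
    return true if depth.zero?
    a.all? { |k, v| deep_equal?(v, b[k], depth - 1) }
  when Array
    return false unless b.is_a?(Array) && a.length == b.length
    return true if depth.zero?
    a.each_with_index.all? { |v, i| deep_equal?(v, b[i], depth - 1) }
  else
    a == b
  end
end

p deep_equal?({ a: [1, { b: 2 }] }, { a: [1, { b: 2 }] }) # prints true
p deep_equal?({ a: 1 }, { a: 1, b: 2 })                   # prints false
p deep_equal?({ a: { b: 1 } }, { a: { b: 2 } }, 1)        # prints true (limit reached)
```

The last call shows the trade-off behind the "false positives" mentioned in the v.1.0.17 changelog entry: a shallow limit ends the comparison before the differing values are reached.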
@@ -257,12 +257,16 @@ module CombinePDF
  def fonts(limit_to_type0 = false)
    fonts_array = []
    pages.each do |pg|
-     if pg[:Resources][:Font]
-       pg[:Resources][:Font].values.each do |f|
-         f = f[:referenced_object] if f[:referenced_object]
-         if (limit_to_type0 || f[:Subtype] == :Type0) && f[:Type] == :Font && !fonts_array.include?(f)
-           fonts_array << f
-         end
+     r = pg[:Resources]
+     next if !r
+     r = r[:referenced_object] if r[:referenced_object]
+     r = r[:Font]
+     next if !r
+     r = r[:referenced_object] if r[:referenced_object]
+     r.values.each do |f|
+       f = f[:referenced_object] if f[:referenced_object]
+       if (limit_to_type0 || f[:Subtype] == :Type0) && f[:Type] == :Font && !fonts_array.include?(f)
+         fonts_array << f
        end
      end
    end
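
The `fonts` hunk above adds a dereferencing step at each level (`:Resources`, `:Font`, and each font entry), since any of them may be stored indirectly under a `:referenced_object` key. The pattern can be sketched with plain hashes as below; `deref` and `page_fonts` are hypothetical helpers written for this note, and the sketch omits the `limit_to_type0` filter and de-duplication found in the gem method.

```ruby
# Follows one level of indirection if the object is a reference wrapper.
def deref(obj)
  obj.is_a?(Hash) && obj[:referenced_object] ? obj[:referenced_object] : obj
end

# Collects font dictionaries from a page hash, tolerating missing
# or indirectly referenced :Resources and :Font entries.
def page_fonts(page)
  resources = deref(page[:Resources])
  return [] unless resources

  fonts = deref(resources[:Font])
  return [] unless fonts

  fonts.values.map { |f| deref(f) }.select { |f| f[:Type] == :Font }
end

font = { Type: :Font, Subtype: :Type0 }
direct   = { Resources: { Font: { F1: font } } }
indirect = { Resources: { referenced_object: { Font: { referenced_object: { F1: { referenced_object: font } } } } } }

p page_fonts(direct).length   # prints 1
p page_fonts(indirect).length # prints 1
p page_fonts({})              # prints []
```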
@@ -1,3 +1,3 @@
  module CombinePDF
-   VERSION = '1.0.16'.freeze
+   VERSION = '1.0.22'.freeze
  end
data/lib/combine_pdf.rb CHANGED
@@ -5,6 +5,7 @@ require 'securerandom'
  require 'strscan'
  require 'matrix'
  require 'set'
+ require 'digest'

  # require the RC4 Gem
  require 'rc4'
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: combine_pdf
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.16
4
+ version: 1.0.22
5
5
  platform: ruby
6
6
  authors:
7
7
  - Boaz Segev
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-02-22 00:00:00.000000000 Z
11
+ date: 2021-11-27 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: ruby-rc4
@@ -24,20 +24,34 @@ dependencies:
24
24
  - - ">="
25
25
  - !ruby/object:Gem::Version
26
26
  version: 0.1.5
27
+ - !ruby/object:Gem::Dependency
28
+ name: matrix
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
27
41
  - !ruby/object:Gem::Dependency
28
42
  name: rake
29
43
  requirement: !ruby/object:Gem::Requirement
30
44
  requirements:
31
- - - "~>"
45
+ - - ">="
32
46
  - !ruby/object:Gem::Version
33
- version: '10.0'
47
+ version: 12.3.3
34
48
  type: :development
35
49
  prerelease: false
36
50
  version_requirements: !ruby/object:Gem::Requirement
37
51
  requirements:
38
- - - "~>"
52
+ - - ">="
39
53
  - !ruby/object:Gem::Version
40
- version: '10.0'
54
+ version: 12.3.3
41
55
  - !ruby/object:Gem::Dependency
42
56
  name: minitest
43
57
  requirement: !ruby/object:Gem::Requirement
@@ -89,7 +103,7 @@ homepage: https://github.com/boazsegev/combine_pdf
89
103
  licenses:
90
104
  - MIT
91
105
  metadata: {}
92
- post_install_message:
106
+ post_install_message:
93
107
  rdoc_options: []
94
108
  require_paths:
95
109
  - lib
@@ -104,8 +118,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
104
118
  - !ruby/object:Gem::Version
105
119
  version: '0'
106
120
  requirements: []
107
- rubygems_version: 3.0.1
108
- signing_key:
121
+ rubygems_version: 3.2.3
122
+ signing_key:
109
123
  specification_version: 4
110
124
  summary: Combine, stamp and watermark PDF files in pure Ruby.
111
125
  test_files: