pdf-forms 1.0.0 → 1.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
- metadata.gz: 74a388e3ee6551d601f4d713863ff18d8d71e7bc
- data.tar.gz: 28c603be2710e14695777208fa1b2219422c67ca
+ SHA256:
+ metadata.gz: 40651e7dd3bdc30c69263f76ecc3883c8be80f985e431509ee51e9d51bc0aa8d
+ data.tar.gz: 7e0ab952ddd42be27d4a58de7acdda3d7ef54d1588e889432ec3be922ac1b722
  SHA512:
- metadata.gz: e7ba4e9bca7cf23c37e431a646c766486a748a9836fb7d3e432ea7ecf814523d1932276f81159c02af298562fdf2653933c2a3c6be16a8d54496e9469d29238b
- data.tar.gz: b2e6c9002bcc6999c7b33ccf47eb82c22d3699bb709f6085f2b15b23dd0d4e94ad18e34eee14851f4cf0af5ddd031e779445bb6e1c57d207fd055ea12e64df1f
+ metadata.gz: d12c30fbdeabb1443f33d6f52af39e442241ce44158f9095ca909d3de434caef47df83156ba05ae08169d6bed5840220fb8d80dda673c441c3e414fd64ca3a8e
+ data.tar.gz: 9a7c21dc930ffe1249258f730cd628bf77a115fd64f84fef5df994fdef5e88e3be227d09e0cdcb862550a8ede7f25afb7fdb73083297f9c0958e998b748f5b6b
data/README.md CHANGED
@@ -11,11 +11,35 @@ Fill out PDF forms with [pdftk](http://www.pdflabs.com/tools/pdftk-server/).
  You'll need a working `pdftk` binary. Either get a binary package from
  http://www.pdflabs.com/tools/pdftk-server/ and install it, or run
  `apt-get install pdftk` if you're on Debian or similar.
- [Homebrew Cask](http://caskroom.io) also has a pftk formula.
 
  After that, add `pdf-forms` to your Gemfile or manually install the gem. Nothing
  unusual here.
 
+ ### Using the Java port of PDFTK
+
+ The PDFTK package was dropped from most (all?) current versions of major Linux distributions.
+ As contributed in [this issue](https://github.com/jkraemer/pdf-forms/issues/75#issuecomment-698436643), you can use the [Java version of PDFTK](https://gitlab.com/pdftk-java/pdftk)
+ with this gem, as well. Just create a small shell script:
+
+ ~~~shell
+ #!/bin/sh
+ MYSELF=`which "$0" 2>/dev/null`
+ [ $? -gt 0 -a -f "$0" ] && MYSELF="./$0"
+ java=java
+ if test -n "$JAVA_HOME"; then
+ java="$JAVA_HOME/bin/java"
+ fi
+ exec "$java" $java_args -jar $MYSELF "$@"
+ exit 1
+ ~~~
+
+ Next, concatenate the wrapper script and the Jar file, and you end up with an executable that
+ can be used with pdf-forms:
+
+ ~~~
+ cat stub.sh pdftk-all.jar > pdftk.run && chmod +x pdftk.run
+ ~~~
+
 
  ## Usage
 
@@ -41,6 +65,9 @@ require 'pdf_forms'
  # add :data_format => 'XFdf' option to generate XFDF instead of FDF when
  # filling a form (XFDF is supposed to have better support for non-western
  # encodings)
+ # add :data_format => 'FdfHex' option to generate FDF with values passed in
+ # UTF16 hexadecimal format (Hexadecimal format has also proven more reliable
+ # for passing latin accented characters to pdftk)
  # add :utf8_fields => true in order to get UTF8 encoded field metadata (this
  # will use dump_data_fields_utf8 instead of dump_data_fields in the call to
  # pdftk)
@@ -52,15 +79,42 @@ pdftk.get_field_names 'path/to/form.pdf'
  # take form.pdf, set the 'foo' field to 'bar' and save the document to myform.pdf
  pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', :foo => 'bar'
 
- # optionally, add the :flatten option to prevent editing of a filled out form
+ # optionally, add the :flatten option to prevent editing of a filled out form.
+ # Other supported options are :drop_xfa and :drop_xmp.
  pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', {:foo => 'bar'}, :flatten => true
+
+ # to enable PDF encryption, pass encrypt: true. By default, a random 'owner
+ # password' will be used, but you can also set one with the :encrypt_password option.
+ pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', {foo: 'bar'}, encrypt: true, encrypt_options: 'allow printing'
+
+ # you can also protect the PDF even from opening by specifying an additional user_pw option:
+ pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', {foo: 'bar'}, encrypt: true, encrypt_options: 'user_pw secret'
  ```
 
+ Any options shown above can also be set when initializing the PdfForms
+ instance. In this case, options given to `fill_form` will override the global
+ options.
+
+ ### Field names with HTML entities
+
+ In case your form's field names contain HTML entities (like
+ `Stra&#223;e Hausnummer`), make sure you unescape those before using them, i.e.
+ `CGI.unescapeHTML(name)`. Thanks to @phoet for figuring this out in #65.
+
+ ### Non-ASCII Characters (UTF8 etc) are not displayed in the filled out PDF
+
+ First, check if the field value has been stored properly in the output PDF using `pdftk output.pdf dump_data_fields_utf8`.
+
+ If it has been stored but is not rendered, your input PDF lacks the proper font for your kind of characters. Re-create it and embed any necessary fonts.
+ If the value has not been stored, there is a problem with filling out the form, either on your side, or with this gem.
+
+ Also see [UTF-8 chars are not displayed in the filled PDF](https://github.com/jkraemer/pdf-forms/issues/53)
+
  ### Prior Art
 
  The FDF generation part is a straight port of Steffen Schwigon's PDF::FDF::Simple perl module. Didn't port the FDF parsing, though ;-)
 
  ## License
 
- Created by [Jens Kraemer](http://jkraemer.net/) and licensed under the MIT Liense.
+ Created by [Jens Kraemer](http://jkraemer.net/) and licensed under the MIT License.
 
@@ -5,6 +5,7 @@ require 'pdf_forms/normalize_path'
  require 'pdf_forms/data_format'
  require 'pdf_forms/fdf'
  require 'pdf_forms/xfdf'
+ require 'pdf_forms/fdf_hex'
  require 'pdf_forms/pdf'
  require 'pdf_forms/pdftk_wrapper'
 
@@ -28,6 +28,7 @@ module PdfForms
  end
  end
 
+ # pp 559 https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf
  def header
  header = "%FDF-1.2\n\n1 0 obj\n<<\n/FDF << /Fields 2 0 R"
 
@@ -39,13 +40,16 @@ module PdfForms
  header << "/ID[" << options[:id].join << "]" if options[:id]
 
  header << ">>\n>>\nendobj\n2 0 obj\n["
- return header
+ header
  end
 
+ # pp 561 https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf
  def field(key, value)
- "<</T(#{key})/V" +
- (Array === value ? "[#{value.map{ |v|"(#{quote(v)})" }.join}]" : "(#{quote(value)})") +
- ">>\n"
+ field = "<<"
+ field << "/T" + "(#{key})"
+ field << "/V" + (Array === value ? "[#{value.map{ |v|"(#{quote(v)})" }.join}]" : "(#{quote(value)})")
+ field << ">>\n"
+ field
  end
 
  def quote(value)
@@ -0,0 +1,39 @@
+ # coding: UTF-8
+
+ module PdfForms
+ # Map keys and values to Adobe's FDF format.
+ #
+ # This is a variation of the original Fdf data format; values are encoded in UTF-16 hexadecimal
+ # notation to improve compatibility with non-ASCII charsets.
+ #
+ # Information about hexadecimal FDF values was found here:
+ #
+ # http://stackoverflow.com/questions/6047970/weird-characters-when-filling-pdf-with-pdftk
+ #
+ class FdfHex < Fdf
+ private
+
+ def field(key, value)
+ "<</T(#{key})/V" +
+ (Array === value ? encode_many(value) : encode_value_as_hex(value)) +
+ ">>\n"
+ end
+
+ def encode_many(values)
+ "[#{values.map { |v| encode_value_as_hex(v) }.join}]"
+ end
+
+ def encode_value_as_hex(value)
+ value = value.to_s
+ utf_16 = value.encode('UTF-16BE', :invalid => :replace, :undef => :replace)
+ hex = utf_16.unpack('H*').first
+ hex.force_encoding 'ASCII-8BIT' # jruby
+ '<FEFF' + hex.upcase + '>'
+ end
+
+ # Fdf implementation encodes to ISO-8859-15 which we do not want here.
+ def encode_data(fdf)
+ fdf
+ end
+ end
+ end
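The hex encoding added in this new file can be tried in isolation. The following is a minimal standalone sketch, not the gem's API: the helper name `utf16_hex_string` is hypothetical, and it only mirrors the steps visible in `encode_value_as_hex` above (encode to UTF-16BE, hex-dump the bytes, prefix the `FEFF` byte order mark).

```ruby
# Standalone sketch; utf16_hex_string is a hypothetical name, not part of pdf-forms.
def utf16_hex_string(value)
  # Convert to UTF-16BE, replacing characters that cannot be converted.
  utf_16 = value.to_s.encode('UTF-16BE', invalid: :replace, undef: :replace)
  # Dump the raw bytes as one hexadecimal string.
  hex = utf_16.unpack('H*').first
  # A PDF hex string is delimited by angle brackets; FEFF marks UTF-16BE text.
  "<FEFF#{hex.upcase}>"
end

puts utf16_hex_string('é') # prints <FEFF00E9>
```

Because the result contains only ASCII characters, it is plausible that this is why the subclass overrides `encode_data` to skip the ISO-8859-15 conversion.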
@@ -9,35 +9,45 @@ module PdfForms
  # FieldStateOption: Ja
  # FieldStateOption: Off
  #
- # Represenation of a PDF Form Field
+ # Representation of a PDF Form Field
  def initialize(field_description)
+ last_value = nil
  field_description.each_line do |line|
- case line
- when /FieldStateOption:\s*(.*?)\s*$/
- (@options ||= []) << $1
- else
- line.strip!
- key, value = line.split(": ")
- key.gsub!(/Field/, "")
- key = key.split(/(?=[A-Z])/).map(&:downcase).join('_').split(":")[0]
-
- instance_variable_set("@#{key}", value)
-
- # dynamically add in fields that we didn't anticipate in ATTRS
- unless self.respond_to?(key.to_sym)
- proc = Proc.new { instance_variable_get("@#{key}".to_sym) }
- self.class.send(:define_method, key.to_sym, proc)
+ line.chomp!
+
+ if line =~ /^Field([A-Za-z]+):\s+(.*)/
+ _, key, value = *$~
+
+ if key == 'StateOption'
+ (@options ||= []) << value
+
+ else
+ value.chomp!
+ last_value = value
+ key = key.split(/(?=[A-Z])/).map(&:downcase).join('_')
+ instance_variable_set("@#{key}", value)
+
+ # dynamically add in fields that we didn't anticipate in ATTRS
+ unless self.respond_to?(key.to_sym)
+ proc = Proc.new { instance_variable_get("@#{key}".to_sym) }
+ self.class.send(:define_method, key.to_sym, proc)
+ end
  end
+
+ else
+ # pdftk returns a line that doesn't start with "Field"
+ # It happens when a text field has multiple lines
+ last_value << "\n" << line
  end
  end
  end
-
+
  def to_hash
  hash = {}
  ATTRS.each do |attribute|
  hash[attribute] = self.send(attribute)
  end
-
+
  hash
  end
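The new parsing logic in this hunk folds continuation lines into the previous value. As a rough illustration, here is a self-contained sketch of the same idea. The function `parse_field_description` is hypothetical: it returns a plain Hash rather than setting instance variables, and the guard against a continuation line arriving before any value is an addition of this sketch, not something the diff shows.

```ruby
# Hypothetical helper; not part of pdf-forms.
def parse_field_description(text)
  attributes = {}
  options = []
  last_value = nil

  text.each_line do |line|
    line = line.chomp
    if (match = line.match(/^Field([A-Za-z]+):\s+(.*)/))
      key = match[1]
      value = match[2]
      if key == 'StateOption'
        options << value
      else
        last_value = value
        # "NameAlt" becomes "name_alt", "Type" becomes "type".
        attributes[key.split(/(?=[A-Z])/).map(&:downcase).join('_')] = value
      end
    elsif last_value
      # A line without the "Field" prefix continues a multi-line value.
      # last_value is the same String object stored in attributes.
      last_value << "\n" << line
    end
  end

  attributes['options'] = options unless options.empty?
  attributes
end

dump = "FieldType: Text\nFieldName: notes\nFieldValue: line one\nline two\n"
p parse_field_description(dump)
```

Appending to `last_value` in place works because the Hash holds a reference to the same string, which matches how the diff mutates the value after `instance_variable_set`.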
 
@@ -1,12 +1,13 @@
  # coding: UTF-8
 
+ require 'pathname'
+
  module PdfForms
 
  module NormalizePath
 
  def normalize_path(path)
- path = path.to_path if path.respond_to? :to_path
- path.to_str
+ Pathname(path).to_path
  end
 
  end
@@ -3,6 +3,7 @@
  require 'tempfile'
  require 'cliver'
  require 'safe_shell'
+ require 'securerandom'
 
  module PdfForms
  class PdftkError < StandardError
@@ -22,6 +23,9 @@ module PdfForms
  #
  # The pdftk binary may also be explicitly specified:
  # PdftkWrapper.new('/usr/bin/pdftk', :flatten => true, :encrypt => true, :encrypt_options => 'allow Printing')
+ #
+ # Besides the options shown above, the drop_xfa or drop_xmp options are
+ # also supported.
  def initialize(*args)
  pdftk, options = normalize_args *args
  @pdftk = Cliver.detect! pdftk
@@ -40,8 +44,7 @@ module PdfForms
  fill_options = {:tmp_path => tmp.path}.merge(fill_options)
 
  args = [ q_template, 'fill_form', normalize_path(tmp.path), 'output', q_destination ]
- append_options args, fill_options
- result = call_pdftk *args
+ result = call_pdftk(*(append_options(args, fill_options)))
 
  unless File.readable?(destination) && File.size(destination) > 0
  fdf_path = nil
@@ -85,13 +88,28 @@ module PdfForms
  SafeShell.execute pdftk, *(args.flatten)
  end
 
- # concatenate documents
+ # concatenate documents, can optionally specify page ranges
  #
- # args: in_file1, in_file2, ... , in_file_n, output
+ # args: in_file1, {in_file2 => ["1-2", "4-10"]}, ... , in_file_n, output
  def cat(*args)
- arguments = args.flatten.compact.map{|path| normalize_path(path)}
- output = arguments.pop
- call_pdftk arguments, 'output', output
+ in_files = []
+ page_ranges = []
+ file_handle = "A"
+ output = normalize_path args.pop
+
+ args.flatten.compact.each do |in_file|
+ if in_file.is_a? Hash
+ path = in_file.keys.first
+ page_ranges.push *in_file.values.first.map {|range| "#{file_handle}#{range}"}
+ else
+ path = in_file
+ page_ranges.push "#{file_handle}"
+ end
+ in_files.push "#{file_handle}=#{normalize_path(path)}"
+ file_handle.next!
+ end
+
+ call_pdftk in_files, 'cat', page_ranges, 'output', output
  end
 
  # stamp one pdf with another
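To make the new `cat` argument format easier to follow, here is a minimal sketch of how the handle-based argument list appears to be assembled. `build_cat_arguments` is a hypothetical free-standing function that returns the argument list instead of calling pdftk, and it skips the gem's `normalize_path` step.

```ruby
# Hypothetical helper; it only builds the argument list and never runs pdftk.
def build_cat_arguments(*args)
  in_files = []
  page_ranges = []
  handle = +'A' # unfrozen copy, advanced with next! for each input
  output = args.pop.to_s

  args.flatten.compact.each do |in_file|
    if in_file.is_a?(Hash)
      # { path => ["1-2", "4-10"] } selects page ranges from that file.
      path, ranges = in_file.first
      page_ranges.concat(ranges.map { |range| "#{handle}#{range}" })
    else
      # A bare path contributes the whole document.
      path = in_file
      page_ranges << handle.dup
    end
    in_files << "#{handle}=#{path}"
    handle.next!
  end

  [in_files, 'cat', page_ranges, 'output', output].flatten
end

p build_cat_arguments('a.pdf', { 'b.pdf' => ['1-2', '4-10'] }, 'out.pdf')
# => ["A=a.pdf", "B=b.pdf", "cat", "A", "B1-2", "B4-10", "output", "out.pdf"]
```

Each input gets its own handle letter, so a page range can name the file it applies to even when several inputs are mixed.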
@@ -118,17 +136,30 @@ module PdfForms
  local[attrib] || options[attrib]
  end
 
+ ALLOWED_OPTIONS = %i(
+ drop_xmp
+ drop_xfa
+ flatten
+ need_appearances
+ ).freeze
+
  def append_options(args, local_options = {})
- return if options.empty? && local_options.empty?
- if option_or_global(:flatten, local_options)
- args << 'flatten'
+ return args if options.empty? && local_options.empty?
+ args = args.dup
+ ALLOWED_OPTIONS.each do |option|
+ if option_or_global(option, local_options)
+ args << option.to_s
+ end
  end
  if option_or_global(:encrypt, local_options)
  encrypt_pass = option_or_global(:encrypt_password, local_options)
- encrypt_pass ||= option_or_global(:tmp_path, local_options)
+ encrypt_pass ||= SecureRandom.urlsafe_base64
  encrypt_options = option_or_global(:encrypt_options, local_options)
- args += ['encrypt_128bit', 'owner_pw', encrypt_pass, encrypt_options].flatten.compact
+ encrypt_options = encrypt_options.split if String === encrypt_options
+ args << ['encrypt_128bit', 'owner_pw', encrypt_pass, encrypt_options]
  end
+ args.flatten!
+ args.compact!
  args
  end
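The reworked option handling can be summarized with a small sketch. This is an approximation under stated assumptions: `with_pdftk_options` and `ALLOWED_FLAGS` are hypothetical names, and the sketch reads only the options passed in, whereas the gem's `append_options` also falls back to instance-level options through `option_or_global`.

```ruby
require 'securerandom'

# Flag names taken from the ALLOWED_OPTIONS list in the diff.
ALLOWED_FLAGS = %i[drop_xmp drop_xfa flatten need_appearances].freeze

# Hypothetical free-standing version; returns a new array and leaves args untouched.
def with_pdftk_options(args, opts = {})
  result = args.dup
  ALLOWED_FLAGS.each { |flag| result << flag.to_s if opts[flag] }

  if opts[:encrypt]
    # Fall back to a random owner password when none is given.
    password = opts[:encrypt_password] || SecureRandom.urlsafe_base64
    extra = opts[:encrypt_options]
    # 'allow printing' must reach pdftk as two separate arguments.
    extra = extra.split if extra.is_a?(String)
    result << ['encrypt_128bit', 'owner_pw', password, extra]
  end

  result.flatten.compact
end

p with_pdftk_options(%w[in.pdf output out.pdf], flatten: true)
# => ["in.pdf", "output", "out.pdf", "flatten"]
```

Working on a copy and returning it is what lets the caller write `call_pdftk(*(append_options(args, fill_options)))`, as the `fill_form` hunk does.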
 
@@ -141,11 +172,11 @@ module PdfForms
  when String
  pdftk = first_arg
  options = args.shift || {}
- raise InvalidArgumentError.new("expected hash, got #{options.class.name}") unless Hash === options
+ raise ArgumentError.new("expected hash, got #{options.class.name}") unless Hash === options
  when Hash
  options = first_arg
  else
- raise InvalidArgumentError.new("expected string or hash, got #{first_arg.class.name}")
+ raise ArgumentError.new("expected string or hash, got #{first_arg.class.name}")
  end
  end
  [pdftk, options]
@@ -1,5 +1,5 @@
  # coding: UTF-8
 
  module PdfForms
- VERSION = '1.0.0'
+ VERSION = '1.3.0'
  end
@@ -1,5 +1,7 @@
  # coding: UTF-8
 
+ require 'rexml/document'
+
  module PdfForms
  # Map keys and values to Adobe's XFDF format.
  class XFdf < DataFormat
@@ -14,7 +16,6 @@ module PdfForms
  end
 
  def quote(value)
- require 'rexml/document'
  REXML::Text.new(value.to_s).to_s
  end
 
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: pdf-forms
  version: !ruby/object:Gem::Version
- version: 1.0.0
+ version: 1.3.0
  platform: ruby
  authors:
  - Jens Krämer
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-10-30 00:00:00.000000000 Z
+ date: 2020-10-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: cliver
@@ -48,14 +48,14 @@ dependencies:
  name: bundler
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - "~>"
  - !ruby/object:Gem::Version
  version: '1.7'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - "~>"
  - !ruby/object:Gem::Version
  version: '1.7'
  - !ruby/object:Gem::Dependency
@@ -86,6 +86,7 @@ files:
  - lib/pdf_forms.rb
  - lib/pdf_forms/data_format.rb
  - lib/pdf_forms/fdf.rb
+ - lib/pdf_forms/fdf_hex.rb
  - lib/pdf_forms/field.rb
  - lib/pdf_forms/normalize_path.rb
  - lib/pdf_forms/pdf.rb
@@ -112,9 +113,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: 1.3.6
  requirements: []
  rubyforge_project: pdf-forms
- rubygems_version: 2.4.5.1
+ rubygems_version: 2.7.6.2
  signing_key:
  specification_version: 4
  summary: Fill out PDF forms with pdftk (http://www.accesspdf.com/pdftk/).
  test_files: []
- has_rdoc: