pdf-forms 1.0.0 → 1.3.0
- checksums.yaml +5 -5
- data/README.md +57 -3
- data/lib/pdf_forms.rb +1 -0
- data/lib/pdf_forms/fdf.rb +8 -4
- data/lib/pdf_forms/fdf_hex.rb +39 -0
- data/lib/pdf_forms/field.rb +28 -18
- data/lib/pdf_forms/normalize_path.rb +3 -2
- data/lib/pdf_forms/pdftk_wrapper.rb +45 -14
- data/lib/pdf_forms/version.rb +1 -1
- data/lib/pdf_forms/xfdf.rb +2 -1
- metadata +6 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: 40651e7dd3bdc30c69263f76ecc3883c8be80f985e431509ee51e9d51bc0aa8d
+  data.tar.gz: 7e0ab952ddd42be27d4a58de7acdda3d7ef54d1588e889432ec3be922ac1b722
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d12c30fbdeabb1443f33d6f52af39e442241ce44158f9095ca909d3de434caef47df83156ba05ae08169d6bed5840220fb8d80dda673c441c3e414fd64ca3a8e
+  data.tar.gz: 9a7c21dc930ffe1249258f730cd628bf77a115fd64f84fef5df994fdef5e88e3be227d09e0cdcb862550a8ede7f25afb7fdb73083297f9c0958e998b748f5b6b
data/README.md
CHANGED
@@ -11,11 +11,35 @@ Fill out PDF forms with [pdftk](http://www.pdflabs.com/tools/pdftk-server/).
 You'll need a working `pdftk` binary. Either get a binary package from
 http://www.pdflabs.com/tools/pdftk-server/ and install it, or run
 `apt-get install pdftk` if you're on Debian or similar.
-[Homebrew Cask](http://caskroom.io) also has a pftk formula.
 
 After that, add `pdf-forms` to your Gemfile or manually install the gem. Nothing
 unusual here.
 
+### Using the Java port of PDFTK
+
+The PDFTK package was dropped from most (all?) current versions of major Linux distributions.
+As contributed in [this issue](https://github.com/jkraemer/pdf-forms/issues/75#issuecomment-698436643), you can use the [Java version of PDFTK](https://gitlab.com/pdftk-java/pdftk)
+with this gem, as well. Just create a small shell script:
+
+~~~shell
+#!/bin/sh
+MYSELF=`which "$0" 2>/dev/null`
+[ $? -gt 0 -a -f "$0" ] && MYSELF="./$0"
+java=java
+if test -n "$JAVA_HOME"; then
+  java="$JAVA_HOME/bin/java"
+fi
+exec "$java" $java_args -jar $MYSELF "$@"
+exit 1
+~~~
+
+Next, concatenate the wrapper script and the Jar file, and you end up with an executable that
+can be used with pdf-forms:
+
+~~~
+cat stub.sh pdftk-all.jar > pdftk.run && chmod +x pdftk.run
+~~~
+
 
 ## Usage
 
@@ -41,6 +65,9 @@ require 'pdf_forms'
 # add :data_format => 'XFdf' option to generate XFDF instead of FDF when
 # filling a form (XFDF is supposed to have better support for non-western
 # encodings)
+# add :data_format => 'FdfHex' option to generate FDF with values passed in
+# UTF16 hexadecimal format (Hexadecimal format has also proven more reliable
+# for passing latin accented characters to pdftk)
 # add :utf8_fields => true in order to get UTF8 encoded field metadata (this
 # will use dump_data_fields_utf8 instead of dump_data_fields in the call to
 # pdftk)
@@ -52,15 +79,42 @@ pdftk.get_field_names 'path/to/form.pdf'
 # take form.pdf, set the 'foo' field to 'bar' and save the document to myform.pdf
 pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', :foo => 'bar'
 
-# optionally, add the :flatten option to prevent editing of a filled out form
+# optionally, add the :flatten option to prevent editing of a filled out form.
+# Other supported options are :drop_xfa and :drop_xmp.
 pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', {:foo => 'bar'}, :flatten => true
+
+# to enable PDF encryption, pass encrypt: true. By default, a random 'owner
+# password' will be used, but you can also set one with the :encrypt_pw option.
+pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', {foo: 'bar'}, encrypt: true, encrypt_options: 'allow printing'
+
+# you can also protect the PDF even from opening by specifying an additional user_pw option:
+pdftk.fill_form '/path/to/form.pdf', 'myform.pdf', {foo: 'bar'}, encrypt: true, encrypt_options: 'user_pw secret'
 ```
 
+Any options shown above can also be set when initializing the PdfForms
+instance. In this case, options given to `fill_form` will override the global
+options.
+
+### Field names with HTML entities
+
+In case your form's field names contain HTML entities (like
+`Stra&#223;e Hausnummer`), make sure you unescape those before using them, i.e.
+`CGI.unescapeHTML(name)`. Thanks to @phoet for figuring this out in #65.
+
+### Non-ASCII Characters (UTF8 etc) are not displayed in the filled out PDF
+
+First, check if the field value has been stored properly in the output PDF using `pdftk output.pdf dump_data_fields_utf8`.
+
+If it has been stored but is not rendered, your input PDF lacks the proper font for your kind of characters. Re-create it and embed any necessary fonts.
+If the value has not been stored, there is a problem with filling out the form, either on your side, of with this gem.
+
+Also see [UTF-8 chars are not displayed in the filled PDF](https://github.com/jkraemer/pdf-forms/issues/53)
+
 ### Prior Art
 
 The FDF generation part is a straight port of Steffen Schwigon's PDF::FDF::Simple perl module. Didn't port the FDF parsing, though ;-)
 
 ## License
 
-Created by [Jens Kraemer](http://jkraemer.net/) and licensed under the MIT
+Created by [Jens Kraemer](http://jkraemer.net/) and licensed under the MIT License.
 
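Note on the new "Field names with HTML entities" README section: the snippet below is a minimal sketch of the unescaping step it recommends, assuming a field name that carries a numeric HTML entity. The variable names and the sample value are illustrative only; `CGI.unescapeHTML` comes from Ruby's standard library.

```ruby
require 'cgi'

# Hypothetical field name containing a numeric HTML entity (assumption:
# this is how the name shows up before unescaping).
raw_name = 'Stra&#223;e Hausnummer'

# Unescape before using the name as a key when filling the form.
field_name = CGI.unescapeHTML(raw_name)

puts field_name
```

The unescaped name is what would then be passed as the key in the data hash given to `fill_form`.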
data/lib/pdf_forms.rb
CHANGED
data/lib/pdf_forms/fdf.rb
CHANGED
@@ -28,6 +28,7 @@ module PdfForms
       end
     end
 
+    # pp 559 https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf
     def header
       header = "%FDF-1.2\n\n1 0 obj\n<<\n/FDF << /Fields 2 0 R"
 
@@ -39,13 +40,16 @@ module PdfForms
       header << "/ID[" << options[:id].join << "]" if options[:id]
 
       header << ">>\n>>\nendobj\n2 0 obj\n["
-
+      header
     end
 
+    # pp 561 https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf
     def field(key, value)
-"
-
-
+      field = "<<"
+      field << "/T" + "(#{key})"
+      field << "/V" + (Array === value ? "[#{value.map{ |v|"(#{quote(v)})" }.join}]" : "(#{quote(value)})")
+      field << ">>\n"
+      field
     end
 
     def quote(value)
data/lib/pdf_forms/fdf_hex.rb
ADDED
@@ -0,0 +1,39 @@
+# coding: UTF-8
+
+module PdfForms
+  # Map keys and values to Adobe's FDF format.
+  #
+  # This is a variation of the original Fdf data format, values are encoded in UTF16 hexadesimal
+  # notation to improve compatibility with non ascii charsets.
+  #
+  # Information about hexadesimal FDF values was found here:
+  #
+  # http://stackoverflow.com/questions/6047970/weird-characters-when-filling-pdf-with-pdftk
+  #
+  class FdfHex < Fdf
+    private
+
+    def field(key, value)
+      "<</T(#{key})/V" +
+        (Array === value ? encode_many(value) : encode_value_as_hex(value)) +
+        ">>\n"
+    end
+
+    def encode_many(values)
+      "[#{values.map { |v| encode_value_as_hex(v) }.join}]"
+    end
+
+    def encode_value_as_hex(value)
+      value = value.to_s
+      utf_16 = value.encode('UTF-16BE', :invalid => :replace, :undef => :replace)
+      hex = utf_16.unpack('H*').first
+      hex.force_encoding 'ASCII-8BIT' # jruby
+      '<FEFF' + hex.upcase + '>'
+    end
+
+    # Fdf implementation encodes to ISO-8859-15 which we do not want here.
+    def encode_data(fdf)
+      fdf
+    end
+  end
+end
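Note on `FdfHex`: to see what the hex encoding produces, here is a standalone sketch of the same technique (UTF-16BE bytes rendered as uppercase hex, prefixed with the `FEFF` byte order mark). The method name `utf16_hex` is an assumption made for this illustration; it does not load the gem and omits the JRuby-specific `force_encoding` call.

```ruby
# Standalone sketch: encode a value as a PDF hex string in UTF-16BE,
# following the approach shown in encode_value_as_hex.
def utf16_hex(value)
  utf_16 = value.to_s.encode('UTF-16BE', invalid: :replace, undef: :replace)
  hex = utf_16.unpack('H*').first # all bytes as one lowercase hex string
  '<FEFF' + hex.upcase + '>'
end

puts utf16_hex('é')       # two-byte code unit 00E9
puts utf16_hex('Straße')  # six code units, ß becomes 00DF
```

Because every character is written as hex digits, the resulting FDF stays plain ASCII regardless of the input characters, which fits the class overriding `encode_data` to skip the ISO-8859-15 conversion.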
data/lib/pdf_forms/field.rb
CHANGED
@@ -9,35 +9,45 @@ module PdfForms
     # FieldStateOption: Ja
     # FieldStateOption: Off
     #
-#
+    # Representation of a PDF Form Field
     def initialize(field_description)
+      last_value = nil
       field_description.each_line do |line|
-
-
-
-
-
-key
-
-
-
-
-
-
-
-
-
+        line.chomp!
+
+        if line =~ /^Field([A-Za-z]+):\s+(.*)/
+          _, key, value = *$~
+
+          if key == 'StateOption'
+            (@options ||= []) << value
+
+          else
+            value.chomp!
+            last_value = value
+            key = key.split(/(?=[A-Z])/).map(&:downcase).join('_')
+            instance_variable_set("@#{key}", value)
+
+            # dynamically add in fields that we didn't anticipate in ATTRS
+            unless self.respond_to?(key.to_sym)
+              proc = Proc.new { instance_variable_get("@#{key}".to_sym) }
+              self.class.send(:define_method, key.to_sym, proc)
+            end
           end
+
+        else
+          # pdftk returns a line that doesn't start with "Field"
+          # It happens when a text field has multiple lines
+          last_value << "\n" << line
         end
       end
     end
-
+
     def to_hash
       hash = {}
       ATTRS.each do |attribute|
         hash[attribute] = self.send(attribute)
       end
-
+
       hash
     end
 
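Note on the multi-line handling in `Field#initialize`: the sketch below isolates the parsing idea in a plain function that returns a hash, so it can be tried without pdftk. The function name, the hash-based result, and the sample input are assumptions for illustration; the gem itself stores values in instance variables instead.

```ruby
# Standalone sketch: parse one field block from `pdftk dump_data_fields`
# output. Lines that do not start with "Field" are treated as continuation
# lines of the previous value (multi-line text fields).
def parse_field_description(description)
  attrs = {}
  options = []
  last_value = nil

  description.each_line do |line|
    line = line.chomp
    if (match = line.match(/^Field([A-Za-z]+):\s+(.*)/))
      key, value = match[1], match[2]
      if key == 'StateOption'
        options << value
      else
        last_value = value
        # "NameAlt" becomes "name_alt"
        attrs[key.split(/(?=[A-Z])/).map(&:downcase).join('_')] = value
      end
    elsif last_value
      # Mutates the string already stored in attrs.
      last_value << "\n" << line
    end
  end

  attrs['options'] = options unless options.empty?
  attrs
end

sample = <<~DUMP
  FieldType: Text
  FieldName: comments
  FieldValue: first line
  second line
  FieldJustification: Left
DUMP

p parse_field_description(sample)
```

Appending to `last_value` in place works because the same string object is already stored under its key, so the continuation line ends up in the stored value without a second assignment.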
data/lib/pdf_forms/pdftk_wrapper.rb
CHANGED
@@ -3,6 +3,7 @@
 require 'tempfile'
 require 'cliver'
 require 'safe_shell'
+require 'securerandom'
 
 module PdfForms
   class PdftkError < StandardError
@@ -22,6 +23,9 @@ module PdfForms
     #
     # The pdftk binary may also be explecitly specified:
     # PdftkWrapper.new('/usr/bin/pdftk', :flatten => true, :encrypt => true, :encrypt_options => 'allow Printing')
+    #
+    # Besides the options shown above, the drop_xfa or drop_xmp options are
+    # also supported.
     def initialize(*args)
       pdftk, options = normalize_args *args
       @pdftk = Cliver.detect! pdftk
@@ -40,8 +44,7 @@ module PdfForms
       fill_options = {:tmp_path => tmp.path}.merge(fill_options)
 
       args = [ q_template, 'fill_form', normalize_path(tmp.path), 'output', q_destination ]
-append_options
-      result = call_pdftk *args
+      result = call_pdftk(*(append_options(args, fill_options)))
 
       unless File.readable?(destination) && File.size(destination) > 0
         fdf_path = nil
@@ -85,13 +88,28 @@ module PdfForms
       SafeShell.execute pdftk, *(args.flatten)
     end
 
-    # concatenate documents
+    # concatenate documents, can optionally specify page ranges
     #
-    # args: in_file1, in_file2, ... , in_file_n, output
+    # args: in_file1, {in_file2 => ["1-2", "4-10"]}, ... , in_file_n, output
     def cat(*args)
-
-
-
+      in_files = []
+      page_ranges = []
+      file_handle = "A"
+      output = normalize_path args.pop
+
+      args.flatten.compact.each do |in_file|
+        if in_file.is_a? Hash
+          path = in_file.keys.first
+          page_ranges.push *in_file.values.first.map {|range| "#{file_handle}#{range}"}
+        else
+          path = in_file
+          page_ranges.push "#{file_handle}"
+        end
+        in_files.push "#{file_handle}=#{normalize_path(path)}"
+        file_handle.next!
+      end
+
+      call_pdftk in_files, 'cat', page_ranges, 'output', output
     end
 
     # stamp one pdf with another
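Note on the new `cat` signature: the following standalone sketch reproduces the handle and page-range bookkeeping to show which argument list ends up being passed to pdftk. `build_cat_args` is a hypothetical helper written for this note, and it skips the gem's `normalize_path` step.

```ruby
# Standalone sketch: build the pdftk argument list for `cat`.
# Each input file gets a handle (A, B, C, ...); a Hash entry maps a path to
# a list of page ranges, a plain path selects the whole file.
def build_cat_args(*args)
  in_files = []
  page_ranges = []
  handle = 'A'.dup
  output = args.pop

  args.flatten.compact.each do |in_file|
    if in_file.is_a?(Hash)
      path = in_file.keys.first
      in_file.values.first.each { |range| page_ranges << "#{handle}#{range}" }
    else
      path = in_file
      page_ranges << handle.dup
    end
    in_files << "#{handle}=#{path}"
    handle.next!
  end

  [in_files, 'cat', page_ranges, 'output', output].flatten
end

p build_cat_args('a.pdf', { 'b.pdf' => ['1-2', '4-10'] }, 'out.pdf')
```

With these inputs the list reads `A=a.pdf B=b.pdf cat A B1-2 B4-10 output out.pdf`, which matches pdftk's handle-based `cat` syntax.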
@@ -118,17 +136,30 @@ module PdfForms
       local[attrib] || options[attrib]
     end
 
+    ALLOWED_OPTIONS = %i(
+      drop_xmp
+      drop_xfa
+      flatten
+      need_appearances
+    ).freeze
+
     def append_options(args, local_options = {})
-      return if options.empty? && local_options.empty?
-
-
+      return args if options.empty? && local_options.empty?
+      args = args.dup
+      ALLOWED_OPTIONS.each do |option|
+        if option_or_global(option, local_options)
+          args << option.to_s
+        end
       end
       if option_or_global(:encrypt, local_options)
         encrypt_pass = option_or_global(:encrypt_password, local_options)
-encrypt_pass ||=
+        encrypt_pass ||= SecureRandom.urlsafe_base64
         encrypt_options = option_or_global(:encrypt_options, local_options)
-
+        encrypt_options = encrypt_options.split if String === encrypt_options
+        args << ['encrypt_128bit', 'owner_pw', encrypt_pass, encrypt_options]
       end
+      args.flatten!
+      args.compact!
       args
     end
 
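Note on `append_options`: the sketch below mirrors the new option handling as a free function so the resulting pdftk arguments can be inspected. `build_pdftk_options` and `pick_option` are hypothetical names, and global options are passed in explicitly rather than read from the wrapper instance.

```ruby
require 'securerandom'

ALLOWED_OPTIONS = %i[drop_xmp drop_xfa flatten need_appearances].freeze

# Local option wins if truthy, otherwise fall back to the global one.
def pick_option(name, local, global)
  local[name] || global[name]
end

# Standalone sketch: append flag and encryption options to a pdftk
# argument list without mutating the caller's array.
def build_pdftk_options(args, global = {}, local = {})
  return args if global.empty? && local.empty?

  result = args.dup
  ALLOWED_OPTIONS.each do |option|
    result << option.to_s if pick_option(option, local, global)
  end

  if pick_option(:encrypt, local, global)
    password = pick_option(:encrypt_password, local, global) || SecureRandom.urlsafe_base64
    extra = pick_option(:encrypt_options, local, global)
    extra = extra.split if extra.is_a?(String)
    result << ['encrypt_128bit', 'owner_pw', password, extra]
  end

  result.flatten.compact
end

base = ['in.pdf', 'fill_form', 'data.fdf', 'output', 'out.pdf']
p build_pdftk_options(base, { flatten: true },
                      { encrypt: true, encrypt_password: 'pw', encrypt_options: 'allow printing' })
```

Two details visible in the hunk are worth keeping in mind. Because the lookup uses `||`, a local `false` does not switch off an option that is set globally. Also, the README comment mentions an `:encrypt_pw` option while the code in this hunk reads `:encrypt_password`; the sketch follows the code.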
@@ -141,11 +172,11 @@ module PdfForms
         when String
           pdftk = first_arg
           options = args.shift || {}
-raise
+          raise ArgumentError.new("expected hash, got #{options.class.name}") unless Hash === options
         when Hash
           options = first_arg
         else
-raise
+          raise ArgumentError.new("expected string or hash, got #{first_arg.class.name}")
         end
       end
       [pdftk, options]
data/lib/pdf_forms/version.rb
CHANGED
data/lib/pdf_forms/xfdf.rb
CHANGED
@@ -1,5 +1,7 @@
 # coding: UTF-8
 
+require 'rexml/document'
+
 module PdfForms
   # Map keys and values to Adobe's XFDF format.
   class XFdf < DataFormat
@@ -14,7 +16,6 @@ module PdfForms
     end
 
     def quote(value)
-      require 'rexml/document'
       REXML::Text.new(value.to_s).to_s
     end
 
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: pdf-forms
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.3.0
 platform: ruby
 authors:
 - Jens Krämer
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2020-10-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: cliver
@@ -48,14 +48,14 @@ dependencies:
   name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - "
+    - - "~>"
       - !ruby/object:Gem::Version
         version: '1.7'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - "
+    - - "~>"
       - !ruby/object:Gem::Version
         version: '1.7'
 - !ruby/object:Gem::Dependency
@@ -86,6 +86,7 @@ files:
 - lib/pdf_forms.rb
 - lib/pdf_forms/data_format.rb
 - lib/pdf_forms/fdf.rb
+- lib/pdf_forms/fdf_hex.rb
 - lib/pdf_forms/field.rb
 - lib/pdf_forms/normalize_path.rb
 - lib/pdf_forms/pdf.rb
@@ -112,9 +113,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: 1.3.6
 requirements: []
 rubyforge_project: pdf-forms
-rubygems_version: 2.
+rubygems_version: 2.7.6.2
 signing_key:
 specification_version: 4
 summary: Fill out PDF forms with pdftk (http://www.accesspdf.com/pdftk/).
 test_files: []
-has_rdoc: