hexapdf 0.2.0 → 0.3.0

Files changed (173)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +33 -1
  3. data/CONTRIBUTERS +1 -1
  4. data/LICENSE +1 -1
  5. data/Rakefile +1 -1
  6. data/VERSION +1 -1
  7. data/lib/hexapdf.rb +1 -1
  8. data/lib/hexapdf/cli.rb +19 -52
  9. data/lib/hexapdf/cli/command.rb +251 -0
  10. data/lib/hexapdf/cli/{extract.rb → files.rb} +19 -23
  11. data/lib/hexapdf/cli/images.rb +147 -0
  12. data/lib/hexapdf/cli/info.rb +5 -5
  13. data/lib/hexapdf/cli/inspect.rb +13 -12
  14. data/lib/hexapdf/cli/merge.rb +200 -0
  15. data/lib/hexapdf/cli/modify.rb +39 -242
  16. data/lib/hexapdf/cli/optimize.rb +104 -0
  17. data/lib/hexapdf/configuration.rb +1 -1
  18. data/lib/hexapdf/content.rb +1 -1
  19. data/lib/hexapdf/content/canvas.rb +1 -1
  20. data/lib/hexapdf/content/color_space.rb +1 -1
  21. data/lib/hexapdf/content/graphic_object.rb +1 -1
  22. data/lib/hexapdf/content/graphic_object/arc.rb +1 -1
  23. data/lib/hexapdf/content/graphic_object/endpoint_arc.rb +1 -1
  24. data/lib/hexapdf/content/graphic_object/solid_arc.rb +1 -1
  25. data/lib/hexapdf/content/graphics_state.rb +1 -1
  26. data/lib/hexapdf/content/operator.rb +1 -1
  27. data/lib/hexapdf/content/parser.rb +16 -15
  28. data/lib/hexapdf/content/processor.rb +1 -1
  29. data/lib/hexapdf/content/transformation_matrix.rb +1 -1
  30. data/lib/hexapdf/data_dir.rb +1 -1
  31. data/lib/hexapdf/dictionary.rb +1 -1
  32. data/lib/hexapdf/dictionary_fields.rb +1 -1
  33. data/lib/hexapdf/document.rb +1 -1
  34. data/lib/hexapdf/document/files.rb +1 -1
  35. data/lib/hexapdf/document/fonts.rb +1 -1
  36. data/lib/hexapdf/document/images.rb +1 -1
  37. data/lib/hexapdf/document/pages.rb +1 -1
  38. data/lib/hexapdf/encryption.rb +1 -1
  39. data/lib/hexapdf/encryption/aes.rb +1 -1
  40. data/lib/hexapdf/encryption/arc4.rb +1 -1
  41. data/lib/hexapdf/encryption/fast_aes.rb +1 -1
  42. data/lib/hexapdf/encryption/fast_arc4.rb +1 -1
  43. data/lib/hexapdf/encryption/identity.rb +1 -1
  44. data/lib/hexapdf/encryption/ruby_aes.rb +1 -1
  45. data/lib/hexapdf/encryption/ruby_arc4.rb +1 -1
  46. data/lib/hexapdf/encryption/security_handler.rb +1 -1
  47. data/lib/hexapdf/encryption/standard_security_handler.rb +1 -1
  48. data/lib/hexapdf/error.rb +1 -1
  49. data/lib/hexapdf/filter.rb +1 -1
  50. data/lib/hexapdf/filter/ascii85_decode.rb +1 -1
  51. data/lib/hexapdf/filter/ascii_hex_decode.rb +1 -1
  52. data/lib/hexapdf/filter/dct_decode.rb +1 -1
  53. data/lib/hexapdf/filter/encryption.rb +1 -1
  54. data/lib/hexapdf/filter/flate_decode.rb +1 -1
  55. data/lib/hexapdf/filter/jpx_decode.rb +1 -1
  56. data/lib/hexapdf/filter/lzw_decode.rb +2 -3
  57. data/lib/hexapdf/filter/predictor.rb +11 -11
  58. data/lib/hexapdf/filter/run_length_decode.rb +1 -1
  59. data/lib/hexapdf/font/cmap.rb +1 -1
  60. data/lib/hexapdf/font/cmap/parser.rb +1 -1
  61. data/lib/hexapdf/font/cmap/writer.rb +1 -1
  62. data/lib/hexapdf/font/encoding.rb +1 -1
  63. data/lib/hexapdf/font/encoding/base.rb +1 -1
  64. data/lib/hexapdf/font/encoding/difference_encoding.rb +1 -1
  65. data/lib/hexapdf/font/encoding/glyph_list.rb +1 -1
  66. data/lib/hexapdf/font/encoding/mac_expert_encoding.rb +1 -1
  67. data/lib/hexapdf/font/encoding/mac_roman_encoding.rb +1 -1
  68. data/lib/hexapdf/font/encoding/standard_encoding.rb +1 -1
  69. data/lib/hexapdf/font/encoding/symbol_encoding.rb +1 -1
  70. data/lib/hexapdf/font/encoding/win_ansi_encoding.rb +1 -1
  71. data/lib/hexapdf/font/encoding/zapf_dingbats_encoding.rb +1 -1
  72. data/lib/hexapdf/font/true_type.rb +2 -1
  73. data/lib/hexapdf/font/true_type/font.rb +1 -1
  74. data/lib/hexapdf/font/true_type/subsetter.rb +186 -0
  75. data/lib/hexapdf/font/true_type/table.rb +8 -4
  76. data/lib/hexapdf/font/true_type/table/cmap.rb +1 -1
  77. data/lib/hexapdf/font/true_type/table/cmap_subtable.rb +1 -1
  78. data/lib/hexapdf/font/true_type/table/directory.rb +1 -1
  79. data/lib/hexapdf/font/true_type/table/glyf.rb +6 -2
  80. data/lib/hexapdf/font/true_type/table/head.rb +2 -2
  81. data/lib/hexapdf/font/true_type/table/hhea.rb +1 -1
  82. data/lib/hexapdf/font/true_type/table/hmtx.rb +1 -1
  83. data/lib/hexapdf/font/true_type/table/loca.rb +1 -1
  84. data/lib/hexapdf/font/true_type/table/maxp.rb +1 -1
  85. data/lib/hexapdf/font/true_type/table/name.rb +1 -1
  86. data/lib/hexapdf/font/true_type/table/os2.rb +1 -1
  87. data/lib/hexapdf/font/true_type/table/post.rb +1 -1
  88. data/lib/hexapdf/font/true_type_wrapper.rb +56 -8
  89. data/lib/hexapdf/font/type1.rb +1 -1
  90. data/lib/hexapdf/font/type1/afm_parser.rb +1 -1
  91. data/lib/hexapdf/font/type1/character_metrics.rb +1 -1
  92. data/lib/hexapdf/font/type1/font.rb +1 -1
  93. data/lib/hexapdf/font/type1/font_metrics.rb +1 -1
  94. data/lib/hexapdf/font/type1/pfb_parser.rb +1 -1
  95. data/lib/hexapdf/font/type1_wrapper.rb +1 -1
  96. data/lib/hexapdf/font_loader.rb +1 -1
  97. data/lib/hexapdf/font_loader/from_configuration.rb +6 -3
  98. data/lib/hexapdf/font_loader/standard14.rb +1 -1
  99. data/lib/hexapdf/image_loader.rb +1 -1
  100. data/lib/hexapdf/image_loader/jpeg.rb +1 -1
  101. data/lib/hexapdf/image_loader/pdf.rb +1 -1
  102. data/lib/hexapdf/image_loader/png.rb +1 -1
  103. data/lib/hexapdf/importer.rb +1 -1
  104. data/lib/hexapdf/name_tree_node.rb +1 -1
  105. data/lib/hexapdf/number_tree_node.rb +1 -1
  106. data/lib/hexapdf/object.rb +1 -1
  107. data/lib/hexapdf/parser.rb +1 -1
  108. data/lib/hexapdf/rectangle.rb +1 -1
  109. data/lib/hexapdf/reference.rb +1 -1
  110. data/lib/hexapdf/revision.rb +1 -1
  111. data/lib/hexapdf/revisions.rb +13 -15
  112. data/lib/hexapdf/serializer.rb +7 -3
  113. data/lib/hexapdf/stream.rb +1 -1
  114. data/lib/hexapdf/task.rb +1 -1
  115. data/lib/hexapdf/task/dereference.rb +1 -1
  116. data/lib/hexapdf/task/optimize.rb +1 -1
  117. data/lib/hexapdf/tokenizer.rb +12 -12
  118. data/lib/hexapdf/type.rb +1 -1
  119. data/lib/hexapdf/type/catalog.rb +1 -1
  120. data/lib/hexapdf/type/embedded_file.rb +1 -1
  121. data/lib/hexapdf/type/file_specification.rb +1 -1
  122. data/lib/hexapdf/type/font.rb +1 -1
  123. data/lib/hexapdf/type/font_descriptor.rb +1 -1
  124. data/lib/hexapdf/type/font_simple.rb +1 -1
  125. data/lib/hexapdf/type/font_true_type.rb +1 -1
  126. data/lib/hexapdf/type/font_type1.rb +1 -1
  127. data/lib/hexapdf/type/form.rb +1 -1
  128. data/lib/hexapdf/type/graphics_state_parameter.rb +1 -1
  129. data/lib/hexapdf/type/image.rb +187 -1
  130. data/lib/hexapdf/type/info.rb +1 -1
  131. data/lib/hexapdf/type/names.rb +1 -1
  132. data/lib/hexapdf/type/object_stream.rb +1 -1
  133. data/lib/hexapdf/type/page.rb +1 -1
  134. data/lib/hexapdf/type/page_tree_node.rb +6 -1
  135. data/lib/hexapdf/type/resources.rb +1 -1
  136. data/lib/hexapdf/type/trailer.rb +2 -2
  137. data/lib/hexapdf/type/viewer_preferences.rb +1 -1
  138. data/lib/hexapdf/type/xref_stream.rb +22 -18
  139. data/lib/hexapdf/utils/bit_field.rb +1 -1
  140. data/lib/hexapdf/utils/bit_stream.rb +16 -32
  141. data/lib/hexapdf/utils/lru_cache.rb +1 -1
  142. data/lib/hexapdf/utils/math_helpers.rb +1 -1
  143. data/lib/hexapdf/utils/object_hash.rb +1 -1
  144. data/lib/hexapdf/utils/pdf_doc_encoding.rb +1 -1
  145. data/lib/hexapdf/utils/sorted_tree_node.rb +1 -1
  146. data/lib/hexapdf/version.rb +2 -2
  147. data/lib/hexapdf/writer.rb +2 -1
  148. data/lib/hexapdf/xref_section.rb +6 -1
  149. data/man/man1/hexapdf.1 +194 -115
  150. data/test/data/images/greyscale-1bit.png +0 -0
  151. data/test/data/images/greyscale-2bit.png +0 -0
  152. data/test/data/images/greyscale-8bit.png +0 -0
  153. data/test/data/images/indexed-alpha-4bit.png +0 -0
  154. data/test/data/images/truecolour-8bit.png +0 -0
  155. data/test/hexapdf/content/test_operator.rb +8 -8
  156. data/test/hexapdf/content/test_processor.rb +1 -1
  157. data/test/hexapdf/encryption/test_security_handler.rb +1 -1
  158. data/test/hexapdf/font/test_true_type_wrapper.rb +89 -48
  159. data/test/hexapdf/font/true_type/table/test_glyf.rb +1 -0
  160. data/test/hexapdf/font/true_type/test_subsetter.rb +70 -0
  161. data/test/hexapdf/font/true_type/test_table.rb +16 -0
  162. data/test/hexapdf/font_loader/test_from_configuration.rb +7 -0
  163. data/test/hexapdf/test_document.rb +1 -1
  164. data/test/hexapdf/test_object.rb +1 -1
  165. data/test/hexapdf/test_revisions.rb +34 -8
  166. data/test/hexapdf/test_serializer.rb +3 -0
  167. data/test/hexapdf/test_writer.rb +11 -2
  168. data/test/hexapdf/test_xref_section.rb +15 -0
  169. data/test/hexapdf/type/test_image.rb +234 -0
  170. data/test/hexapdf/type/test_object_stream.rb +2 -2
  171. data/test/hexapdf/type/test_trailer.rb +4 -0
  172. data/test/hexapdf/utils/test_bit_stream.rb +69 -0
  173. metadata +14 -6
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -115,6 +115,9 @@ module HexaPDF
  #
  # Must be called on the root of the page tree, otherwise the /Count entries are not
  # correctly updated!
+ #
+ # If an existing page is inserted, it may be necessary to use Page#copy_inherited_values
+ # before insertion so that the page dictionary contains all necessary information.
  def insert_page(index, page = nil)
    page ||= new_page
    index = self[:Count] + index + 1 if index < 0
@@ -147,6 +150,8 @@ module HexaPDF
  end

  # Adds the page or a new empty page at the end and returns it.
+ #
+ # See: #insert_page
  def add_page(page = nil)
    insert_page(-1, page)
  end
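The negative-index handling visible in the `insert_page` context lines can be looked at in isolation. The following is a minimal sketch, not HexaPDF code: `normalize_insert_index` is a hypothetical helper that only mirrors the expression `self[:Count] + index + 1`, with the page count passed in as a plain integer.

```ruby
# Hypothetical helper (not part of HexaPDF): maps a possibly negative insertion
# index to an absolute position, mirroring `self[:Count] + index + 1`.
def normalize_insert_index(index, count)
  index < 0 ? count + index + 1 : index
end

append_pos = normalize_insert_index(-1, 3)  # position used by add_page on a three-page tree
second_last = normalize_insert_index(-2, 3) # insert before the current last page
first_pos = normalize_insert_index(0, 3)    # non-negative indices pass through unchanged
```

Under this mapping `-1` resolves to the page count itself, which is why `add_page` can simply delegate to `insert_page(-1, page)`.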
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -88,7 +88,7 @@ module HexaPDF
  # Updates the second part of the /ID field (the first part should always be the same for a
  # PDF file, the second part should change with each write).
  def update_id
-   if !value[:ID]
+   if !value[:ID].kind_of?(Array)
      set_random_id
    else
      value[:ID][1] = Digest::MD5.digest(rand.to_s)
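The effect of the stricter guard in `update_id` can be tried outside the library. The sketch below rests on several assumptions: `update_id` here takes a plain Hash instead of operating on HexaPDF's trailer object, and the branch that regenerates both ID parts is only a guess at what `set_random_id` does, since this diff does not show that method.

```ruby
require 'digest/md5'

# Stand-in for the trailer's value hash; not HexaPDF's Trailer class.
def update_id(value)
  if !value[:ID].kind_of?(Array)
    # Missing or malformed /ID: create both parts from scratch
    # (assumed behaviour of set_random_id).
    value[:ID] = Array.new(2) { Digest::MD5.digest(rand.to_s) }
  else
    # Well-formed /ID: keep the first part, refresh the second.
    value[:ID][1] = Digest::MD5.digest(rand.to_s)
  end
  value
end

repaired = update_id({ID: "not an array"})
updated = update_id({ID: ["first", "second"]})
```

With the old `!value[:ID]` test, a truthy value that is not an array would have reached the `else` branch and been indexed as if it were one.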
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -119,20 +119,24 @@ module HexaPDF
  def parse_xref_section(index, w)
    xref = XRefSection.new

-   entry_size = w.inject(:+)
    data = stream
-   pos_in_data = 0
+   start_pos = end_pos = 0

-   index.each_slice(2) do |first_oid, number_of_entries|
-     number_of_entries.times do |i|
-       oid = first_oid + i
+   w0 = w[0]
+   w1 = w[1]
+   w2 = w[2]

+   index.each_slice(2) do |first_oid, number_of_entries|
+     first_oid.upto(first_oid + number_of_entries - 1) do |oid|
        # Default for first field: type 1
-       type_field = (w[0] == 0 ? TYPE_IN_USE : bytes_to_int(data, pos_in_data, w[0]))
+       end_pos = start_pos + w0
+       type_field = (w0 == 0 ? TYPE_IN_USE : bytes_to_int(data, start_pos, end_pos))
        # No default available for second field
-       field2 = bytes_to_int(data, pos_in_data + w[0], w[1])
+       start_pos = end_pos + w1
+       field2 = bytes_to_int(data, end_pos, start_pos)
        # Default for third field is 0 for type 1, otherwise it needs to be specified!
-       field3 = bytes_to_int(data, pos_in_data + w[0] + w[1], w[2])
+       end_pos = start_pos + w2
+       field3 = (w2 == 0 ? 0 : bytes_to_int(data, start_pos, end_pos))

        case type_field
        when TYPE_IN_USE
@@ -144,22 +148,22 @@ module HexaPDF
        else
          nil # Ignore entry as per PDF1.7 s7.5.8.3
        end
-       pos_in_data += entry_size
+       start_pos = end_pos
      end
    end

    xref
  end

- # Converts +length+ bytes from the +start+ index from the +string+ to an integer.
+ # Converts the bytes of the string from the start index to the end index to an integer.
  #
- # The bytes are converted in the big-endian way. If +length+ is zero, zero is returned.
- def bytes_to_int(string, start, length)
-   result = 0
-   end_index = start + length
-   while start < end_index
-     result = (result << 8) | string.getbyte(start)
-     start += 1
+ # The bytes are converted in the big-endian way.
+ def bytes_to_int(string, start_index, end_index)
+   result = string.getbyte(start_index)
+   start_index += 1
+   while start_index < end_index
+     result = (result << 8) | string.getbyte(start_index)
+     start_index += 1
    end
    result
  end
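To see how the new start/end index bookkeeping walks through one cross-reference stream entry, the added `bytes_to_int` can be run standalone. This is a sketch under stated assumptions: the method body is copied from the added lines of the diff, while the sample bytes, the field widths `[1, 2, 1]` and the literal `1` standing in for `TYPE_IN_USE` are made up for illustration.

```ruby
# Big-endian conversion of string[start_index...end_index], as in the added lines.
# The first byte is read unconditionally, so callers must handle zero-width fields.
def bytes_to_int(string, start_index, end_index)
  result = string.getbyte(start_index)
  start_index += 1
  while start_index < end_index
    result = (result << 8) | string.getbyte(start_index)
    start_index += 1
  end
  result
end

# Hypothetical entry with field widths [1, 2, 1]: type 1, offset 0x0102, generation 0.
data = "\x01\x01\x02\x00".b
w0, w1, w2 = 1, 2, 1

start_pos = 0
end_pos = start_pos + w0
type_field = (w0 == 0 ? 1 : bytes_to_int(data, start_pos, end_pos)) # 1 stands in for TYPE_IN_USE
start_pos = end_pos + w1
field2 = bytes_to_int(data, end_pos, start_pos)
end_pos = start_pos + w2
field3 = (w2 == 0 ? 0 : bytes_to_int(data, start_pos, end_pos))
```

Because the rewritten method no longer starts from `result = 0`, the zero-width case is handled at the call sites for the first and third fields (`w0 == 0`, `w2 == 0`) rather than inside the conversion.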
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -38,9 +38,9 @@ module HexaPDF

  # Helper class for reading variable length integers from a bit stream.
  #
- # This class allows one to read integers with a variable width of up to 16 bit from a bit
- # stream using the #read method. The data from where these bits are read, can be set on
- # intialization and additional data can later be appended.
+ # This class allows one to read integers with a variable width from a bit stream using the #read
+ # method. The data from where these bits are read, can be set on intialization and additional
+ # data can later be appended.
  class BitStreamReader

    # Creates a new object, optionally providing the string from where the bits should be read.
@@ -53,9 +53,12 @@ module HexaPDF

    # Appends some data to the string from where bits are read.
    def append_data(str)
-     @data = @data[@pos, @data.length - @pos] << str
+     @data.slice!(0, @pos)
+     @data << str
      @pos = 0
+     self
    end
+   alias :<< :append_data

    # Returns the number of remaining bits that can be read.
    def remaining_bits
@@ -64,43 +67,24 @@ module HexaPDF

    # Returns +true+ if +bits+ number of bits can be read.
    def read?(bits)
-     fill_bit_cache
-     @available_bits >= bits
+     remaining_bits >= bits
    end

    # Reads +bits+ number of bits.
    #
-   # Raises an exception if not enough bits are available for reading.
+   # Returns +nil+ if not enough bits are available for reading.
    def read(bits)
-     fill_bit_cache
-     raise HexaPDF::Error, "Not enough bits available for reading" if @available_bits < bits
-
+     while @available_bits < bits
+       @bit_cache = (@bit_cache << 8) | (@data.getbyte(@pos) || return)
+       @pos += 1
+       @available_bits += 8
+     end
      @available_bits -= bits
-     result = @bit_cache >> @available_bits
+     result = (@bit_cache >> @available_bits)
      @bit_cache &= (1 << @available_bits) - 1
-
      result
    end

-   private
-
-   LENGTH_TO_TYPE = {4 => 'N', 2 => 'n', 1 => 'C'}.freeze # :nodoc:
-   FOUR_TO_INFINITY = 4..Float::INFINITY # :nodoc:
-
-   # Fills the bit cache so that at least 16bit are available (if possible).
-   def fill_bit_cache
-     return unless @available_bits <= 16 && @pos != @data.size
-
-     l = case @data.size - @pos
-         when FOUR_TO_INFINITY then 4
-         when 2, 3 then 2
-         else 1
-         end
-     @bit_cache = (@bit_cache << 8 * l) | @data[@pos, l].unpack(LENGTH_TO_TYPE[l]).first
-     @pos += l
-     @available_bits += 8 * l
-   end
-
  end

  # Helper class for writing out variable length integers one after another as bit stream.
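The rewritten reader can be exercised with a small standalone class. The sketch below is not the HexaPDF class: `TinyBitReader` is a hypothetical name, the constructor and `remaining_bits` are assumptions (neither body appears in this diff), and `append_data` additionally forces binary encoding with `.b` to keep the example self-contained. The `read` and `append_data` logic follows the added lines.

```ruby
# Simplified reader modelled on the added lines of BitStreamReader.
class TinyBitReader
  def initialize(data = '')
    @data = data.b        # binary-encoded, unfrozen copy
    @pos = 0              # index of the next unread byte
    @bit_cache = 0        # bits already pulled from @data but not yet returned
    @available_bits = 0   # number of valid bits in @bit_cache
  end

  # Drops the consumed bytes, appends the new data and returns self for chaining.
  def append_data(str)
    @data.slice!(0, @pos)
    @data << str.b
    @pos = 0
    self
  end
  alias :<< :append_data

  # Assumed definition: unread bytes plus the bits still sitting in the cache.
  def remaining_bits
    (@data.length - @pos) * 8 + @available_bits
  end

  # Pulls in one byte at a time until enough bits are cached; returns nil
  # as soon as the data runs out.
  def read(bits)
    while @available_bits < bits
      @bit_cache = (@bit_cache << 8) | (@data.getbyte(@pos) || return)
      @pos += 1
      @available_bits += 8
    end
    @available_bits -= bits
    result = (@bit_cache >> @available_bits)
    @bit_cache &= (1 << @available_bits) - 1
    result
  end
end

reader = TinyBitReader.new("\xAB".b)
high = reader.read(4)          # upper nibble of 0xAB
low = reader.read(4)           # lower nibble of 0xAB
exhausted = reader.read(1)     # no data left
reader << "\xF0".b
after_append = reader.read(4)  # upper nibble of the appended byte
```

Filling the cache byte by byte removes the need for the old `fill_bit_cache` helper and its 16-bit limit, and a caller can react to a `nil` result by appending more data and reading again.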
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -34,6 +34,6 @@
  module HexaPDF

    # The version of HexaPDF.
-   VERSION = '0.2.0'.freeze
+   VERSION = '0.3.0'.freeze

  end
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -103,6 +103,7 @@ module HexaPDF
  end

  trailer = rev.trailer.value.dup
+ trailer.delete(:XRefStm)
  if previous_xref_pos
    trailer[:Prev] = previous_xref_pos
  else
@@ -4,7 +4,7 @@
  # This file is part of HexaPDF.
  #
  # HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
- # Copyright (C) 2016 Thomas Leitner
+ # Copyright (C) 2014-2017 Thomas Leitner
  #
  # HexaPDF is free software: you can redistribute it and/or modify it
  # under the terms of the GNU Affero General Public License version 3 as
@@ -121,6 +121,11 @@ module HexaPDF
    self[oid, 0] = self.class.compressed_entry(oid, objstm, pos)
  end

+ # Merges the entries from the given cross-reference section into this one.
+ def merge!(xref_section)
+   xref_section.each {|oid, gen, data| self[oid, gen] = data}
+ end
+
  # :call-seq:
  # xref_section.each_subsection {|sub| block } -> xref_section
  # xref_section.each_subsection -> Enumerator
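The overwrite semantics of the new `merge!` can be illustrated with a toy container. This is only a sketch: `TinyXRefSection` is a hypothetical class backed by a Hash keyed on `[oid, gen]`, and its `each`, `[]` and `[]=` are assumptions about the interface, chosen so that the `merge!` body from the diff runs unchanged.

```ruby
# Hypothetical stand-in for a cross-reference section; not HexaPDF's XRefSection.
class TinyXRefSection
  def initialize
    @entries = {}
  end

  # Stores an entry for the given object and generation number.
  def []=(oid, gen, data)
    @entries[[oid, gen]] = data
  end

  # Returns the entry for the given object and generation number, if any.
  def [](oid, gen)
    @entries[[oid, gen]]
  end

  # Yields object number, generation number and entry data for every entry.
  def each
    @entries.each { |(oid, gen), data| yield(oid, gen, data) }
  end

  # Same body as the method added in the diff.
  def merge!(xref_section)
    xref_section.each {|oid, gen, data| self[oid, gen] = data}
  end
end

base = TinyXRefSection.new
base[1, 0] = :old_entry
base[2, 0] = :untouched_entry

update = TinyXRefSection.new
update[1, 0] = :new_entry

base.merge!(update)
```

Entries from the argument win on conflicts, while entries that exist only in the receiver are left as they are.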
@@ -15,13 +15,19 @@ Using the hexapdf application the following tasks can be performed with PDF file
  .sp
  .PD 0
  .IP \(bu 4
- Extracting embedded files (see the \fBextract\fP command)
+ Extracting embedded files (see the \fBfiles\fP command)
+ .IP \(bu 4
+ Extracting images (see the \fBimages\fP command)
  .IP \(bu 4
  Showing general information of a PDF file (see the \fBinfo\fP command)
  .IP \(bu 4
  Inspecting the internal structure of a PDF file (see the \fBinspect\fP command)
  .IP \(bu 4
+ Merging multiple PDF files into one (see the \fBmerge\fP command)
+ .IP \(bu 4
  Modifying an existing PDF file (see the \fBmodify\fP command)
+ .IP \(bu 4
+ Optimizing the file size of a PDF file (see the \fBoptimize\fP command)
  .PD
  .P
  The application contains a built\-in \fBhelp\fP command that can be used to provide a quick reminder of a command\[u2019]s purpose and its options\.
@@ -33,93 +39,13 @@ Show the version of the hexapdf application and exit\.
  .P
  These options are available on every command (except if they are overridden):
  .TP
+ \fB\-\-[no\-]force\fP
+ Force overwriting existing files\. Default: \fIfalse\fP\&\.
+ .TP
  \fB\-h\fP, \fB\-\-help\fP
  Show the help for the application if no command was specified, or the command help otherwise\.
- .SH "COMMANDS"
- hexapdf uses a command\-style interface\. This means that it provides different functionalities depending on the used command, and each command can have its own options\.
- .P
- There is no need to write the full command name for hexapdf to understand it, the only requirement is that is must be unambiguous\. So using \fBe\fP for the \fBextract\fP command is sufficient\.
- .SS "extract"
- Synopsis: \fBextract\fP [\fBOPTIONS\fP] \fIFILE\fP
- .P
- This command extracts embedded files from the PDF \fIFILE\fP\&\. If the \fB\-\-indices\fP option is not specified, the names and indices of the embedded files are just listed\.
- .TP
- \fB\-i\fP \fIA,B,C,\.\.\.\fP, \fB\-\-indices\fP \fIA,B,C,\.\.\.\fP
- The indices of the embedded files that should be extract\. The value \fI0\fP can be used to extract all embedded files\.
- .TP
- \fB\-s\fP, \fB\-\-[no\-]search\fP
- Search the whole PDF file instead of the standard locations, that is files attached to the document as a whole or to an individual page\. Defaults to \fIfalse\fP\&\.
- .TP
- \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
- The password to decrypt the PDF \fIFILE\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
- .SS "help"
- Synopsis: \fBhelp\fP [\fICOMMAND\fP\.\.\.]
- .P
- This command prints the application help if no arguments are given\. If one or more command names are given as arguments, these arguments are interpreted as a list of commands with sub\-commands and the help for the innermost command is shown\.
- .SS "info"
- Synopsis: \fBinfo\fP [\fBOPTIONS\fP] \fIFILE\fP
- .P
- This command reads the \fIFILE\fP and shows general information about it, like author information, PDF version used, encryption information and so on\.
- .TP
- \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
- The password to decrypt the PDF \fIFILE\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
- .SS "inspect"
- Synopsis: \fBinspect\fP [\fBOPTIONS\fP] \fIFILE\fP
- .P
- This command is useful when one needs to inspect the internal object structure or a stream of a PDF file\.
- .P
- If no option is given, the main PDF object, the catalog, is shown\. Otherwise the various, mutually exclusive display options define what is shown\. If multiple such options are specified only the last one is respected\. Note that PDF objects are always shown in the native PDF syntax\.
- .TP
- \fB\-t\fP, \fB\-\-trailer\fP
- Show the trailer dictionary\.
- .TP
- \fB\-c\fP, \fB\-\-page\-count\fP
- Print the number of pages\.
- .TP
- \fB\-\-pages\fP [\fIPAGES\fP]
- Show the pages with their object and generation numbers and their associated content streams\. If a range is specified, only those pages are listed\. See the \fBPAGES SPECIFICATION\fP below for details on the allowed format of \fIPAGES\fP\&\.
- .TP
- \fB\-o\fP \fIOID\fP[,\fIGEN\fP], \fB\-\-object\fP \fIOID\fP[,\fIGEN\fP]
- Show the object with the given object and generation numbers\. The generation number defaults to 0 if not given\.
- .TP
- \fB\-s\fP \fIOID\fP[,\fIGEN\fP], \fB\-\-stream\fP \fIOID\fP[,\fIGEN\fP]
- Show the filtered stream data (add \fB\-\-raw\fP to get the raw stream data) of the object with the given object and generation numbers\. The generation number defaults to 0 if not given\.
- .TP
- \fB\-\-raw\fP
- Modifies \fB\-\-stream\fP to show the raw stream data instead of the filtered one\.
- .TP
- \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
- The password to decrypt the PDF \fIFILE\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
- .SS "modify"
- Synopsis: \fBmodify\fP [\fBOPTIONS\fP] { \fB\-\-f\fP \fIINPUT\fP | \fB\-\-empty\fP } \fIOUTPUT\fP
- .P
- This command modifies a PDF file\. It can be used to select pages that should appear in the output file and to add pages from other PDF files (i\.e\. merging PDF files)\. The output file can be encrypted/decrypted and optimized in various ways\.
- .P
- The first input file is the primary file which gets modified, so meta data like file information, outlines, etc\. are taken from it\. Alternatively, it is possible to start with an empty PDF file by using \fB\-\-empty\fP\&\. The order of the options specifying the input files is important as the pages are added in that order\. Note that the \fB\-\-password\fP and \fB\-\-pages\fP options always apply to the last preceeding input file\.
- .P
- An input file can be specified multiple times, using a different \fB\-\-pages\fP option each time\. The \fB\-\-password\fP option, if needed, only needs to be used the first time\.
- .P
- Input file related options:
- .TP
- \fB\-f\fP \fIFILE\fP, \fB\-\-file\fP \fIFILE\fP
- An input file\. At least one input file or \fB\-\-empty\fP needs be used\.
- .TP
- \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
- The password to decrypt the last input file specified with \fB\-\-file\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
- .TP
- \fB\-i\fP \fIPAGES\fP, \fB\-\-pages\fP \fIPAGES\fP
- The pages (optionally rotated) from the last input file specified with the \fB\-\-file\fP option that should be included in the \fIOUTPUT\fP\&\. See the \fBPAGES SPECIFICATION\fP below for details on the allowed format of \fIPAGES\fP\&\. Default: \fI1\-e\fP (i\.e\. all pages with no additional rotation applied)\.
- .TP
- \fB\-e\fP, \fB\-\-empty\fP
- Use an empty file as primary file\. This will lead to an output file that just contains the included pages of the input file and no other data from the input files\.
- .TP
- \fB\-\-interleave\fP
- Interleave the pages from the input files: Takes the first specified page from the first input file, then the first specified page from the second input file, and so on\. After that the same with the second, third, \.\.\. specified pages\. If fewer pages were specified for an input file, the input file is just skipped for the rest of the rounds\.
- .P
- Output file related options:
- .TP
- \fB\-\-embed\fP \fIFILE\fP
- Embed the given file into the \fIOUTPUT\fP using built\-in features of PDF\. This option can be used multiple times to embed more than one file\.
+ .SS "Optimization Options"
+ Theses options can only be used with the \fBmerge\fP, \fBmodify\fP and \fBoptimize\fP commands and control optimization aspects when writing an output PDF file\. Note that the defaults maybe different depending on the command\.
  .TP
  \fB\-\-[no\-]compact\fP
  Delete unnecessary PDF objects\. This includes merging the base revision and all incremental updates into a single revision\. Default: \fIyes\fP\&\.
@@ -135,8 +61,10 @@ Defines how streams should be treated: \fIcompress\fP will compress them when po
  .TP
  \fB\-\-[no\-]compress\-pages\fP
  Recompress page content streams\. This is a very expensive operation in terms of processing time and won\[u2019]t lead to great file size improvements in many cases\. Default: \fIno\fP\&\.
+ .SS "Encryption Options"
+ These options can only be used with the \fBmerge\fP and \fBmodify\fP commands and control if and how an output PDF file should be encrypted\. All options except \fB\-\-decrypt\fP automatically enable \fB\-\-encrypt\fP\&\.
  .P
- Output file encryption related options (all options except \fB\-\-decrypt\fP automatically enable \fB\-\-encrypt\fP):
+ Note that if a password is needed to open the input file and if encryption parameters are changed, the provided password is not automatically used for the output file!
  .TP
  \fB\-\-decrypt\fP
  Remove any encryption\.
@@ -153,7 +81,7 @@ If neither \fB\-\-decrypt\fP nor \fB\-\-encrypt\fP are specified, the existing e
153
81
  .RE
154
82
  .TP
155
83
  \fB\-\-owner\-password\fP \fIPASSWORD\fP
156
- The owner password to be set on the \fIOUTPUT\fP\&\. This password is needed when operations not allowed by the permissions need to be done\. It can also be used when opening the PDF file\.
84
+ The owner password to be set on the output file\. This password is needed when operations not allowed by the permissions need to be done\. It can also be used when opening the PDF file\.
157
85
  .RS
158
86
  .P
159
87
  If an owner password is set but no user password, the output file can be opened without a password but the operations are restricted as if a user password were set\.
@@ -162,14 +90,14 @@ Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
162
90
  .RE
163
91
  .TP
164
92
  \fB\-\-user\-password\fP \fIPASSWORD\fP
165
- The user password to be set on the \fIOUTPUT\fP\&\. This password is needed when opening the PDF file\. The application should restrict the operations to those allowed by the permissions\.
93
+ The user password to be set on the output file\. This password is needed when opening the PDF file\. The application should restrict the operations to those allowed by the permissions\.
166
94
  .RS
167
95
  .P
168
96
  Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
169
97
  .RE
170
98
  .TP
171
99
  \fB\-\-algorithm\fP \fIALGORITHM\fP
172
- The encryption algorithm to use on the \fIOUTPUT\fP\&\. Allowed algorithms are \fIaes\fP and \fIarc4\fP but \fIarc4\fP should only be used if it is absolutely necessary for compatibility reasons\. Default: \fIaes\fP\&\.
100
+ The encryption algorithm to use on the output file\. Allowed algorithms are \fIaes\fP and \fIarc4\fP but \fIarc4\fP should only be used if it is absolutely necessary for compatibility reasons\. Default: \fIaes\fP\&\.
173
101
  .TP
174
102
  \fB\-\-key\-length\fP \fIBITS\fP
175
103
  The length of the encryption key in bits\. The allowed values differ based on the chosen algorithm: A number divisible by eight between 40 and 128 for \fIarc4\fP and 128 or 256 for \fIaes\fP\&\. Default: \fB128\fP\&\.
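The key length rule above can be sketched as a small check. This is an illustrative Python sketch, not hexapdf's own (Ruby) implementation; the function name `valid_key_length` is hypothetical, and treating the 40 to 128 range as inclusive is an assumption.

```python
def valid_key_length(algorithm: str, bits: int) -> bool:
    """Check a key length against the rule described for --key-length.

    Hypothetical helper for illustration only; the range bounds for arc4
    are assumed to be inclusive.
    """
    if algorithm == "aes":
        # Only two key sizes are allowed for AES.
        return bits in (128, 256)
    if algorithm == "arc4":
        # Any multiple of eight from 40 up to 128 bits.
        return 40 <= bits <= 128 and bits % 8 == 0
    raise ValueError(f"unknown algorithm: {algorithm!r}")
```

For instance, `valid_key_length("arc4", 48)` is accepted while `valid_key_length("aes", 192)` is rejected.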
@@ -182,7 +110,7 @@ Note: Using 256bit AES encryption can lead to problems viewing the PDF in many a
182
110
  Force the use of PDF encryption version 4 if key length is \fI128\fP and algorithm is \fIarc4\fP\&\. This option is probably only useful for testing the implementation of PDF libraries\[u2019] encryption handling\.
183
111
  .TP
184
112
  \fB\-\-permissions\fP \fIPERMS\fP
185
- A comma separated list of permissions to be set on the \fIOUTPUT\fP:
113
+ A comma separated list of permissions to be set on the output file:
186
114
  .RS
187
115
  .TP
188
116
  \fIprint\fP
@@ -209,6 +137,159 @@ allow page modifications and bookmark creation
209
137
  \fIhigh_quality_print\fP
210
138
  allow high quality printing
211
139
  .RE
140
+ .SH "COMMANDS"
141
+ hexapdf uses a command\-style interface\. This means that it provides different functionalities depending on the used command, and each command can have its own options\.
142
+ .P
143
+ There is no need to write the full command name for hexapdf to understand it; the only requirement is that it must be unambiguous\. So using \fBf\fP for the \fBfiles\fP command is sufficient\. The same is true for long option names and option values\.
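The abbreviation rule can be illustrated with a minimal prefix-matching sketch. It is written in Python purely for illustration and is not how hexapdf itself resolves names; `resolve_command` is a hypothetical helper, and the command list is taken from the COMMANDS section of this man page.

```python
# Command names as listed in the COMMANDS section of this man page.
COMMANDS = ["files", "help", "images", "info", "inspect",
            "merge", "modify", "optimize", "version"]


def resolve_command(abbrev: str, commands=COMMANDS) -> str:
    """Return the one command that `abbrev` unambiguously refers to.

    Hypothetical helper: an exact name always wins, otherwise the
    abbreviation must be a prefix of exactly one command name.
    """
    if abbrev in commands:
        return abbrev
    matches = [name for name in commands if name.startswith(abbrev)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {abbrev!r}")
    raise ValueError(f"ambiguous command {abbrev!r}: {', '.join(matches)}")
```

Under this sketch `f` resolves to `files`, whereas `m` is rejected because both `merge` and `modify` start with it.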
144
+ .SS "files"
145
+ Synopsis: \fBfiles\fP [\fBOPTIONS\fP] \fIPDF\fP
146
+ .P
147
+ This command extracts embedded files from the \fIPDF\fP\&\. If the \fB\-\-extract\fP option is not specified, the indices and names of the embedded files are just listed\.
148
+ .TP
149
+ \fB\-e\fP [\fIA,B,C,\.\.\.\fP], \fB\-\-extract\fP [\fIA,B,C,\.\.\.\fP]
150
+ The indices of the embedded files that should be extracted\. The value \fI0\fP can be used to extract all embedded files\.
151
+ .TP
152
+ \fB\-s\fP, \fB\-\-[no\-]search\fP
153
+ Search the whole PDF file instead of the standard locations, that is, files attached to the document as a whole or to an individual page\. Defaults to \fIfalse\fP\&\.
154
+ .TP
155
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
156
+ The password to decrypt the \fIPDF\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
157
+ .SS "help"
158
+ Synopsis: \fBhelp\fP [\fICOMMAND\fP\.\.\.]
159
+ .P
160
+ This command prints the application help if no arguments are given\. If one or more command names are given as arguments, these arguments are interpreted as a list of commands with sub\-commands and the help for the innermost command is shown\.
161
+ .SS "images"
162
+ Synopsis: \fBimages\fP [\fBOPTIONS\fP] \fIPDF\fP
163
+ .P
164
+ This command extracts images from the \fIPDF\fP\&\. If the \fB\-\-extract\fP option is not specified, the images are listed with their indices and additional information, sorted by page number\. The \fB\-\-extract\fP option can then be used to extract one or more images, saving them to files called \fBPREFIX\-N\.EXT\fP where the prefix can be set via \fB\-\-prefix\fP, \fIN\fP is the image index and \fIEXT\fP is either png, jpg or jpx\.
165
+ .TP
166
+ \fB\-e\fP [\fIA,B,C,\.\.\.\fP], \fB\-\-extract\fP [\fIA,B,C,\.\.\.\fP]
167
+ The indices of the images that should be extracted\. Use \fI0\fP or no value to extract all images\.
168
+ .TP
169
+ \fB\-\-prefix\fP \fIPREFIX\fP
170
+ The prefix to use when saving images\. May include directories\. Defaults to \fIimage\fP\&\.
171
+ .TP
172
+ \fB\-s\fP, \fB\-\-[no\-]search\fP
173
+ Search the whole PDF file instead of the standard locations, that is, images referenced by pages\. Defaults to \fIfalse\fP\&\.
174
+ .TP
175
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
176
+ The password to decrypt the \fIPDF\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
177
+ .P
178
+ The following information is shown for each image when listing images:
179
+ .RS
180
+ .TP
181
+ \fBindex\fP
182
+ The image index needed when this image should be extracted\.
183
+ .TP
184
+ \fBpage\fP
185
+ The page number on which this image appears\.
186
+ .TP
187
+ \fBoid\fP
188
+ The PDF internal object identifier consisting of the object and generation numbers\.
189
+ .TP
190
+ \fBwidth\fP
191
+ The width of the image in pixels\.
192
+ .TP
193
+ \fBheight\fP
194
+ The height of the image in pixels\.
195
+ .TP
196
+ \fBcolor\fP
197
+ The color space used for the image\. Either gray, rgb, cmyk or other\.
198
+ .TP
199
+ \fBcomp\fP
200
+ The number of color components\.
201
+ .TP
202
+ \fBbpc\fP
203
+ The number of bits per color component\.
204
+ .TP
205
+ \fBtype\fP
206
+ The image type\. Either jpg (JPEG), jp2 (JPEG2000), ccitt (CCITT Group 3 or 4 Fax), jbig2 (JBIG2) or png (PNG)\.
207
+ .TP
208
+ \fBwritable\fP
209
+ Either true or false depending on whether hexapdf supports the image format\.
210
+ .RE
211
+ .SS "info"
212
+ Synopsis: \fBinfo\fP [\fBOPTIONS\fP] \fIFILE\fP
213
+ .P
214
+ This command reads the \fIFILE\fP and shows general information about it, like author information, PDF version used, encryption information and so on\.
215
+ .TP
216
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
217
+ The password to decrypt the PDF \fIFILE\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
218
+ .SS "inspect"
219
+ Synopsis: \fBinspect\fP [\fBOPTIONS\fP] \fIFILE\fP
220
+ .P
221
+ This command is useful when one needs to inspect the internal object structure or a stream of a PDF file\.
222
+ .P
223
+ If no option is given, the PDF trailer is shown\. Otherwise the various, mutually exclusive display options define what is shown\. If multiple such options are specified only the last one is respected\. Note that PDF objects are always shown in the native PDF syntax\.
224
+ .TP
225
+ \fB\-\-catalog\fP
226
+ Show the PDF catalog dictionary\.
227
+ .TP
228
+ \fB\-c\fP, \fB\-\-page\-count\fP
229
+ Print the number of pages\.
230
+ .TP
231
+ \fB\-\-pages\fP [\fIPAGES\fP]
232
+ Show the pages with their object and generation numbers and their associated content streams\. If a range is specified, only those pages are listed\. See the \fBPAGES SPECIFICATION\fP below for details on the allowed format of \fIPAGES\fP\&\.
233
+ .TP
234
+ \fB\-o\fP \fIOID\fP[,\fIGEN\fP], \fB\-\-object\fP \fIOID\fP[,\fIGEN\fP]
235
+ Show the object with the given object and generation numbers\. The generation number defaults to 0 if not given\.
236
+ .TP
237
+ \fB\-s\fP \fIOID\fP[,\fIGEN\fP], \fB\-\-stream\fP \fIOID\fP[,\fIGEN\fP]
238
+ Show the filtered stream data (add \fB\-\-raw\fP to get the raw stream data) of the object with the given object and generation numbers\. The generation number defaults to 0 if not given\.
239
+ .TP
240
+ \fB\-\-raw\fP
241
+ Modifies \fB\-\-stream\fP to show the raw stream data instead of the filtered one\.
242
+ .TP
243
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
244
+ The password to decrypt the PDF \fIFILE\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
245
+ .SS "merge"
246
+ Synopsis: \fBmerge\fP [\fBOPTIONS\fP] { \fIINPUT\fP | \fB\-\-empty\fP } [\fIINPUT\fP]\.\.\. \fIOUTPUT\fP
247
+ .P
248
+ This command merges pages from multiple PDFs into one output file which can optionally be encrypted/decrypted and optimized in various ways\.
249
+ .P
250
+ The first input file is the primary file from which meta data like file information, outlines, etc\. are taken\. Alternatively, it is possible to start with an empty PDF file by using \fB\-\-empty\fP\&\. The order of the input files is important as the pages are added in that order\. Note that the \fB\-\-password\fP and \fB\-\-pages\fP options always apply to the last preceding input file\.
251
+ .P
252
+ An input file can be specified multiple times, using a different \fB\-\-pages\fP option each time\. The \fB\-\-password\fP option, if needed, only needs to be used the first time\.
253
+ .TP
254
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
255
+ The password to decrypt the last input file\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
256
+ .TP
257
+ \fB\-i\fP \fIPAGES\fP, \fB\-\-pages\fP \fIPAGES\fP
258
+ The pages (optionally rotated) from the last input file that should be included in the \fIOUTPUT\fP\&\. See the \fBPAGES SPECIFICATION\fP below for details on the allowed format of \fIPAGES\fP\&\. Default: \fI1\-e\fP (i\.e\. all pages with no additional rotation applied)\.
259
+ .TP
260
+ \fB\-e\fP, \fB\-\-empty\fP
261
+ Use an empty file as primary file\. This will lead to an output file that just contains the included pages of the input file and no other data from the input files\.
262
+ .TP
263
+ \fB\-\-interleave\fP
264
+ Interleave the pages from the input files: Takes the first specified page from the first input file, then the first specified page from the second input file, and so on\. After that the same with the second, third, \.\.\. specified pages\. If fewer pages were specified for an input file, the input file is just skipped for the rest of the rounds\.
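The interleaving order described for \-\-interleave can be sketched as a round-robin over the per-file page lists. This is an illustrative Python sketch under the assumption that each input file has already been reduced to its list of selected pages; `interleave` is a hypothetical name and not part of hexapdf.

```python
def interleave(page_lists):
    """Round-robin over the selected pages of each input file.

    Round 0 takes the first page of every file, round 1 the second page,
    and so on. A file with fewer pages is skipped in later rounds.
    """
    result = []
    rounds = max((len(pages) for pages in page_lists), default=0)
    for round_index in range(rounds):
        for pages in page_lists:
            if round_index < len(pages):
                result.append(pages[round_index])
    return result


# Front sides scanned into one file, back sides into another.
odd = ["odd-1", "odd-2", "odd-3"]
even = ["even-1", "even-2"]
print(interleave([odd, even]))
# ['odd-1', 'even-1', 'odd-2', 'even-2', 'odd-3']
```

The second list runs out after two rounds, so the last round only contributes the remaining page of the first list.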
265
+ .P
266
+ Additionally, the \fBOptimization Options\fP and \fBEncryption Options\fP can be used\.
267
+ .SS "modify"
268
+ Synopsis: \fBmodify\fP [\fBOPTIONS\fP] \fIINPUT\fP \fIOUTPUT\fP
269
+ .P
270
+ This command modifies a PDF file\. It can be used to select pages that should appear in the output file and/or rotate them\. The output file can also be encrypted/decrypted and optimized in various ways\.
271
+ .TP
272
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
273
+ The password to decrypt the \fIINPUT\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
274
+ .TP
275
+ \fB\-i\fP \fIPAGES\fP, \fB\-\-pages\fP \fIPAGES\fP
276
+ The pages (optionally rotated) from the \fIINPUT\fP that should be included in the \fIOUTPUT\fP\&\. See the \fBPAGES SPECIFICATION\fP below for details on the allowed format of \fIPAGES\fP\&\. Default: \fI1\-e\fP (i\.e\. all pages with no additional rotation applied)\.
277
+ .TP
278
+ \fB\-e\fP \fIFILE\fP, \fB\-\-embed\fP \fIFILE\fP
279
+ Embed the given file into the \fIOUTPUT\fP using built\-in features of PDF\. This option can be used multiple times to embed more than one file\.
280
+ .P
281
+ Additionally, the \fBOptimization Options\fP and \fBEncryption Options\fP can be used\.
282
+ .SS "optimize"
283
+ Synopsis: \fBoptimize\fP [\fBOPTIONS\fP] \fIINPUT\fP \fIOUTPUT\fP
284
+ .P
285
+ This command uses several optimization strategies to reduce the file size of the PDF file\.
286
+ .P
287
+ By default, all strategies except page compression are used since page compression may take a very long time without much benefit\.
288
+ .TP
289
+ \fB\-p\fP \fIPASSWORD\fP, \fB\-\-password\fP \fIPASSWORD\fP
290
+ The password to decrypt the \fIINPUT\fP\&\. Use \fB\-\fP for \fIPASSWORD\fP for reading it from standard input\.
291
+ .P
292
+ The \fBOptimization Options\fP can be used with this command\. Note that the defaults are changed to provide good compression out of the box\.
212
293
  .SS "version"
213
294
  This command shows the version of the hexapdf application\. It is an alternative to using the global \fB\-\-version\fP option\.
214
295
  .SH "PAGES SPECIFICATION"
@@ -252,43 +333,41 @@ Examples:
252
333
  \fB10\-1/3\fP: The pages 10, 7, 4 and 1\.
253
334
  .IP \(bu 4
254
335
  \fB1l,2r,3\-5d,6n\fP: The pages 1 (rotated left), 2 (rotated right), 3 to 5 (all rotated 180 degrees) and 6 (any possibly set rotation removed)\.
255
- .SH "Examples"
256
- .SS "modify"
257
- \fBhexapdf modify \-\-file input\.pdf \-\-object\-stream generate output\.pdf\fP
258
- .br
259
- \fBhexapdf m \-f input\.pdf \-\-obj g output\.pdf\fP
260
- .P
261
- Compressing: Both commands do exactly the same, compressing the \fBinput\.pdf\fP to get a smaller file size\. Howver, the second command uses the feature of abbreviating commands and options to their shortest unambiguous name so that less key presses are required\.
262
- .P
263
- \fBhexapdf modify \-f input1\.pdf \-f input2\.pdf \-f input3\.pdf output\.pdf\fP
336
+ .SH "EXAMPLES"
337
+ .SS "merge"
338
+ \fBhexapdf merge input1\.pdf input2\.pdf input3\.pdf output\.pdf\fP
264
339
  .br
265
- \fBhexapdf modify \-e \-f input1\.pdf \-f input2\.pdf \-f input3\.pdf output\.pdf\fP
340
+ \fBhexapdf merge \-e input1\.pdf input2\.pdf input3\.pdf output\.pdf\fP
266
341
  .P
267
342
  Merging: In the first case use \fBinput1\.pdf\fP as primary input file and merge the pages from \fBinput2\.pdf\fP and \fBinput3\.pdf\fP into it\. In the second case an empty PDF file is used for merging the pages from the three given input files into it; the resulting output file will not have any meta data or other additional data from the first input file\.
268
343
  .P
269
- \fBhexapdf modify \-f odd\.pdf \-f even\.pdf \-\-interleave combined\.pdf\fP
270
- .P
271
- Page interleaving: Takes alternatly a page from \fBodd\.pdf\fP and \fBeven\.pdf\fP to create the output file\. This is very useful if you only have a simplex scanner: First you scan the front sides, creating \fBodd\.pdf\fP, and then you scan the back sides, creating \fBeven\.pdf\fP\&\. With the command the pages can be ordered in the correct way\.
344
+ \fBhexapdf merge odd\.pdf even\.pdf \-\-interleave combined\.pdf\fP
272
345
  .P
273
- \fBhexapdf modify \-f input\.pdf \-i 1\-5,7\-10,12\-e output\.pdf\fP
346
+ Page interleaving: Alternately takes a page from \fBodd\.pdf\fP and \fBeven\.pdf\fP to create the output file\. This is very useful if you only have a simplex scanner: First you scan the front sides, creating \fBodd\.pdf\fP, and then you scan the back sides, creating \fBeven\.pdf\fP\&\. With this command the pages can be ordered in the correct way\.
347
+ .SS "modify"
348
+ \fBhexapdf modify input\.pdf \-i 1\-5,7\-10,12\-e output\.pdf\fP
274
349
  .P
275
- Page removal: Remove the pages 6 and 12 from the \fBinput\.pdf\fP\&\.
350
+ Page removal: Remove the pages 6 and 11 from the \fBinput\.pdf\fP\&\.
276
351
  .P
277
- \fBhexapdf modify \-f input\.pdf \-i 1r,2\-ed output\.pdf\fP
352
+ \fBhexapdf modify input\.pdf \-i 1r,2\-ed output\.pdf\fP
278
353
  .P
279
354
  Page rotation: Rotate the first page to the right, that is 90 degrees clockwise, and all other pages 180 degrees\.
280
355
  .P
281
- \fBhexapdf modify \-f input\.pdf \-\-user\-password my_pwd \-\-permissions print output\.pdf\fP
356
+ \fBhexapdf modify input\.pdf \-\-user\-password my_pwd \-\-permissions print output\.pdf\fP
357
+ .P
358
+ Encryption: Create the \fBoutput\.pdf\fP from the \fBinput\.pdf\fP so that a password is needed to open it, and only allow printing\.
282
359
  .P
283
- Encryption: Encrypt the \fBoutput\.pdf\fP so that a password is needed to open it, and only allow printing\.
360
+ \fBhexapdf modify input\.pdf \-p input_password \-\-decrypt output\.pdf\fP
284
361
  .P
285
- \fBhexapdf modify \-f input\.pdf \-p input_password \-\-decrypt output\.pdf\fP
362
+ Encryption removal: Create the \fBoutput\.pdf\fP as a copy of \fBinput\.pdf\fP but with the encryption removed\. If \fB\-\-decrypt\fP were not used, the output file would retain the encryption specification of the input file\.
363
+ .SS "optimize"
364
+ \fBhexapdf optimize input\.pdf output\.pdf\fP
286
365
  .P
287
- Encryption removal: Create the \fBoutput\.pdf\fP as copy of \fBinput\.pdf\fP but with the encryption removed\. If the \fB\-\-decrypt\fP was not used, the output file would retain the encryption specification of the primary input file\.
288
- .SS "extract"
289
- \fBhexapdf extract input\.pdf\fP
366
+ Optimization: Compress the \fBinput\.pdf\fP to get a smaller file size\.
367
+ .SS "files"
368
+ \fBhexapdf files input\.pdf\fP
290
369
  .br
291
- \fBhexapdf extract input\.pdf \-i 1\fP
370
+ \fBhexapdf files input\.pdf \-e 1\fP
292
371
  .P
293
372
  Embedded files: The first command lists the embedded files in the \fBinput\.pdf\fP, the second one then extracts the embedded file with the index 1\.
294
373
  .SS "info"
@@ -300,7 +379,7 @@ File information: Show general information about the PDF file, like PDF version,
300
379
  .br
301
380
  \fBhexapdf inspect input\.pdf \-o 3\fP
302
381
  .P
303
- Inspect a PDF: These commands can be used to inspect the internal object structure of a PDF file\. The first command shows the PDF catalog object, the main object of a PDF file\. The second one shows the object with the object number 3\.
382
+ Inspect a PDF: These commands can be used to inspect the internal object structure of a PDF file\. The first command shows the PDF trailer object\. The second one shows the object with the object number 3\.
304
383
  .SH "EXIT STATUS"
305
384
  The exit status is 0 if no error happened\. Otherwise it is 1\.
306
385
  .SH "SEE ALSO"