RubyGems - hexapdf - Versions diffs - 1.4.1 → 1.6.0 - Mend

hexapdf 1.4.1 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30)

checksums.yaml +4 -4
data/CHANGELOG.md +48 -0
data/README.md +8 -7
data/examples/022-outline.rb +5 -1
data/lib/hexapdf/cli/debug_info.rb +98 -0
data/lib/hexapdf/cli/images.rb +13 -2
data/lib/hexapdf/cli/inspect.rb +5 -1
data/lib/hexapdf/cli.rb +2 -0
data/lib/hexapdf/dictionary.rb +7 -1
data/lib/hexapdf/digital_signature/cms_handler.rb +5 -1
data/lib/hexapdf/digital_signature/signing/timestamp_handler.rb +24 -4
data/lib/hexapdf/encryption/security_handler.rb +3 -1
data/lib/hexapdf/font/cmap.rb +10 -6
data/lib/hexapdf/parser.rb +29 -4
data/lib/hexapdf/revision.rb +6 -2
data/lib/hexapdf/type/acro_form/field.rb +4 -1
data/lib/hexapdf/type/annotation.rb +1 -1
data/lib/hexapdf/version.rb +1 -1
data/test/hexapdf/digital_signature/common.rb +6 -1
data/test/hexapdf/digital_signature/signing/test_timestamp_handler.rb +12 -0
data/test/hexapdf/digital_signature/test_cms_handler.rb +6 -0
data/test/hexapdf/encryption/test_security_handler.rb +7 -5
data/test/hexapdf/test_dictionary.rb +15 -0
data/test/hexapdf/test_document.rb +2 -2
data/test/hexapdf/test_parser.rb +55 -3
data/test/hexapdf/test_revision.rb +27 -6
data/test/hexapdf/type/acro_form/test_field.rb +5 -0
data/test/hexapdf/type/test_annotation.rb +3 -0
data/test/test_helper.rb +6 -0
metadata +20 -5

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: b88ce85ee9bc603011b9f5a278829d588da10f53614c0b84a57e2d7fa38f52dc
-  data.tar.gz: ec2c8739ed69038e1297435550371bf329e516e9ef970fa0456502b15720d07b
+  metadata.gz: 35bbb5d1780d07ecf6098cc40359ff2cc02cd89231a124b6ff1a0a13c760d116
+  data.tar.gz: 8664f2ac8a6651ee83e7292d005ea10d89b7ea738de47cc62dbf219f4eae0cb4
 SHA512:
-  metadata.gz: 103edc366ef9f48ddd6579f7137b3ab23b4266dc2df0a77ee5b89cb4256419a00727776158a7c7570a0b10075e4f062506d524ec71ee801c00fdd9e4726c8232
-  data.tar.gz: f1f4a1af54445b2e7c3c9fc1adfa81fb9fad84f32461f58378fc550bd5aec16e16b276dadc7516ca5b0b6212394886d06f2cc2a3506dc2b2745b5a1f4c8136d1
+  metadata.gz: 232aefc90eb4f9f9a913d27affa95a0c9eff43a72e04eeb1adc0fbe11e865033c6fd0b7779930b15a982afdd909d6ffa98640db6db668f95ce0c26332749cfae
+  data.tar.gz: e1b836a23d58e92ceb70f5b892d023edcf585288583f2254d35394688204bfdbf4401edea6562a96d1583a71a302d8d50e8a175262ff5077a3b4a2200ec922a4

data/CHANGELOG.md CHANGED

@@ -1,3 +1,51 @@
+## 1.6.0 - 2026-02-10
+### Added
+* CLI command `hexapdf debug-info` for creating debugging information,
+  especially for malformed files
+### Changed
+* Optimized decoding character codes with a CMap to drastically lower memory
+  usage
+* CLI command `hexapdf inspect rev` to show whether the cross-reference table
+  was reconstructed
+### Fixed
+* Path generation for image extraction in CLI command `hexapdf images`
+* Handling of certain invalid PDFs where the generation number for object
+  identifiers don't match their cross-reference section value
+* AES 256bit encryption to include unnecessary field /Length in encryption
+  dictionary to work around buggy PDF libraries
+* Parsing of invalid /Filter and /DecodeParms stream keys in case they resolve
+  to a recursive structure
+* [HexaPDF::Type::AcroForm::Field#each_widget] to only yield widget objects
+## 1.5.0 - 2025-12-08
+### Added
+* Support for basic authentication to
+  [HexaPDF::DigitalSignature::Signing::TimestampHandler]
+### Changed
+* Dictionary validation to delete field entries that have an invalid type
+* CLI command `hexapdf images` to create directories specified in the `--prefix`
+* CLI command `hexapdf images` to omit the dash in the file names if `--prefix`
+  points to a directory
+## Fixed
+* [HexaPDF::Type::Annotation#appearance] to work in case /AP contains a value of
+  an invalid type
+* [HexaPDF::DigitalSignature::CMSHandler] to throw an appropriate error when
+  encountering invalid signature contents
 ## 1.4.1 - 2025-09-23
 ### Added

data/README.md CHANGED

@@ -13,7 +13,7 @@ In short, it allows
 * **securing** PDF files by encrypting or signing them and
 * **optimizing** PDF files for smaller file size or other criteria.
-HexaPDF is available under two license, the AGPL and a commercial license, see the [License
+HexaPDF is available under two licenses, the AGPL and a commercial license, see the [License
 section](#License) for details.
@@ -93,12 +93,13 @@ with example graphics and PDF files and tightly integrated into the rest of the
 ## Requirements and Installation
-Since HexaPDF is written in Ruby, a working Ruby installation is needed - see the
-[official installation documentation][rbinstall] for details. Note that you need Ruby version 2.6 or
-higher as prior versions are not supported!
+Since HexaPDF is written in Ruby, a working Ruby installation is needed - see the [official
+installation documentation][rbinstall] for details. Note that you need Ruby version 3.0 or higher as
+prior versions are not supported!
-HexaPDF works on all Ruby implementations that are CRuby compatible, e.g. TruffleRuby, and on any
-platform supported by Ruby (Linux, macOS, Windows, ...).
+HexaPDF works on all Ruby implementations that are CRuby compatible and on any platform supported by
+Ruby (Linux, macOS, Windows, ...). Implementations like JRuby and TruffleRuby should work but
+HexaPDF is not actively tested against them.
 Apart from Ruby itself the HexaPDF library has only one external dependency `geom2d` which is
 written and provided by the HexaPDF authors. The `hexapdf` application has an additional dependency
@@ -117,7 +118,7 @@ Prawn is a **library for generating content**.
 To be more specific, it is easily possible to read an existing PDF with HexaPDF and modify parts of
 it before writing it out again. The modifications can be to the PDF object structure like removing
-superfluous annotations or the the content itself.
+superfluous annotations or the content itself.
 Prawn has no such functionality. There is basic support for using a PDF as a template using the
 `pdf-reader` and `prawn-template` gems but support is very limited. However, Prawn has a very

data/examples/022-outline.rb CHANGED

@@ -10,7 +10,11 @@
 require 'hexapdf'
 doc = HexaPDF::Document.new
-6.times { doc.pages.add }
+6.times do |i|
+  doc.pages.add.canvas.
+    font("Helvetica", size: 150).
+    text("Page #{i + 1}", at: [10, 660])
+end
 doc.outline.add_item("Main") do |main|
   main.add_item("Page 1", destination: 0)

data/lib/hexapdf/cli/debug_info.rb ADDED

@@ -0,0 +1,98 @@
+# -*- encoding: utf-8; frozen_string_literal: true -*-
+#
+#--
+# This file is part of HexaPDF.
+#
+# HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
+# Copyright (C) 2014-2025 Thomas Leitner
+#
+# HexaPDF is free software: you can redistribute it and/or modify it
+# under the terms of the GNU Affero General Public License version 3 as
+# published by the Free Software Foundation with the addition of the
+# following permission added to Section 15 as permitted in Section 7(a):
+# FOR ANY PART OF THE COVERED WORK IN WHICH THE COPYRIGHT IS OWNED BY
+# THOMAS LEITNER, THOMAS LEITNER DISCLAIMS THE WARRANTY OF NON
+# INFRINGEMENT OF THIRD PARTY RIGHTS.
+#
+# HexaPDF is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public
+# License for more details.
+#
+# You should have received a copy of the GNU Affero General Public License
+# along with HexaPDF. If not, see <http://www.gnu.org/licenses/>.
+#
+# The interactive user interfaces in modified source and object code
+# versions of HexaPDF must display Appropriate Legal Notices, as required
+# under Section 5 of the GNU Affero General Public License version 3.
+#
+# In accordance with Section 7(b) of the GNU Affero General Public
+# License, a covered work must retain the producer line in every PDF that
+# is created or manipulated using HexaPDF.
+#
+# If the GNU Affero General Public License doesn't fit your need,
+# commercial licenses are available at <https://gettalong.at/hexapdf/>.
+#++
+require 'hexapdf/cli/command'
+module HexaPDF
+  module CLI
+    # Creates debugging information for adding to an issue.
+    class DebugInfo < Command
+      def initialize #:nodoc:
+        super('debug-info', takes_commands: false)
+        short_desc("Create debug information for a PDF file")
+        long_desc(<<~EOF)
+          Creates debug information for a possibly malformed PDF file that can be attached to an
+          issue.
+          Two files are created: anonymized-FILE where all strings are replaced with zeroes and
+          debug_info.txt with additional debug information.
+        EOF
+        options.on("--password PASSWORD", "-p", String,
+                   "The password for decryption. Use - for reading from standard input.") do |pwd|
+          @password = (pwd == '-' ? read_password : pwd)
+        end
+        @password = nil
+      end
+      def execute(file) #:nodoc:
+        output_name = "anonymized-#{file}"
+        puts "Creating anonymized file '#{output_name}'"
+        data = File.binread(file)
+        data.gsub!(/(>>\s*stream\s*)(.*?)(\s*endstream)/m) {|m| "#{$1}#{'0' * $2.length}#{$3}" }
+        data.gsub!(/([^<]<)([0-9A-Fa-f#{Tokenizer::WHITESPACE}]*?)>/m) {|m| "#{$1}#{'0' * $2.length}>" }
+        data.gsub!(/\((.*?)\)/m) {|m| "(#{'0' * $1.length})" }
+        File.binwrite(output_name, data)
+        debug_info = +''
+        puts "Collecting debug information in debug_info.txt"
+        begin
+          output = capture_output { HexaPDF::CLI::Application.new.parse(['info', '--check', file]) }
+          debug_info << "Output:\n"<< output
+        rescue
+          debug_info << "Error collecting info: #{$!.message}\n"
+        end
+        File.write('debug_info.txt', debug_info)
+      end
+      private
+      def capture_output
+        stdout, stderr = $stdout, $stderr
+        $stdout = $stderr = StringIO.new
+        yield
+        $stdout.string
+      ensure
+        $stdout, $stderr = stdout, stderr
+      end
+    end
+  end
+end
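
The anonymization step in `execute` above can be tried in isolation. The following is a minimal sketch, not HexaPDF code: the helper name `anonymize_pdf_data` is hypothetical, and it reuses only the stream and literal-string patterns from the diff (the hex-string pattern is left out because it depends on `Tokenizer::WHITESPACE`).

```ruby
# Hypothetical helper illustrating the length-preserving replacement
# used by the new debug-info command. Not part of HexaPDF.
def anonymize_pdf_data(data)
  # Replace stream bodies with zeroes, keeping the surrounding keywords.
  result = data.gsub(/(>>\s*stream\s*)(.*?)(\s*endstream)/m) do
    "#{$1}#{'0' * $2.length}#{$3}"
  end
  # Replace the contents of literal strings "(...)" with zeroes.
  # The non-greedy match stops at the first closing parenthesis, so
  # nested or escaped parentheses are not handled specially.
  result.gsub(/\((.*?)\)/m) { "(#{'0' * $1.length})" }
end

sample = "<< /Title (Secret) /Length 3 >>\nstream\nabc\nendstream"
puts anonymize_pdf_data(sample)
```

Because every replaced run keeps its original length, byte offsets elsewhere in the data stay where they were, which is presumably why the command pads with zeroes instead of deleting content.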

data/lib/hexapdf/cli/images.rb CHANGED

@@ -35,6 +35,7 @@
 #++
 require 'set'
+require 'fileutils'
 require 'hexapdf/cli/command'
 module HexaPDF
@@ -145,14 +146,23 @@ module HexaPDF
       # Extracts the images with the given indices.
       def extract_images(doc)
+        FileUtils.mkdir_p(File.dirname("#{@prefix}filename"))
+        prefix = File.directory?(@prefix) ? @prefix : "#{@prefix}-"
         done = Set.new
+        count = total = 0
         each_image(doc) do |image, index, _|
           next unless (@indices.include?(index) || @indices.include?(0)) && !done.include?(index)
+          total += 1
           info = image.info
           if info.writable
-            path = "#{@prefix}-#{index}.#{image.info.extension}"
+            count += 1
+            path = "#{prefix}#{index}.#{image.info.extension}"
             maybe_raise_on_existing_file(path)
-            puts "Extracting #{path}..." if command_parser.verbosity_info?
+            if command_parser.verbosity_info?
+              puts "Extracting image #{index} (#{image.width}x#{image.height}, " \
+                   "#{info.color_space}, #{info.type}) to #{path}..."
+            end
             image.write(path)
             done << index
             if info.color_space == :cmyk && info.type == :jpeg
@@ -163,6 +173,7 @@ module HexaPDF
             $stderr.puts "Warning (image #{index}): PDF image format not supported for writing"
           end
         end
+        puts "Created #{count} image files (out of #{total} selected)" if command_parser.verbosity_info?
       end
       # Iterates over all images.
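
The new prefix handling in `extract_images` can be sketched as a standalone function. This is an illustration under assumptions: `image_path` is a hypothetical name, and only the two path-related lines from the hunk are reproduced.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper mirroring the path logic from the diff:
# create the directory part of the prefix, then omit the dash
# if the prefix itself names a directory.
def image_path(prefix, index, extension)
  # Appending a dummy file name makes File.dirname return the prefix's
  # directory part, both for "out/" and for "out/img".
  FileUtils.mkdir_p(File.dirname("#{prefix}filename"))
  effective = File.directory?(prefix) ? prefix : "#{prefix}-"
  "#{effective}#{index}.#{extension}"
end

Dir.mktmpdir do |dir|
  puts image_path("#{dir}/out/", 1, "png")     # ends in out/1.png
  puts image_path("#{dir}/pics/img", 2, "jpg") # ends in pics/img-2.jpg
end
```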

data/lib/hexapdf/cli/inspect.rb CHANGED

@@ -293,6 +293,10 @@ module HexaPDF
               IO.copy_stream(@doc.revisions.parser.io, $stdout, length, 0)
             else
               puts "Document has #{@doc.revisions.count} revision#{@doc.revisions.count == 1 ? '' : 's'}"
+              if @doc.revisions.parser.reconstructed? && @doc.revisions.count == 1 &&
+                 @doc.revisions.current == @doc.revisions.parser.reconstructed_revision
+                puts "Document cross-reference table has been reconstructed"
+              end
               revision_information do |rev, index, count, signature, end_offset|
                 type = if rev.trailer[:XRefStm]
                          "xref table + stream"
@@ -415,7 +419,7 @@ module HexaPDF
           sig = signatures[rev]
           if sig
             end_index = sig[:ByteRange][-2] + sig[:ByteRange][-1]
-          else
+          elsif rev != @doc.revisions.parser.reconstructed_revision
             io.seek(startxrefs[index], IO::SEEK_SET)
             buffer = ''.b
             while io.pos < startxrefs[index + 1]

data/lib/hexapdf/cli.rb CHANGED

@@ -49,6 +49,7 @@ require 'hexapdf/cli/image2pdf'
 require 'hexapdf/cli/form'
 require 'hexapdf/cli/fonts'
 require 'hexapdf/cli/usage'
+require 'hexapdf/cli/debug_info'
 require 'hexapdf/version'
 require 'hexapdf/document'
@@ -125,6 +126,7 @@ module HexaPDF
         add_command(HexaPDF::CLI::Form.new)
         add_command(HexaPDF::CLI::Fonts.new)
         add_command(HexaPDF::CLI::Usage.new)
+        add_command(HexaPDF::CLI::DebugInfo.new)
         add_command(CmdParse::HelpCommand.new)
         version_command = CmdParse::VersionCommand.new(add_switches: false)
         add_command(version_command)

data/lib/hexapdf/dictionary.rb CHANGED

@@ -301,7 +301,13 @@ module HexaPDF
             yield(msg, true)
             self[name] = obj.intern
           else
-            yield(msg, false)
+            yield(msg, !field.required? || field.default?)
+            if field.required? && field.default?
+              self[name] = obj = field.default
+            else
+              delete(name)
+              next
+            end
           end
         end
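
The decision made in this hunk (use the default for a required field, otherwise delete the entry) can be modelled on a plain Hash. The sketch below is a simplified stand-in, not the HexaPDF API: `FieldSpec` and `correct_invalid_types` are hypothetical names, and the return value plays the role of the second argument passed to `yield` in the diff.

```ruby
# Hypothetical stand-in for a field definition.
FieldSpec = Struct.new(:type, :required, :default, keyword_init: true)

# Corrects entries whose value has the wrong type and returns whether
# every problem found was considered correctable.
def correct_invalid_types(dict, specs)
  all_correctable = true
  specs.each do |name, spec|
    next unless dict.key?(name)
    next if dict[name].is_a?(spec.type)
    has_default = !spec.default.nil?
    # Correctable if the field is optional or a default can be used.
    all_correctable &&= (!spec.required || has_default)
    if spec.required && has_default
      dict[name] = spec.default
    else
      dict.delete(name)
    end
  end
  all_correctable
end

specs = {
  RequiredDefault: FieldSpec.new(type: String, required: true, default: 'str'),
  Optional: FieldSpec.new(type: String, required: false, default: nil),
}
dict = {RequiredDefault: 20, Optional: 20}
p correct_invalid_types(dict, specs)
p dict
```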

data/lib/hexapdf/digital_signature/cms_handler.rb CHANGED

@@ -49,7 +49,11 @@ module HexaPDF
       # Creates a new signature handler for the given signature dictionary.
       def initialize(signature_dict)
         super
-        @pkcs7 = OpenSSL::PKCS7.new(signature_dict.contents)
+        begin
+          @pkcs7 = OpenSSL::PKCS7.new(signature_dict.contents)
+        rescue
+          raise HexaPDF::Error, "Signature contents is invalid"
+        end
       end
       # Returns the common name of the signer.

data/lib/hexapdf/digital_signature/signing/timestamp_handler.rb CHANGED

@@ -53,8 +53,8 @@ module HexaPDF
       # == Usage
       #
       # It is necessary to provide at least the URL of the timestamp authority server (TSA) via
-      # #tsa_url, everything else is optional and uses default values. The TSA server must not use
-      # authentication to be usable.
+      # #tsa_url, everything else is optional and uses default values. The TSA server can optionally
+      # use HTTP basic authentication.
       #
       # Example:
       #
@@ -66,6 +66,18 @@ module HexaPDF
         # This value is required.
         attr_accessor :tsa_url
+        # The username for basic authentication to the TSA server.
+        #
+        # If the username is not set, no basic authentication is done.
+        #
+        # See: #tsa_password
+        attr_accessor :tsa_username
+        # The password for basic authentication to the TSA server.
+        #
+        # See: #tsa_username
+        attr_accessor :tsa_password
         # The hash algorithm to use for timestamping. Defaults to SHA512.
         attr_accessor :tsa_hash_algorithm
@@ -127,8 +139,14 @@ module HexaPDF
           req.message_imprint = digest.digest
           req.policy_id = tsa_policy_id if tsa_policy_id
-          http_response = Net::HTTP.post(URI(tsa_url), req.to_der,
-                                         'content-type' => 'application/timestamp-query')
+          url = URI(tsa_url)
+          http_request = Net::HTTP::Post.new(url, 'Content-Type' => 'application/timestamp-query')
+          http_request.body = req.to_der
+          http_request.basic_auth(tsa_username, tsa_password) if tsa_username
+          http_response = Net::HTTP.start(url.hostname, url.port, use_ssl: (url.scheme == 'https')) do |http|
+            http.request(http_request)
+          end
           if http_response.kind_of?(Net::HTTPOK)
             response = OpenSSL::Timestamp::Response.new(http_response.body)
             if response.status == 0
@@ -136,6 +154,8 @@ module HexaPDF
             else
               raise HexaPDF::Error, "Timestamp token could not be created: #{response.failure_info}"
             end
+          elsif http_response.kind_of?(Net::HTTPUnauthorized)
+            raise HexaPDF::Error, "Basic authentication to the server failed: #{http_response.body}"
           else
             raise HexaPDF::Error, "Invalid TSA server response: #{http_response.body}"
           end
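
The request construction introduced here uses only standard `net/http` calls, so it can be examined without contacting a server. A minimal sketch, assuming a hypothetical helper `build_tsa_request` and a placeholder URL; it builds the request the same way as the hunk but does not send it.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds the POST request for a timestamp query,
# adding HTTP basic authentication only when a username is given.
def build_tsa_request(tsa_url, der_body, username: nil, password: nil)
  url = URI(tsa_url)
  request = Net::HTTP::Post.new(url, 'Content-Type' => 'application/timestamp-query')
  request.body = der_body
  request.basic_auth(username, password) if username
  request
end

# Placeholder URL and body, for illustration only.
request = build_tsa_request('https://tsa.example.com/tsr', 'DER bytes here',
                            username: 'hexatest', password: 'hexapwd')
puts request['Content-Type']
puts request['Authorization']
```

In the library itself the credentials come from the `tsa_username` and `tsa_password` accessors added in this diff, and the request is sent inside `Net::HTTP.start` with `use_ssl` derived from the URL scheme.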

data/lib/hexapdf/encryption/security_handler.rb CHANGED

@@ -363,7 +363,9 @@ module HexaPDF
             raise(HexaPDF::UnsupportedEncryptionError,
                   "Invalid key length #{key_length} specified")
           end
-        dict[:Length] = key_length if dict[:V] == 4 || dict[:V] == 2
+        # /Length should only be set for V=2 as per the spec. However, software like Adobe Reader
+        # fails if this is not set for V=5 or V=4.
+        dict[:Length] = key_length if dict[:V] == 5 || dict[:V] == 4 || dict[:V] == 2
         if ![:aes, :arc4].include?(algorithm)
           raise(HexaPDF::UnsupportedEncryptionError,

data/lib/hexapdf/font/cmap.rb CHANGED

@@ -143,10 +143,13 @@ module HexaPDF
       # An error is raised if the string contains invalid bytes.
       def read_codes(string)
         codes = []
-        bytes = string.each_byte
+        bytes = string.bytes
+        length = bytes.length
+        i = 0
-        loop do
-          byte = bytes.next
+        while i < length
+          byte = bytes[i]
+          i += 1
           code = 0
           found = @codespace_ranges.any? do |first_byte_range, rest_ranges|
@@ -154,9 +157,10 @@ module HexaPDF
             code = (code << 8) + byte
             valid = rest_ranges.all? do |range|
-              begin
-                byte = bytes.next
-              rescue StopIteration
+              if i < length
+                byte = bytes[i]
+                i += 1
+              else
                 raise HexaPDF::Error, "Missing bytes while reading codes via CMap"
               end
               code = (code << 8) + byte
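
The change above replaces an external enumerator (`each_byte` plus `next`) with an array of bytes and an explicit index. The sketch below shows the index-based style on a simplified decoder. It is not the HexaPDF implementation: the function `read_codes_sketch`, the layout of `codespace_ranges` (pairs of a first-byte range and an array of ranges for the following bytes), and the use of `ArgumentError` are all assumptions made for illustration.

```ruby
# Simplified, hypothetical decoder using array indexing instead of
# Enumerator#next. Each codespace entry is [first_byte_range, rest_ranges].
def read_codes_sketch(string, codespace_ranges)
  bytes = string.bytes
  length = bytes.length
  codes = []
  i = 0
  while i < length
    code = nil
    codespace_ranges.each do |first_range, rest_ranges|
      next unless first_range.cover?(bytes[i])
      last = i + rest_ranges.length
      raise ArgumentError, "Missing bytes while reading codes" if last >= length
      next unless rest_ranges.each_with_index.all? { |range, k| range.cover?(bytes[i + 1 + k]) }
      # Combine all bytes of the code into one big-endian integer.
      code = bytes[i..last].inject(0) { |acc, b| (acc << 8) + b }
      i = last + 1
      break
    end
    raise ArgumentError, "Invalid byte sequence at position #{i}" unless code
    codes << code
  end
  codes
end

# One-byte codes for 0x00-0x80, two-byte codes starting with 0x81-0xFF.
ranges = [[0x00..0x80, []], [0x81..0xFF, [0x40..0xFF]]]
p read_codes_sketch("A\x81\x40".b, ranges)
```

The changelog credits this optimization with drastically lower memory usage. In CRuby, `Enumerator#next` is backed by a fiber, which may account for part of the difference, but the diff itself gives no measurements.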

data/lib/hexapdf/parser.rb CHANGED

@@ -112,8 +112,18 @@ module HexaPDF
         end
       if xref_entry.oid != 0 && (oid != xref_entry.oid || gen != xref_entry.gen)
-        raise_malformed("The oid,gen (#{oid},#{gen}) values of the indirect object don't match " \
-                        "the values (#{xref_entry.oid},#{xref_entry.gen}) from the xref")
+        msg = "The oid,gen (#{oid},#{gen}) values of the indirect object don't match " \
+              "the values (#{xref_entry.oid},#{xref_entry.gen}) from the xref"
+        # Some invalid PDFs contain entries where the generation number in the xref is different
+        # from the one found in the indirect object. If the file were reconstructed the generation
+        # number from the indirect object itself would be used.
+        # To gracefully handle such invalid PDFs they need to have a single revision.
+        # The other code part that handles this is in Revision#object.
+        if oid == xref_entry.oid && @document.revisions.count == 1
+          maybe_raise(msg, pos: xref_entry.pos)
+        else
+          raise_malformed(msg)
+        end
       end
       if obj.kind_of?(Reference)
@@ -209,9 +219,24 @@ module HexaPDF
         tok = @tokenizer.next_token
         object[:Length] = length
+        if object.key?(:Filter)
+          begin
+            object[:Filter] = @document.unwrap(object[:Filter])
+          rescue HexaPDF::Error
+            maybe_raise("Invalid /Filter entry for stream", pos: @tokenizer.pos)
+            object.delete(:Filter)
+          end
+        end
+        if object.key?(:DecodeParms)
+          begin
+            object[:DecodeParms] = @document.unwrap(object[:DecodeParms])
+          rescue HexaPDF::Error
+            maybe_raise("Invalid /DecodeParms entry for stream", pos: @tokenizer.pos)
+            object.delete(:DecodeParms)
+          end
+        end
         stream = StreamData.new(@tokenizer.io, offset: pos, length: length,
-                                filter: @document.unwrap(object[:Filter]),
-                                decode_parms: @document.unwrap(object[:DecodeParms]))
+                                filter: object[:Filter], decode_parms: object[:DecodeParms])
       end
       unless tok.kind_of?(Tokenizer::Token) && tok == 'endobj'

data/lib/hexapdf/revision.rb CHANGED

@@ -128,6 +128,11 @@ module HexaPDF
         @objects[oid, gen]
       elsif (xref_entry = @xref_section[oid, gen])
         load_object(xref_entry)
+      elsif (xref_entry = @xref_section[oid]) && (obj = load_object(xref_entry))&.gen == gen
+        # This branch handles invalid PDFs with a single revision containing xref entries where the
+        # gen doesn't match the gen of the indirect object. Also see the special handling in
+        # Parser#load_object.
+        obj
       else
         nil
       end
@@ -219,8 +224,7 @@ module HexaPDF
         seen = {}
         @objects.each {|oid, _gen, data| seen[oid] = true; yield(data) }
         @xref_section.each do |oid, _gen, data|
-          next if seen.key?(oid)
-          yield(@objects[oid] || load_object(data))
+          yield(@objects[oid] || load_object(data)) unless seen.key?(oid)
         end
         @all_objects_loaded = true
       end

data/lib/hexapdf/type/acro_form/field.rb CHANGED

@@ -291,7 +291,10 @@ module HexaPDF
           if embedded_widget?
             yield(document.wrap(self))
           elsif terminal_field?
-            self[:Kids]&.each {|kid| yield(document.wrap(kid)) }
+            self[:Kids]&.each do |kid|
+              kid = document.wrap(kid)
+              yield(kid) if kid.type == :Annot && kid[:Subtype] == :Widget
+            end
           end
           unless direct_only

data/lib/hexapdf/type/annotation.rb CHANGED

@@ -243,7 +243,7 @@ module HexaPDF
       # The appearance state in /AS or the one provided via +state_name+ is taken into account if
       # necessary.
       def appearance(type: :normal, state_name: self[:AS])
-        entry = appearance_dict&.send("#{type}_appearance")
+        entry = appearance_dict&.send("#{type}_appearance") rescue nil
         if entry.kind_of?(HexaPDF::Dictionary) && !entry.kind_of?(HexaPDF::Stream)
           entry = entry[state_name]
         end

data/lib/hexapdf/version.rb CHANGED

@@ -37,6 +37,6 @@
 module HexaPDF
   # The version of HexaPDF.
-  VERSION = '1.4.1'
+  VERSION = '1.6.0'
 end

data/test/hexapdf/digital_signature/common.rb CHANGED

@@ -112,7 +112,12 @@ module HexaPDF
         @tsa_server.mount_proc('/') do |request, response|
           @tsr = OpenSSL::Timestamp::Request.new(request.body)
           case @tsr.policy_id || '1.2.3.4.0'
-          when '1.2.3.4.0', '1.2.3.4.2'
+          when '1.2.3.4.0', '1.2.3.4.2', '1.2.3.4.3'
+            if @tsr.policy_id == '1.2.3.4.3'
+              WEBrick::HTTPAuth.basic_auth(request, response, 'HexaPDF Auth') do |username, password|
+                username == 'hexatest' && password == 'hexapwd'
+              end
+            end
             fac = OpenSSL::Timestamp::Factory.new
             fac.gen_time = Time.now
             fac.serial_number = 1

data/test/hexapdf/digital_signature/signing/test_timestamp_handler.rb CHANGED

@@ -67,6 +67,18 @@ describe HexaPDF::DigitalSignature::Signing::TimestampHandler do
       assert_equal("1.2.3.4.2", policy_id)
     end
+    it "allows using basic authentication on the server" do
+      @handler.tsa_policy_id = '1.2.3.4.3'
+      @handler.tsa_username = 'hexatest'
+      @handler.tsa_password = 'invalid'
+      msg = assert_raises(HexaPDF::Error) { @handler.sign(@data, @range) }
+      assert_match(/Basic authentication/, msg.message)
+      @handler.tsa_password = 'hexapwd'
+      token = OpenSSL::PKCS7.new(@handler.sign(@data, @range))
+      assert_equal(CERTIFICATES.ca_certificate.subject, token.signers[0].issuer)
+    end
     it "returns the serialized timestamp token" do
       token = OpenSSL::PKCS7.new(@handler.sign(@data, @range))
       assert_equal(CERTIFICATES.ca_certificate.subject, token.signers[0].issuer)

data/test/hexapdf/digital_signature/test_cms_handler.rb CHANGED

@@ -17,6 +17,12 @@ describe HexaPDF::DigitalSignature::CMSHandler do
     @handler = HexaPDF::DigitalSignature::CMSHandler.new(@dict)
   end
+  it "fails with an appropriate error if the the signature contents is invalid" do
+    @dict.contents = :Unknown
+    msg = assert_raises(HexaPDF::Error) { HexaPDF::DigitalSignature::CMSHandler.new(@dict) }
+    assert_match(/contents is invalid/, msg.message)
+  end
   it "returns the signer name" do
     assert_equal("RSA signer", @handler.signer_name)
   end

data/test/hexapdf/encryption/test_security_handler.rb CHANGED

@@ -129,16 +129,18 @@ describe HexaPDF::Encryption::SecurityHandler do
     end
     it "sets the correct /Length value for the given key length" do
-      [[40, nil], [48, 48], [128, 128], [256, nil]].each do |key_length, result|
-        algorithm = (key_length == 256 ? :aes : :arc4)
-        @handler.set_up_encryption(key_length: key_length, algorithm: algorithm)
-        assert(result == @handler.dict[:Length])
+      [[40, nil], [48, 48], [128, 128]].each do |key_length, result|
+        @handler.set_up_encryption(key_length: key_length, algorithm: :arc4)
+        result.nil? ? assert_nil(@handler.dict[:Length]) : assert_equal(result, @handler.dict[:Length])
       end
-      # Work-around buggy software
+      # Work-around for buggy software needing the /Length key
       @handler.set_up_encryption(key_length: 128, algorithm: :aes)
       assert_equal(4, @handler.dict[:V])
       assert_equal(128, @handler.dict[:Length])
+      @handler.set_up_encryption(key_length: 256, algorithm: :aes)
+      assert_equal(5, @handler.dict[:V])
+      assert_equal(256, @handler.dict[:Length])
     end
     it "calls the prepare_encryption method" do

data/test/hexapdf/test_dictionary.rb CHANGED

@@ -251,8 +251,23 @@ describe HexaPDF::Dictionary do
       refute(@obj.validate(auto_correct: false))
       assert(@obj.validate(auto_correct: true))
       @obj.value[:NameField] = "string"
+      refute(@obj.validate(auto_correct: false))
       assert(@obj.validate(auto_correct: true))
+      @test_class.define_field(:RequiredDefault, type: String, required: true, default: 'str')
+      @obj.value[:RequiredDefault] = 20
+      refute(@obj.validate(auto_correct: false))
+      assert_equal(20, @obj.value[:RequiredDefault])
       assert(@obj.validate(auto_correct: true))
+      assert_equal("str", @obj.value[:RequiredDefault])
+      @obj.value[:AllowedValues] = '20'
+      assert(@obj.validate(auto_correct: true))
+      refute(@obj.key?(:AllowedValues))
+      @obj.value[:Inherited] = 20
+      refute(@obj.validate(auto_correct: true))
+      refute(@obj.key?(:Inherited))
     end
     it "checks whether the value is an allowed one" do

data/test/hexapdf/test_document.rb CHANGED

@@ -347,7 +347,7 @@ describe HexaPDF::Document do
     it "validates the trailer object" do
       @doc.trailer[:ID] = :Symbol
-      refute(@doc.validate {|_, _, obj| assert_same(@doc.trailer, obj) })
+      assert(@doc.validate {|_a, _b, obj| assert_same(@doc.trailer, obj) })
     end
     it "validates only loaded objects" do
@@ -391,7 +391,7 @@ describe HexaPDF::Document do
     end
     it "fails if the document is not valid" do
-      @doc.trailer[:Size] = :Symbol
+      @doc.catalog[:PageLayout] = :invalid_value
       assert_raises(HexaPDF::Error) { @doc.write(StringIO.new(''.b)) }
     end

data/test/hexapdf/test_parser.rb CHANGED

@@ -10,6 +10,7 @@ describe HexaPDF::Parser do
     @document = HexaPDF::Document.new
     @document.config['parser.try_xref_reconstruction'] = false
     @document.add(@document.wrap(10, oid: 1, gen: 0))
+    @document.add(@document.wrap({Recurse: HexaPDF::Reference.new(3)}, oid: 3))
     create_parser(+<<~EOF)
       %PDF-1.7
@@ -173,6 +174,18 @@ describe HexaPDF::Parser do
       assert_equal({Length: 4}, object)
     end
+    it "recovers in case of an invalid /Filter leading to indirect object recursion" do
+      create_parser("1 0 obj<</Length 1/Filter 3 0 R>>stream\n1\nendstream endobj")
+      object, * = @parser.parse_indirect_object
+      assert_equal({Length: 1}, object)
+    end
+    it "recovers in case of an invalid /DecodeParms leading to indirect object recursion" do
+      create_parser("1 0 obj<</Length 1/DecodeParms 3 0 R>>stream\n1\nendstream endobj")
+      object, * = @parser.parse_indirect_object
+      assert_equal({Length: 1}, object)
+    end
     it "fails if the oid, gen or 'obj' keyword is invalid" do
       create_parser("a 0 obj\n5\nendobj")
       exp = assert_raises(HexaPDF::MalformedPDFError) { @parser.parse_indirect_object }
@@ -267,6 +280,18 @@ describe HexaPDF::Parser do
         exp = assert_raises(HexaPDF::MalformedPDFError) { @parser.parse_indirect_object(0) }
         assert_match(/keyword endobj/, exp.message)
       end
+      it "fails if an invalid /Filter leads to indirect object recursion" do
+        create_parser("1 0 obj<</Length 1/Filter 3 0 R>>stream\n1\nendstream endobj")
+        exp = assert_raises(HexaPDF::MalformedPDFError) { @parser.parse_indirect_object }
+        assert_match(/Invalid \/Filter/, exp.message)
+      end
+      it "fails if an invalid /DecodeParms leads to indirect object recursion" do
+        create_parser("1 0 obj<</Length 1/DecodeParms 3 0 R>>stream\n1\nendstream endobj")
+        exp = assert_raises(HexaPDF::MalformedPDFError) { @parser.parse_indirect_object }
+        assert_match(/Invalid \/DecodeParms/, exp.message)
+      end
     end
   end
@@ -315,14 +340,32 @@ describe HexaPDF::Parser do
       assert_equal(1, obj.oid)
     end
+    it "handles the case when generation numbers don't match with a single revision" do
+      @entry.gen = 2
+      obj = @parser.load_object(@entry)
+      assert_equal(2, obj.oid)
+      assert_equal(5, obj[0])
+    end
     describe "with strict parsing" do
-      it "raises an error if an indirect object has an offset of 0" do
+      before do
         @document.config['parser.on_correctable_error'] = proc { true }
+      end
+      it "raises an error if an indirect object has an offset of 0" do
         exp = assert_raises(HexaPDF::MalformedPDFError) do
           @parser.load_object(HexaPDF::XRefSection.in_use_entry(2, 0, 0))
         end
         assert_match(/has offset 0/, exp.message)
       end
+      it "fails if the generation numbers don't match with a single revision" do
+        exp = assert_raises(HexaPDF::MalformedPDFError) do
+          @entry.gen = 2
+          @parser.load_object(@entry)
+        end
+        assert_match(/oid,gen.*don't match/, exp.message)
+      end
     end
     it "fails if another object is found instead of an object stream" do
@@ -342,9 +385,18 @@ describe HexaPDF::Parser do
       assert_match(/invalid cross-reference type/i, exp.message)
     end
-    it "fails if the object/generation numbers don't match" do
+    it "fails if the object numbers don't match" do
+      exp = assert_raises(HexaPDF::MalformedPDFError) do
+        @entry.oid = 5
+        @parser.load_object(@entry)
+      end
+      assert_match(/oid,gen.*don't match/, exp.message)
+    end
+    it "fails if the generation numbers don't match for multiple revisions" do
+      @document.revisions.add
       exp = assert_raises(HexaPDF::MalformedPDFError) do
-        @entry.gen = 2
+        @entry.gen = 5
         @parser.load_object(@entry)
       end
       assert_match(/oid,gen.*don't match/, exp.message)

data/test/hexapdf/test_revision.rb CHANGED

@@ -17,6 +17,7 @@ describe HexaPDF::Revision do
     @xref_section.add_in_use_entry(5, 0, 1000)
     @xref_section.add_in_use_entry(6, 0, 5000)
     @xref_section.add_in_use_entry(7, 0, 5000)
+    @xref_section.add_in_use_entry(8, 2, 5000)
     @obj = HexaPDF::Object.new(:val, oid: 1, gen: 0)
     @ref = HexaPDF::Reference.new(1, 0)
@@ -30,6 +31,7 @@ describe HexaPDF::Revision do
         when 5 then HexaPDF::Dictionary.new({Type: :ObjStm}, oid: entry.oid, gen: entry.gen)
         when 7 then HexaPDF::Type::Catalog.new({Type: :Catalog}, oid: entry.oid, gen: entry.gen,
                                               document: self)
+        when 8 then HexaPDF::Object.new(:DifferentGen, oid: entry.oid, gen: 0)
         when 6 then HexaPDF::Dictionary.new({Array: HexaPDF::PDFArray.new([1, 2])},
                                             oid: entry.oid, gen: entry.gen)
         else HexaPDF::Object.new(:Test, oid: entry.oid, gen: entry.gen)
@@ -50,10 +52,10 @@ describe HexaPDF::Revision do
   end
   it "returns the next free object number" do
-    assert_equal(8, @rev.next_free_oid)
-    @obj.oid = 8
-    @rev.add(@obj)
     assert_equal(9, @rev.next_free_oid)
+    @obj.oid = 9
+    @rev.add(@obj)
+    assert_equal(10, @rev.next_free_oid)
   end
   describe "add" do
@@ -113,6 +115,12 @@ describe HexaPDF::Revision do
       refute_nil(obj)
     end
+    it "loads an object that is defined in the cross-reference section with an invalid generation number" do
+      obj = @rev.object(HexaPDF::Reference.new(8, 0))
+      assert_equal(0, obj.gen)
+      assert_equal(:DifferentGen, obj.value)
+    end
     it "loads free entries in the cross-reference section as special PDF null objects" do
       obj = @rev.object(HexaPDF::Reference.new(3, 0))
       assert_nil(obj.value)
@@ -172,7 +180,20 @@ describe HexaPDF::Revision do
   describe "object iteration" do
     it "iterates over all objects via each" do
       @rev.add(@obj)
-      assert_equal([@obj, *(2..7).map {|i| @rev.object(i) }], @rev.each.to_a)
+      assert_equal([@obj, *(2..8).map {|i| @rev.object(i) }], @rev.each.to_a)
+    end
+    it "ensures no object is loaded multiple times" do
+      obj_2_data = nil
+      @rev.add(@obj) # ensures this is yielded first
+      @rev.each do |obj|
+        if obj == @obj
+          obj_2_data = @rev.object(2).data
+        elsif obj.oid == 2
+          assert_same(obj_2_data, obj.data)
+          break
+        end
+      end
     end
     it "iterates only over loaded objects" do
@@ -216,8 +237,8 @@ describe HexaPDF::Revision do
     end
     it "handles object and xref streams that were added appropriately depending on the 'all' arg" do
-      xref = @rev.add(HexaPDF::Dictionary.new({Type: :XRef}, oid: 8))
-      objstm = @rev.add(HexaPDF::Dictionary.new({Type: :ObjStm}, oid: 9))
+      xref = @rev.add(HexaPDF::Dictionary.new({Type: :XRef}, oid: 20))
+      objstm = @rev.add(HexaPDF::Dictionary.new({Type: :ObjStm}, oid: 21))
       assert_equal([], @rev.each_modified_object.to_a)
       assert_equal([xref, objstm], @rev.each_modified_object(all: true).to_a)
     end

data/test/hexapdf/type/acro_form/test_field.rb CHANGED

@@ -147,6 +147,11 @@ describe HexaPDF::Type::AcroForm::Field do
     it "yields nothing if no widgets are defined" do
       assert_equal([], @field.each_widget.to_a)
     end
+    it "ignores entries in the /Kids array that are not widgets" do
+      @field[:Kids] = [{Subtype: :Widget, Rect: [0, 0, 0, 0], X: 1}, {FT: :Tx, Kids: []}]
+      assert_equal(1, @field.each_widget.to_a.size)
+    end
   end
   describe "create_widget" do

data/test/hexapdf/type/test_annotation.rb CHANGED

@@ -67,6 +67,9 @@ describe HexaPDF::Type::Annotation do
   it "returns the appearance stream of the given type" do
     assert_nil(@annot.appearance)
+    @annot[:AP] = 'some invalid type'
+    assert_nil(@annot.appearance)
     @annot[:AP] = {N: {}}
     assert_nil(@annot.appearance)

data/test/test_helper.rb CHANGED

@@ -11,6 +11,12 @@ rescue LoadError
 end
 gem 'minitest'
+begin
+  gem 'minitest-mock'
+  require 'minitest/mock'
+rescue Gem::MissingSpecError
+  # Assume Minitest < 6 is in use for older Rubies
+end
 gem 'strscan'
 require 'minitest/autorun'
 require 'fiber'

metadata CHANGED

@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: hexapdf
 version: !ruby/object:Gem::Version
-  version: 1.4.1
+  version: 1.6.0
 platform: ruby
 authors:
 - Thomas Leitner
 bindir: bin
 cert_chain: []
-date: 2025-09-23 00:00:00.000000000 Z
+date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: cmdparse
@@ -97,14 +97,28 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '5.16'
+        version: '6.0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '5.16'
+        version: '6.0'
+- !ruby/object:Gem::Dependency
+  name: minitest-mock
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '5.27'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '5.27'
 - !ruby/object:Gem::Dependency
   name: reline
   requirement: !ruby/object:Gem::Requirement
@@ -327,6 +341,7 @@ files:
 - lib/hexapdf/cli.rb
 - lib/hexapdf/cli/batch.rb
 - lib/hexapdf/cli/command.rb
+- lib/hexapdf/cli/debug_info.rb
 - lib/hexapdf/cli/files.rb
 - lib/hexapdf/cli/fonts.rb
 - lib/hexapdf/cli/form.rb
@@ -864,7 +879,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.6.2
+rubygems_version: 4.0.3
 specification_version: 4
 summary: HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby
 test_files: []