RubyGems - zip_kit - Versions diffs - 6.3.0 → 6.3.2 - Mend

zip_kit 6.3.0 → 6.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25)

checksums.yaml +4 -4
data/.github/workflows/ci.yml +31 -8
data/CHANGELOG.md +9 -0
data/IMPLEMENTATION_DETAILS.md +9 -35
data/README.md +25 -11
data/RUBYZIP_DIFFERENCES.md +1 -4
data/lib/zip_kit/block_deflate.rb +8 -8
data/lib/zip_kit/file_reader.rb +17 -17
data/lib/zip_kit/output_enumerator.rb +11 -0
data/lib/zip_kit/rails_streaming.rb +1 -1
data/lib/zip_kit/railtie.rb +3 -1
data/lib/zip_kit/remote_io.rb +4 -4
data/lib/zip_kit/size_estimator.rb +14 -8
data/lib/zip_kit/stream_crc32.rb +5 -5
data/lib/zip_kit/streamer/heuristic.rb +8 -0
data/lib/zip_kit/streamer.rb +23 -5
data/lib/zip_kit/version.rb +1 -1
data/lib/zip_kit/write_shovel.rb +1 -1
data/lib/zip_kit/zip_writer.rb +20 -20
data/rbi/zip_kit.rbi +75 -66
data/zip_kit.gemspec +2 -0
metadata +31 -9
data/.document +0 -5
data/.rspec +0 -1
data/bench/buffered_crc32_bench.rb +0 -109

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: dcd59b7f9d367a40895c9bac2f909f3e22ebd0a96ee737073ef5153c4d486c70
-  data.tar.gz: 05665ca021d401cad02f80c38d8965c2e5fef1a477317f7ee92246d1c54a21ae
+  metadata.gz: 2ff5f4284066004d435d36f28020e6ef2279fe0abe36c520436db3bd39d7608a
+  data.tar.gz: ac9f9c4312c632410cee6ffb98c2ec471f9de16e5260793a30df8ac28639218d
 SHA512:
-  metadata.gz: b13d15a467565ef66ce6a21ac3891e8a26b30246d538c232aca193c3fe24a24a8f3a2d844d48eaeaeeb5ccc2f92cfcaab20e4afa1a2717b89ba27f55bb4635e4
-  data.tar.gz: 29aacf905b670323861812a4aece1446917ac2f94ece22b948f5d2f77a6eb7574c39598177e7c4f33cc230fd863b4f1c02eb961c5b48e4823d40854a25368101
+  metadata.gz: 34389f0a2d38a532af341c7694fcd10c2bbfb2f819ce9d1168afd51fbfc815b8c4ef0b36929f286f3ad31ead98479b07b871ec35d1d8f5adefbf2bb2717f3f58
+  data.tar.gz: 821151643cc5adafd9fe3446412e9d9de60ff049f6bab3ca1e9a7c0dc24dbcef0688b8bc0ab1fa51d88feb85892b3707ccbf8483e41c532680c2e9ddb073e823

data/.github/workflows/ci.yml CHANGED

@@ -7,23 +7,46 @@ env:
   BUNDLE_PATH: vendor/bundle
 jobs:
-  test:
-    name: Tests and Lint
+  test_baseline_ruby:
+    name: "Tests (Ruby 2.6 baseline)"
     runs-on: ubuntu-22.04
-    strategy:
-      matrix:
-        ruby:
-          - '2.6'
-          - '3.2'
     steps:
       - name: Checkout
         uses: actions/checkout@v4
       - name: Setup Ruby
         uses: ruby/setup-ruby@v1
         with:
-          ruby-version: ${{ matrix.ruby }}
+          ruby-version: '2.6'
           bundler-cache: true
       - name: "Tests"
         run: bundle exec rspec --backtrace --fail-fast
+  test_newest_ruby:
+    name: "Tests (Ruby 3.4 with frozen string literals)"
+    runs-on: ubuntu-22.04
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Setup Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: '3.4.1'
+          bundler-cache: true
+      - name: "Tests" # Make the test suite hard-crash on frozen string literal violations
+        env:
+          RUBYOPT: "--enable=frozen-string-literal --debug=frozen-string-literal"
+        run: "bundle exec rspec --backtrace --fail-fast"
+  lint_baseline_ruby: # We need to use syntax appropriate for the minimum supported Ruby version
+    name: Lint (Ruby 2.6 syntax)
+    runs-on: ubuntu-22.04
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Setup Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: '2.6'
+          bundler-cache: true
       - name: "Lint"
         run: bundle exec rake standard
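
The `test_newest_ruby` job above runs the suite with `--enable=frozen-string-literal`, so code that mutates a string literal fails loudly. The following is a minimal sketch of that failure mode and the usual fix. It is not taken from the gem: the explicit `.freeze` stands in for the interpreter flag (an assumption made so the snippet behaves the same with or without the flag), and the variable names are illustrative.

```ruby
# Simulate what --enable=frozen-string-literal does to a literal by freezing it explicitly.
frozen_buffer = "PK".freeze

append_failed = begin
  frozen_buffer << "\x03\x04" # mutating a frozen string raises FrozenError
  false
rescue FrozenError
  true
end

# The usual fix: start from an explicitly unfrozen, binary-encoded buffer.
buffer = String.new(encoding: Encoding::BINARY)
buffer << "PK" << "\x03\x04".b
```

Starting buffers with `String.new` keeps the code working on both the Ruby 2.6 baseline and the newer interpreter with frozen literals enabled.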

data/CHANGELOG.md CHANGED

@@ -1,3 +1,12 @@
+## 6.3.2
+* Make sure `rollback!` correctly works with `write_file` and the original exception gets re-raised from `write_file` if
+  closing the current entry happens in `Writable#close`
+## 6.3.1
+* Include `RailsStreaming` in a Rails loader callback, so that ActionController does not need to be in the namespace.
 ## 6.3.0
 * Include `RailsStreaming` automatically via a Railtie. It is not really necessary to force people to manage it manually.

data/IMPLEMENTATION_DETAILS.md CHANGED

@@ -5,6 +5,7 @@ The ZipKit streaming implementation is designed around the following requirement
 * Only ahead-writes (no IO seek or rewind)
 * Automatic switching to Zip64 as the files get written (no IO seeks), but not requiring Zip64 support if the archive can do without
 * Make use of the fact that CRC32 checksums and the sizes of the files (compressed _and_ uncompressed) are known upfront
+* Make it possible to output "sparse" ZIP archives (manifests that can be resolved into a ZIP via edge includes)
 It strives to be compatible with the following unzip programs _at the minimum:_
@@ -14,9 +15,6 @@ It strives to be compatible with the following unzip programs _at the minimum:_
 * Windows 7 - 7Zip 9.20
 Below is the list of _specific_ decisions taken when writing the implementation, with an explanation for each.
-We specifically _omit_ a number of things that we could do, but that are not necessary to satisfy our objectives.
-The omissions are _intentional_ since we do not want to have things of which we _assume_ they work, or have things
-that work only for one obscure unarchiver in one obscure case (like WinRAR with chinese filenames).
 ## Data descriptors (postfix CRC32/file sizes)
@@ -53,38 +51,14 @@ field, any other extra fields should come after.
 If a diacritic-containing character (such as å) does fit into the DOS-437
 codepage, it should be encodable as such. This would, in theory, let older Windows tools
-decode the filename correctly. However, this kills the filename decoding for the OSX builtin
-archive utility (it assumes the filename to be UTF-8, regardless). So if we allow filenames
-to be encoded in DOS-437, we _potentially_ have support in Windows but we upset everyone on Mac.
-If we just use UTF-8 and set the right EFS bit in general purpose flags, we upset Windows users
-because most of the Windows unarchive tools (at least the builtin ones) do not give a flying eff
-about the EFS support bit being set.
-Additionally, if we use Unarchiver on OSX (which is our recommended unpacker for large files),
-it will (very rightfully) ask us how we should decode each filename that does not have the EFS bit,
-but does contain something non-ASCII-decodable. This is horrible UX for users.
-So, basically, we have 2 choices, for filenames containing diacritics (for bona-fide UTF-8 you do not
-even get those choices, you _have_ to use UTF-8):
-* Make life easier for Windows users by setting stuff to DOS, not care about the standard _and_ make
-  most of Mac users upset
-* Make life easy for Mac users and conform to the standard, and tell Windows users to get a _decent_
-  ZIP unarchiving tool.
-We are going with option 2, and this is well-thought-out. Trust me. If you want the crazytown
-filename encoding scheme that is described here http://stackoverflow.com/questions/13261347
-you can try this:
-   [Encoding::CP437, Encoding::ISO_8859_1, Encoding::UTF_8]
-While this could work, we found it to be broken in practice as the decoding of the filename
-also depends on the system locale.
-Additionally, the tests with the unarchivers we _do_ support have shown that including the InfoZIP
-extra field does not actually help any of them recognize the file name correctly. And the use of
-those fields for the UTF-8 filename, per spec, tells us we should not set the EFS bit - which ruins
-the unarchiving for all other solutions. As any other, this decision may be changed in the future.
+decode the filename correctly. However, this only works under the following circumstances:
+* All the filenames in the archive are within the same "super-ASCII" encoding
+* The Windows locale on the computer opening the archive is set to the same locale as the filename in the archive
+A better approach is to use the EFS flag, which we enable when a filename does not encode cleanly
+into base ASCII. The extended filename extra field did not work well for us - and it does not
+combine correctly with the EFS flag.
 There are some interesting notes about the Info-ZIP/EFS combination here
 https://commons.apache.org/proper/commons-compress/zip.html

data/README.md CHANGED

@@ -5,23 +5,38 @@
 Allows streaming, non-rewinding ZIP file output from Ruby.
-`zip_kit` is a successor to and continuation of [zip_tricks](https://github.com/WeTransfer/zip_tricks), which
-was inspired by [zipline](https://github.com/fringd/zipline). I am grateful to WeTransfer for allowing me
-to develop zip_tricks and for sharing it with the community.
+> [!IMPORTANT]
+> `zip_kit` is a successor to and continuation of [zip_tricks.](https://github.com/WeTransfer/zip_tricks)
+> I am grateful to WeTransfer for allowing me to develop zip_tricks and for sharing it with the community.
 Allows you to write a ZIP archive out to a `File`, `Socket`, `String` or `Array` without having to rewind it at any
 point. Usable for creating very large ZIP archives for immediate sending out to clients, or for writing
 large ZIP archives without memory inflation.
-The original gem (zip_tricks) handled all the zipping needs (millions of ZIP files generated per day),
-for WeTransfer, it is widely compatible with a large number of unarchiving end-user applications.
+The gem handled all the zipping needs for WeTransfer for half a decade, with hundreds of millions
+of correct ZIP files generated. It is compatible with most end-user applications for opening archives.
+The files output with zip_kit will be valid [OCF containers](https://www.w3.org/TR/epub-33/#sec-container-zip),
+the library can be used to generate JAR files, EPUBs, OpenOffice/Office documents etc.
 ## How does it work? How is it different from Rubyzip?
+zip_kit outputs the metadata of the ZIP file as it becomes available. Same for the content of the ZIP
+entries. This allows nearly-unbuffered, streaming output. When reading ZIP files, zip_kit only reads
+the metadata and does so in an accelerated, efficient way - permitting ZIP unarchiving directly from
+a resource on HTTP (provided that the server supports HTTP ranges).
 Check out [the implementation details](IMPLEMENTATION_DETAILS.md) on the design of the library, and
 we have a separate [reference](RUBYZIP_DIFFERENCES.md) on why you might want to use ZipKit over
 Rubyzip and vice versa.
+## Migrating from zip_tricks
+If you want to migrate your code from zip_tricks to zip_kit, all you need to do is a blanket replacement in your code.
+Swap out the `ZipTricks` constant for `ZipKit` and you should be in business. All of the API available in ZipTricks 5.x
+still works as of ZipKit 6.x and will stay working. If something in your project still depends on zip_tricks you can use
+both gems inside of the same "apex" project - there will be no conflicts.
 ## Requirements
 Ruby 2.6+ syntax support is required, as well as a a working zlib (all available to jRuby as well).
@@ -60,9 +75,8 @@ If you want some more conveniences you can also use [zipline](https://github.com
 will automatically process and stream attachments (Carrierwave, Shrine, ActiveStorage) and remote objects
 via HTTP.
-`RailsStreaming` does *not* require [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
-and will stream without it. See {ZipKit::RailsStreaming#zip_kit_stream} for more details on this. You can use it
-together with `Live` just fine if you need to.
+`zip_kit_stream` does *not* require [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
+and will stream without it. It will work inside `Live` controllers just fine though.
 ## Writing into streaming destinations
@@ -128,10 +142,10 @@ output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in y
 ```ruby
 ZipKit::Streamer.open($stdout) do |zip|
-  zip.write_file('mov.mp4.txt') do |sink|
+  zip.write_file('mov.mp4') do |sink| # Will use "stored" mode
     File.open('mov.mp4', 'rb'){|source| IO.copy_stream(source, sink) }
   end
-  zip.write_file('long-novel.txt') do |sink|
+  zip.write_file('long-novel.txt') do |sink| # Will use "deflated" mode
     File.open('novel.txt', 'rb'){|source| IO.copy_stream(source, sink) }
   end
 end
@@ -166,7 +180,7 @@ end
 Sending a file with data descriptors is not always desirable - you don't really know how large your ZIP is going to be.
 If you want to present your users with proper download progress, you would need to set a `Content-Length` header - and
 know ahead of time how large your download is going to be. This can be done with ZipKit, provided you know how large
-the compressed versions of your file are going to be. Use the {ZipKit::SizeEstimator} to do the pre-calculation - it
+the compressed versions of your file are going to be. Use the `ZipKit::SizeEstimator` to do the pre-calculation - it
 is not going to produce any large amounts of output, and will give you a to-the-byte value for your future archive:
 ```ruby

data/RUBYZIP_DIFFERENCES.md CHANGED

@@ -16,10 +16,6 @@ differences in supported features which may be important for you when choosing.
   and Zip64, and is economical enough to enable "remote uncapping" where pieces of a ZIP file get read over HTTP to reconstruct
   the archive structure. Actual reading can then be done on a per-entry basis. Rubyzip reads entry data from local entries, which
   is error prone and much less economical than using the central directory
-* ZipKit deliberately _does not_ allow you to crawl directories to add to an archive, as this has been used for security exploits
-  in Rubyzip.
-* ZipKit deliberately _does not_ allow you to extract a ZIP archive directly to the filesystem, as this has been used for security
-  exploits in Rubyzip.
 * When writing, ZipKit applies careful buffering to speed up CRC32 calculations. Rubyzip combines CRC32 values at every write, which
   can be slow if there are many small writes.
 * ZipKit comes with a Rails helper and a Rack-compatible response body for facilitating streaming. Rubyzip has no Rails integration
@@ -29,6 +25,7 @@ differences in supported features which may be important for you when choosing.
 * ZipKit requires components using autoloading, which means that your application will likely boot faster as you will almost never
   need all of the features in one codebase. Rubyzip requires its components eagerly.
 * ZipKit comes with exhaustive YARD documentation and `.rbi` typedefs for [Sorbet/Tapioca](https://sorbet.org/blog/2022/07/27/srb-tapioca)
+* ZipKit allows you to compose "sparse" ZIP files where the contents of the files inside the archive comes from an external source, and does not have to be passed through the library (or be turned into Ruby strings), which enables interesting use cases such as download proxies with random access and resume.
 ## What Rubyzip supports and ZipKit does not

data/lib/zip_kit/block_deflate.rb CHANGED

@@ -55,7 +55,7 @@ class ZipKit::BlockDeflate
   # `output_io` can also be a {ZipKit::Streamer} to expedite ops.
   #
   # @param output_io [IO] the stream to write to (should respond to `:<<`)
-  # @return [Fixnum] number of bytes written to `output_io`
+  # @return [Integer] number of bytes written to `output_io`
   def self.write_terminator(output_io)
     output_io << END_MARKER
     END_MARKER.bytesize
@@ -65,7 +65,7 @@ class ZipKit::BlockDeflate
   # The returned string can be spliced into another deflate stream.
   #
   # @param bytes [String] Bytes to compress
-  # @param level [Fixnum] Zlib compression level (defaults to `Zlib::DEFAULT_COMPRESSION`)
+  # @param level [Integer] Zlib compression level (defaults to `Zlib::DEFAULT_COMPRESSION`)
   # @return [String] compressed bytes
   def self.deflate_chunk(bytes, level: Zlib::DEFAULT_COMPRESSION)
     raise "Invalid Zlib compression level #{level}" unless VALID_COMPRESSIONS.include?(level)
@@ -90,9 +90,9 @@ class ZipKit::BlockDeflate
   #
   # @param input_io [IO] the stream to read from (should respond to `:read`)
   # @param output_io [IO] the stream to write to (should respond to `:<<`)
-  # @param level [Fixnum] Zlib compression level (defaults to `Zlib::DEFAULT_COMPRESSION`)
-  # @param block_size [Fixnum] The block size to use (defaults to `DEFAULT_BLOCKSIZE`)
-  # @return [Fixnum] number of bytes written to `output_io`
+  # @param level [Integer] Zlib compression level (defaults to `Zlib::DEFAULT_COMPRESSION`)
+  # @param block_size [Integer] The block size to use (defaults to `DEFAULT_BLOCKSIZE`)
+  # @return [Integer] number of bytes written to `output_io`
   def self.deflate_in_blocks_and_terminate(input_io,
     output_io,
     level: Zlib::DEFAULT_COMPRESSION,
@@ -110,9 +110,9 @@ class ZipKit::BlockDeflate
   #
   # @param input_io [IO] the stream to read from (should respond to `:read`)
   # @param output_io [IO] the stream to write to (should respond to `:<<`)
-  # @param level [Fixnum] Zlib compression level (defaults to `Zlib::DEFAULT_COMPRESSION`)
-  # @param block_size [Fixnum] The block size to use (defaults to `DEFAULT_BLOCKSIZE`)
-  # @return [Fixnum] number of bytes written to `output_io`
+  # @param level [Integer] Zlib compression level (defaults to `Zlib::DEFAULT_COMPRESSION`)
+  # @param block_size [Integer] The block size to use (defaults to `DEFAULT_BLOCKSIZE`)
+  # @return [Integer] number of bytes written to `output_io`
   def self.deflate_in_blocks(input_io,
     output_io,
     level: Zlib::DEFAULT_COMPRESSION,
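
The comments in this file describe chunks that "can be spliced into another deflate stream" and a terminator written at the end. A minimal sketch of that idea using only Ruby's `zlib` is shown below. It is not the gem's code: `deflate_chunk_sketch` and `TERMINATOR` are hypothetical names, and the two-byte terminator value is an assumption, since the diff refers to `END_MARKER` only by name.

```ruby
require "zlib"

# Assumption: a 2-byte empty final block that terminates a raw deflate stream.
# The diff only shows END_MARKER by name, not its value.
TERMINATOR = [3, 0].pack("C*")

# Compress one chunk as raw deflate (no zlib header) and sync-flush it so the
# output ends on a byte boundary and can be concatenated with other chunks.
def deflate_chunk_sketch(bytes, level: Zlib::DEFAULT_COMPRESSION)
  deflater = Zlib::Deflate.new(level, -Zlib::MAX_WBITS)
  deflater.deflate(bytes, Zlib::SYNC_FLUSH)
ensure
  deflater&.close
end

chunks = ["first part, ", "second part, ", "third part"]
stream = chunks.map { |c| deflate_chunk_sketch(c) }.join + TERMINATOR

# A raw inflater reads the concatenation back as one continuous stream.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
restored = inflater.inflate(stream)
inflater.close
```

Because each chunk is compressed by its own deflater, the chunks can be produced independently (even on different machines) and joined afterwards.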

data/lib/zip_kit/file_reader.rb CHANGED

@@ -86,46 +86,46 @@ class ZipKit::FileReader
   # the Entry object used in Streamer for ZIP writing, since during writing more
   # data can be kept in memory for immediate use.
   class ZipEntry
-    # @return [Fixnum] bit-packed version signature of the program that made the archive
+    # @return [Integer] bit-packed version signature of the program that made the archive
     attr_accessor :made_by
-    # @return [Fixnum] ZIP version support needed to extract this file
+    # @return [Integer] ZIP version support needed to extract this file
     attr_accessor :version_needed_to_extract
-    # @return [Fixnum] bit-packed general purpose flags
+    # @return [Integer] bit-packed general purpose flags
     attr_accessor :gp_flags
-    # @return [Fixnum] Storage mode (0 for stored, 8 for deflate)
+    # @return [Integer] Storage mode (0 for stored, 8 for deflate)
     attr_accessor :storage_mode
-    # @return [Fixnum] the bit-packed DOS time
+    # @return [Integer] the bit-packed DOS time
     attr_accessor :dos_time
-    # @return [Fixnum] the bit-packed DOS date
+    # @return [Integer] the bit-packed DOS date
     attr_accessor :dos_date
-    # @return [Fixnum] the CRC32 checksum of this file
+    # @return [Integer] the CRC32 checksum of this file
     attr_accessor :crc32
-    # @return [Fixnum] size of compressed file data in the ZIP
+    # @return [Integer] size of compressed file data in the ZIP
     attr_accessor :compressed_size
-    # @return [Fixnum] size of the file once uncompressed
+    # @return [Integer] size of the file once uncompressed
     attr_accessor :uncompressed_size
     # @return [String] the filename
     attr_accessor :filename
-    # @return [Fixnum] disk number where this file starts
+    # @return [Integer] disk number where this file starts
     attr_accessor :disk_number_start
-    # @return [Fixnum] internal attributes of the file
+    # @return [Integer] internal attributes of the file
     attr_accessor :internal_attrs
-    # @return [Fixnum] external attributes of the file
+    # @return [Integer] external attributes of the file
     attr_accessor :external_attrs
-    # @return [Fixnum] at what offset the local file header starts
+    # @return [Integer] at what offset the local file header starts
     #        in your original IO object
     attr_accessor :local_file_header_offset
@@ -151,7 +151,7 @@ class ZipKit::FileReader
       end
     end
-    # @return [Fixnum] at what offset you should start reading
+    # @return [Integer] at what offset you should start reading
     #       for the compressed data in your original IO object
     def compressed_data_offset
       @compressed_data_offset || raise(LocalHeaderPending)
@@ -298,7 +298,7 @@ class ZipKit::FileReader
   # this offset to get the data).
   #
   # @param io[#read] an IO-ish object the ZIP file can be read from
-  # @return [Array<ZipEntry, Fixnum>] the parsed local header entry and
+  # @return [Array<ZipEntry, Integer>] the parsed local header entry and
   # the compressed data offset
   def read_local_file_header(io:)
     local_file_header_offset = io.tell
@@ -365,8 +365,8 @@ class ZipKit::FileReader
   # (read starting at this offset to get the data).
   #
   # @param io[#seek, #read] an IO-ish object the ZIP file can be read from
-  # @param local_file_header_offset[Fixnum] absolute offset (0-based) where the
-  # local file header is supposed to begin @return [Fixnum] absolute offset
+  # @param local_file_header_offset[Integer] absolute offset (0-based) where the
+  # local file header is supposed to begin @return [Integer] absolute offset
   # (0-based) of where the compressed data begins for this file within the ZIP
   def get_compressed_data_offset(io:, local_file_header_offset:)
     seek(io, local_file_header_offset)

data/lib/zip_kit/output_enumerator.rb CHANGED

@@ -112,6 +112,17 @@ class ZipKit::OutputEnumerator
   # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
   # and ePubs.
   #
+  # More value, however, is in the "technical" headers this method will provide. It will take the following steps to make sure streaming works correctly.
+  #
+  # * `Last-Modified` will be set to "now" so that the response is considered "fresh" by `Rack::ETag`. This is done so that `Rack::ETag` won't try to
+  #      calculate a lax ETag value and thus won't start buffering your response out of nowhere
+  # * `Content-Encoding` will be set to `identity`. This is so that proxies or the Rack middleware that applies compression to the response (like gzip)
+  #      is not going to try to compress your response. It also tells the receiving browsers (or downstream proxies) that they should not attempt to
+  #      open or uncompress the response before saving it or passing it onwards.
+  # * `X-Accel-Buffering` will be set to 'no` - this tells both nginx and the Google Cloud load balancer that the response should not be buffered
+  #
+  # These header values are known to get as close as possible to guaranteeing streaming on most environments where Ruby web applications may be hosted.
+  #
   # @return [Hash]
   def self.streaming_http_headers
     _headers = {
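
The hunk above is cut off before the hash contents, so the sketch below only mirrors the three headers named in the new comment. `streaming_headers_sketch` is a hypothetical helper, not the library's method, and the real hash may well contain additional keys.

```ruby
require "time"

# Hypothetical helper mirroring the three headers described in the comment above.
def streaming_headers_sketch(now: Time.now)
  {
    "Last-Modified" => now.httpdate,  # marks the response as fresh for Rack::ETag
    "Content-Encoding" => "identity", # asks middleware and proxies not to compress
    "X-Accel-Buffering" => "no"       # asks buffering proxies such as nginx to pass data through
  }
end

headers = streaming_headers_sketch(now: Time.utc(2025, 1, 1, 12, 0, 0))
```

In a Rack application such a hash would be merged into the response headers before the streaming body is returned.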

data/lib/zip_kit/rails_streaming.rb CHANGED

@@ -24,7 +24,7 @@ module ZipKit::RailsStreaming
   # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
   # @param output_enumerator_options[Hash] options that will be passed to the OutputEnumerator - these include
   #     options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
-  # @yieldparam [ZipKit::Streamer] zip the {ZipKit::Streamer} that can be written to
+  # @yieldparam zip[ZipKit::Streamer] the {ZipKit::Streamer} that can be written to
   # @return [Boolean] always returns true
   def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk)
     # We want some common headers for file sending. Rails will also set

data/lib/zip_kit/railtie.rb CHANGED

@@ -2,6 +2,8 @@
 class ZipKit::Railtie < ::Rails::Railtie
   initializer "zip_kit.install_extensions" do |app|
-    ActionController::Base.include(ZipKit::RailsStreaming)
+    ActiveSupport.on_load(:action_controller) do
+      include(ZipKit::RailsStreaming)
+    end
   end
 end

data/lib/zip_kit/remote_io.rb CHANGED

@@ -40,7 +40,7 @@ class ZipKit::RemoteIO
   # so if you are at offset 0 in the IO of size 10, doing a `read(20)`
   # will only return you 10 bytes of result, and not raise any exceptions.
   #
-  # @param n_bytes[Fixnum, nil] how many bytes to read, or `nil` to read all the way to the end
+  # @param n_bytes[Integer, nil] how many bytes to read, or `nil` to read all the way to the end
   # @return [String] the read bytes
   def read(n_bytes = nil)
     # If the resource is empty there is nothing to read
@@ -62,7 +62,7 @@ class ZipKit::RemoteIO
   # Returns the current pointer position within the IO
   #
-  # @return [Fixnum]
+  # @return [Integer]
   def tell
     @pos
   end
@@ -74,7 +74,7 @@ class ZipKit::RemoteIO
   # @param range[Range] the HTTP range of data to fetch from remote
   # @return [String] the response body of the ranged request
   def request_range(range)
-    http = Net::HTTP.start(@uri.hostname, @uri.port)
+    http = Net::HTTP.start(@uri.hostname, @uri.port, use_ssl: @uri.scheme == "https")
     request = Net::HTTP::Get.new(@uri)
     request.range = range
     response = http.request(request)
@@ -91,7 +91,7 @@ class ZipKit::RemoteIO
   #
   # @return [Integer] the size of the remote resource, parsed either from Content-Length or Content-Range header
   def request_object_size
-    http = Net::HTTP.start(@uri.hostname, @uri.port)
+    http = Net::HTTP.start(@uri.hostname, @uri.port, use_ssl: @uri.scheme == "https")
     request = Net::HTTP::Get.new(@uri)
     request.range = 0..0
     response = http.request(request)
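
Both changed calls now derive `use_ssl` from the URI scheme, which is what lets `RemoteIO` talk to `https://` resources. The sketch below shows the same decision without opening a connection. `http_client_for` is a hypothetical helper and the `example.com` URLs are placeholders.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: configure a Net::HTTP client the way the patched call
# does, without actually connecting.
def http_client_for(uri)
  http = Net::HTTP.new(uri.hostname, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http
end

secure = http_client_for(URI("https://example.com/archive.zip"))
plain = http_client_for(URI("http://example.com/archive.zip"))
```

Without the `use_ssl:` option, `Net::HTTP.start` would speak plain HTTP to port 443 for an `https` URI, which servers typically reject.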

data/lib/zip_kit/size_estimator.rb CHANGED

@@ -24,7 +24,7 @@ class ZipKit::SizeEstimator
   #
   # @param kwargs_for_streamer_new Any options to pass to Streamer, see {Streamer#initialize}
   # @return [Integer] the size of the resulting archive, in bytes
-  # @yieldparam [SizeEstimator] the estimator
+  # @yieldparam estimator[SizeEstimator] the estimator
   def self.estimate(**kwargs_for_streamer_new)
     streamer = ZipKit::Streamer.new(ZipKit::NullWriter, **kwargs_for_streamer_new)
     estimator = new(streamer)
@@ -35,9 +35,12 @@ class ZipKit::SizeEstimator
   # Add a fake entry to the archive, to see how big it is going to be in the end.
   #
   # @param filename [String] the name of the file (filenames are variable-width in the ZIP)
-  # @param size [Fixnum] size of the uncompressed entry
-  # @param use_data_descriptor[Boolean] whether the entry uses a postfix
-  # data descriptor to specify size
+  # @param size [Integer] size of the uncompressed entry
+  # @param use_data_descriptor[Boolean] whether there is going to be a data descriptor written
+  #                                     after the entry body, to specify size.
+  #                                     You must enable this if you are going to be
+  #                                     using {Streamer#write_stored_file} as otherwise your
+  #                                     estimated size is not going to be accurate
   # @return self
   def add_stored_entry(filename:, size:, use_data_descriptor: false)
     @streamer.add_stored_entry(filename: filename,
@@ -54,10 +57,13 @@ class ZipKit::SizeEstimator
   # Add a fake entry to the archive, to see how big it is going to be in the end.
   #
   # @param filename [String] the name of the file (filenames are variable-width in the ZIP)
-  # @param uncompressed_size [Fixnum] size of the uncompressed entry
-  # @param compressed_size [Fixnum] size of the compressed entry
-  # @param use_data_descriptor[Boolean] whether the entry uses a postfix data
-  #                                     descriptor to specify size
+  # @param uncompressed_size [Integer] size of the uncompressed entry
+  # @param compressed_size [Integer] size of the compressed entry
+  # @param use_data_descriptor[Boolean] whether there is going to be a data descriptor written
+  #                                     after the entry body, to specify size.
+  #                                     You must enable this if you are going to be
+  #                                     using {Streamer#write_deflated_file} as otherwise your
+  #                                     estimated size is not going to be accurate
   # @return self
   def add_deflated_entry(filename:, uncompressed_size:, compressed_size:, use_data_descriptor: false)
     @streamer.add_deflated_entry(filename: filename,

data/lib/zip_kit/stream_crc32.rb CHANGED

@@ -16,7 +16,7 @@ class ZipKit::StreamCRC32
   # Compute a CRC32 value from an IO object. The object should respond to `read` and `eof?`
   #
   # @param io[IO] the IO to read the data from
-  # @return [Fixnum] the computed CRC32 value
+  # @return [Integer] the computed CRC32 value
   def self.from_io(io)
     # If we can specify the string capacity upfront we will not have to resize
     # the string during operation. This saves time but is only available on
@@ -43,7 +43,7 @@ class ZipKit::StreamCRC32
   # Returns the CRC32 value computed so far
   #
-  # @return [Fixnum] the updated CRC32 value for all the blobs so far
+  # @return [Integer] the updated CRC32 value for all the blobs so far
   def to_i
     @crc
   end
@@ -51,9 +51,9 @@ class ZipKit::StreamCRC32
   # Appends a known CRC32 value to the current one, and combines the
   # contained CRC32 value in-place.
   #
-  # @param crc32[Fixnum] the CRC32 value to append
-  # @param blob_size[Fixnum] the size of the daata the `crc32` is computed from
-  # @return [Fixnum] the updated CRC32 value for all the blobs so far
+  # @param crc32[Integer] the CRC32 value to append
+  # @param blob_size[Integer] the size of the daata the `crc32` is computed from
+  # @return [Integer] the updated CRC32 value for all the blobs so far
   def append(crc32, blob_size)
     @crc = Zlib.crc32_combine(@crc, crc32, blob_size)
   end
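
`append` relies on `Zlib.crc32_combine`, which merges two checksums given the byte size of the data behind the second one. A small standalone illustration follows; the sample strings are arbitrary.

```ruby
require "zlib"

part_a = "hello "
part_b = "world"

# Checksums computed independently, for example by different workers.
crc_a = Zlib.crc32(part_a)
crc_b = Zlib.crc32(part_b)

# Combine them without re-reading the data; the third argument is the
# byte size of the data behind the second checksum.
combined = Zlib.crc32_combine(crc_a, crc_b, part_b.bytesize)
```

The combined value matches the checksum of the concatenated data, which is why the blob size has to be passed along with each appended CRC32.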

data/lib/zip_kit/streamer/heuristic.rb CHANGED

@@ -26,6 +26,7 @@ class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
     @bytes_deflated = 0
     @winner = nil
+    @started_closing = false
   end
   def <<(bytes)
@@ -40,6 +41,9 @@ class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
   end
   def close
+    return if @started_closing
+    @started_closing = true # started_closing because an exception may get raised inside close(), as we add an entry there
     decide unless @winner
     @winner.close
   end
@@ -47,6 +51,7 @@ class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
   private def decide
     # Finish and then close the deflater - it has likely buffered some data
     @bytes_deflated += @deflater.finish.bytesize until @deflater.finished?
     # If the deflated version is smaller than the stored one
     # - use deflate, otherwise stored
     ratio = @bytes_deflated / @buf.size.to_f
@@ -55,9 +60,12 @@ class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
     else
       @streamer.write_stored_file(@filename, **@write_file_options)
     end
     # Copy the buffered uncompressed data into the newly initialized writable
     @buf.rewind
     IO.copy_stream(@buf, @winner)
     @buf.truncate(0)
+  ensure
+    @deflater.close
   end
 end
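
The `@started_closing` guard makes `close` safe to call again after it has raised part-way through. A reduced sketch of the pattern is shown below. `GuardedWritable` is a hypothetical class written for illustration and does not reproduce the gem's `Heuristic`.

```ruby
# Hypothetical writable that records how often its finalization logic runs.
class GuardedWritable
  attr_reader :finalize_calls

  def initialize(fail_on_finalize: false)
    @fail_on_finalize = fail_on_finalize
    @finalize_calls = 0
    @started_closing = false
  end

  def close
    return if @started_closing
    # Set the flag before doing the work: if finalization raises, a second
    # close (for example from an ensure block) must not retry it.
    @started_closing = true
    @finalize_calls += 1
    raise IOError, "write failed" if @fail_on_finalize
  end
end

writable = GuardedWritable.new(fail_on_finalize: true)
first_error = begin
  writable.close
  nil
rescue IOError => e
  e
end
writable.close # returns early: the guard skips the repeat call
```

Setting the flag before the risky work, rather than after it, is what keeps the original exception from being masked by a second failure during cleanup.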

data/lib/zip_kit/streamer.rb CHANGED

@@ -5,8 +5,12 @@ require "set"
 # Is used to write ZIP archives without having to read them back or to overwrite
 # data. It outputs into any object that supports `<<` or `write`, namely:
 #
-# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
-# for the `Streamer`.
+# * `Array` - will contain binary strings
+# * `File` - data will be written to it as it gets generated
+# * `IO` (`Socket`, `StringIO`) - data gets written into it
+# * `String` - in binary encoding and unfrozen - also makes a decent output target
+#
+# or anything else that responds to `#<<` or `#write`.
 #
 # You can also combine output through the `Streamer` with direct output to the destination,
 # all while preserving the correct offsets in the ZIP file structures. This allows usage
@@ -482,6 +486,10 @@ class ZipKit::Streamer
   # is likely already on the wire. However, excluding the entry from the central directory of the ZIP
   # file will allow better-behaved ZIP unarchivers to extract the entries which did store correctly,
   # provided they read the ZIP from the central directory and not straight-ahead.
+  # Rolling back does not perform any writes.
+  #
+  # `rollback!` gets called for you if an exception is raised inside the block of `write_file`,
+  # `write_deflated_file` and `write_stored_file`.
   #
   # @example
   #     zip.add_stored_entry(filename: "data.bin", size: 4.megabytes, crc32: the_crc)
@@ -493,14 +501,17 @@ class ZipKit::Streamer
   #     end
   # @return [Integer] position in the output stream / ZIP archive
   def rollback!
-    removed_entry = @files.pop
-    return @out.tell unless removed_entry
+    @files.pop if @remove_last_file_at_rollback
+    # Recreate the path set from remaining entries (PathSet does not support cheap deletes yet)
     @path_set.clear
     @files.each do |e|
       @path_set.add_directory_or_file_path(e.filename) unless e.filler?
     end
-    @files << Filler.new(@out.tell - removed_entry.local_header_offset)
+    # Create filler for the truncated or unusable local file entry that did get written into the output
+    filler_size_bytes = @out.tell - @offset_before_last_local_file_header
+    @files << Filler.new(filler_size_bytes)
     @out.tell
   end
@@ -554,6 +565,11 @@ class ZipKit::Streamer
     use_data_descriptor:,
     unix_permissions:
   )
+    # Set state needed for proper rollback later. If write_local_file_header
+    # does manage to write _some_ bytes, but fails later (we write in tiny bits sometimes)
+    # we should be able to create a filler from this offset on when we
+    @offset_before_last_local_file_header = @out.tell
+    @remove_last_file_at_rollback = false
     # Clean backslashes
     filename = remove_backslash(filename)
@@ -600,9 +616,11 @@ class ZipKit::Streamer
       mtime: e.mtime,
       filename: e.filename,
       storage_mode: e.storage_mode)
     e.bytes_used_for_local_header = @out.tell - e.local_header_offset
     @files << e
+    @remove_last_file_at_rollback = true
   end
   def remove_backslash(filename)
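
The rollback change can be hard to follow from the hunks alone: the offset is recorded before the local header is written, and the last entry is only popped if it was actually registered. The model below is a heavily simplified, hypothetical sketch of that bookkeeping. `MiniStreamer`, `Entry`, the `fail_after:` parameter and this `Filler` struct are all invented for illustration and are not the gem's classes.

```ruby
require "stringio"

# Hypothetical stand-ins for the streamer's entry and filler records.
Filler = Struct.new(:size)
Entry = Struct.new(:filename)

class MiniStreamer
  attr_reader :files

  def initialize(out)
    @out = out
    @files = []
  end

  # header_bytes stands in for the local file header; fail_after simulates an
  # output error after that many header bytes were already written.
  def add_entry(filename, header_bytes:, fail_after: nil)
    @offset_before_last_local_file_header = @out.pos
    @remove_last_file_at_rollback = false
    written = fail_after ? header_bytes.byteslice(0, fail_after) : header_bytes
    @out << written
    raise IOError, "output broke mid-header" if fail_after
    @files << Entry.new(filename)
    @remove_last_file_at_rollback = true
  end

  def rollback!
    # Only drop the last entry if the failed write had registered one.
    @files.pop if @remove_last_file_at_rollback
    # Cover whatever bytes did reach the output with a filler.
    @files << Filler.new(@out.pos - @offset_before_last_local_file_header)
    @out.pos
  end
end

zip = MiniStreamer.new(StringIO.new)
zip.add_entry("ok.txt", header_bytes: "H" * 30)
begin
  zip.add_entry("broken.txt", header_bytes: "H" * 30, fail_after: 10)
rescue IOError
  zip.rollback!
end
```

In this model the earlier entry survives and the 10 stray header bytes are covered by a filler. With an unconditional `pop`, as in the removed lines, the rollback would have discarded `ok.txt` instead.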

data/lib/zip_kit/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module ZipKit
-  VERSION = "6.3.0"
+  VERSION = "6.3.2"
 end