zip_kit 6.2.0 → 6.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 14382b872a41cb63ba80b664d0b42135b8105aabbf42f2813716d3b98c1f4ff5
- data.tar.gz: c78ff42650fba09aa01854a91cf491f51ab307413c01b8cd7c0f0ccb0d8954cb
+ metadata.gz: e1136ebba851638486c9e47150a8d706c49a2bbc0074f457c794582d8ce19089
+ data.tar.gz: 80de3edcb5bc748aaf855a7bf0b1f19439522c8efa4b97754f813fe9413bac2c
  SHA512:
- metadata.gz: 41a91eda762ca8668fe2696746367ade01b9029f03056f8d9da93b6dfb1f811d4eaec7b1159015287128db1ef94382a2776bdac86872cbe642a087c46154b450
- data.tar.gz: 676b8fd3e58f255087731cc209249bbba6e9ab8f87269cb182fc3ed62664d0c1a4ae14a51415fb4c9fc5f8674182a795d8f5103a57fb2b5a5ba28441948fa66e
+ metadata.gz: 20c5922a4178f2068a4f06388b201bd263f01c387d308c2c6297feba1c05385d601072cae0451d59a4a0b4e1ba1e354a6fa7f622ff1b58daf70947e6991b1e82
+ data.tar.gz: c373972ec6980000b40d1808247759b44f317a7aa3795b406e02005412cf0687e0f2311e5809f011eb1fbc19e6b2b7eb2a6a8f036cafe27a2645f6476cf0c441
data/CHANGELOG.md CHANGED
@@ -1,3 +1,15 @@
+ ## 6.2.2
+
+ * Make sure "zlib" gets required at the top, as it is used everywhere
+ * Improve documentation
+ * Make sure `zip_kit_stream` honors the custom `Content-Type` parameter
+ * Add a streaming example with Sinatra (and add a Sinatra app to the test harness)
+
+ ## 6.2.1
+
+ * Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+ * Make `BlockWrite` respond to `write` in addition to `<<`
+
  ## 6.2.0

  * Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
@@ -149,7 +161,7 @@
  ## 4.4.2

  * Add 2.4 to Travis rubies
- * Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_kit/pull/14)
+ * Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)

  ## 4.4.1

data/CONTRIBUTING.md CHANGED
@@ -106,11 +106,11 @@ project:

  ```bash
  # Clone your fork of the repo into the current directory
- git clone git@github.com:WeTransfer/zip_kit.git
+ git clone git@github.com:julik/zip_kit.git
  # Navigate to the newly cloned directory
  cd zip_kit
  # Assign the original repo to a remote called "upstream"
- git remote add upstream git@github.com:WeTransfer/zip_kit.git
+ git remote add upstream git@github.com:julik/zip_kit.git
  ```

  2. If you cloned a while ago, get the latest changes from upstream:
data/README.md CHANGED
@@ -55,11 +55,11 @@ If you want some more conveniences you can also use [zipline](https://github.com
  will automatically process and stream attachments (Carrierwave, Shrine, ActiveStorage) and remote objects
  via HTTP.

- `RailsStreaming` will *not* use [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
- and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
- RSpec) should work normally with controller actions returning ZIPs.
+ `RailsStreaming` does *not* require [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
+ and will stream without it. See {ZipKit::RailsStreaming#zip_kit_stream} for more details on this. You can use it
+ together with `Live` just fine if you need to.

- ## Writing into other streaming destinations and through streaming wrappers
+ ## Writing into streaming destinations

  Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
  is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -69,23 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
  obj = bucket.object("big.zip")
  obj.upload_stream do |write_stream|
    ZipKit::Streamer.open(write_stream) do |zip|
-     zip.write_file("large.csv") do |sink|
-       CSV(sink) do |csv|
-         csv << ["Line", "Item"]
-         20_000.times do |n|
-           csv << [n, "Item number #{n}"]
-         end
+     zip.write_file("file.csv") do |sink|
+       File.open("large.csv", "rb") do |file_input|
+         IO.copy_stream(file_input, sink)
        end
      end
    end
  end
  ```

+ ## Writing through streaming wrappers
+
  Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
- output with [builder](https://github.com/jimweirich/builder#project-builder)
+ output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+ every time a complete write call is done:

  ```ruby
- zip.write_file('report1.csv') do |sink|
+ zip.write_file('employees.xml') do |sink|
    builder = Builder::XmlMarkup.new(target: sink, indent: 2)
    builder.people do
      Person.all.find_each do |person|
@@ -95,14 +95,30 @@ zip.write_file('report1.csv') do |sink|
  end
  ```

- and this output will be compressed and output into the ZIP file on the fly. zip_kit composes with any
- Ruby code that streams its output into a destination.
+ The output will be compressed and output into the ZIP file on the fly. Same for CSV:

- ## Create a ZIP file without size estimation, compress on-the-fly during writes
+ ```ruby
+ zip.write_file('line_items.csv') do |sink|
+   CSV(sink) do |csv|
+     csv << ["Line", "Item"]
+     20_000.times do |n|
+       csv << [n, "Item number #{n}"]
+     end
+   end
+ end
+ ```
+
+ ## Automatic storage mode (stored vs. deflated)
+
+ The ZIP file format allows storage in both compressed and raw storage modes. The raw ("stored")
+ mode does not require decompression and unarchives faster.

- Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
- memory inflation is going to be very constrained. Data will be written to destination at fairly regular
- intervals. Deflate compression will work best for things like text files. For example, here is how to
+ ZipKit will buffer a small amount of output and attempt to compress it using deflate compression.
+ If this turns out to be significantly smaller than raw data, it is then going to proceed with
+ all further output using deflate compression. Memory use is going to be very modest, but it allows
+ you to not have to think about the appropriate storage mode.
+
+ Deflate compression will work great for JSONs, CSVs and other text- or text-like formats. For example, here is how to
  output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):

  ```ruby
@@ -116,18 +132,16 @@ ZipKit::Streamer.open($stdout) do |zip|
  end
  ```

- Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
- since you do not know how large the compressed data segments are going to be.
+ If you want to use specific storage modes, use `write_deflated_file` and `write_stored_file` instead of
+ `write_file`.

  ## Send a ZIP from a Rack response

  zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
- by piece, and apply some amount of buffering as well. Note that you might want to wrap
- it with a chunked transfer encoder - the `to_rack_response_headers_and_body` method will do
- that for you. Return the headers and the body to your webserver and you will have your ZIP streamed!
- The block that you give to the `OutputEnumerator` receive the {ZipKit::Streamer} object and will only
- start executing once your response body starts getting iterated over - when actually sending
- the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+ by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+ and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+ the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+ over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).

  ```ruby
  body = ZipKit::OutputEnumerator.new do | zip |
@@ -139,13 +153,16 @@ body = ZipKit::OutputEnumerator.new do | zip |
    end
  end

- headers, streaming_body = body.to_rack_response_headers_and_body(env)
- [200, headers, streaming_body]
+ [200, body.streaming_http_headers, body]
  ```

  ## Send a ZIP file of known size, with correct headers

- Use the `SizeEstimator` to compute the correct size of the resulting archive.
+ Sending a file with data descriptors is not always desirable - you don't really know how large your ZIP is going to be.
+ If you want to present your users with proper download progress, you would need to set a `Content-Length` header - and
+ know ahead of time how large your download is going to be. This can be done with ZipKit, provided you know how large
+ the compressed versions of your file are going to be. Use the {ZipKit::SizeEstimator} to do the pre-calculation - it
+ is not going to produce any large amounts of output, and will give you a to-the-byte value for your future archive:

  ```ruby
  bytesize = ZipKit::SizeEstimator.estimate do |z|
@@ -160,8 +177,10 @@
    zip << read_file('myfile2.bin')
  end

- headers, streaming_body = body.to_rack_response_headers_and_body(env, content_length: bytesize)
- [200, headers, streaming_body]
+ hh = zip_body.streaming_http_headers
+ hh["Content-Length"] = bytesize.to_s
+
+ [200, hh, zip_body]
  ```

  ## Writing ZIP files using the Streamer bypass
@@ -0,0 +1,16 @@
+ require "sinatra/base"
+
+ class SinatraApp < Sinatra::Base
+   get "/" do
+     content_type :zip
+     stream do |out|
+       ZipKit::Streamer.open(out) do |z|
+         z.write_file(File.basename(__FILE__)) do |io|
+           File.open(__FILE__, "r") do |f|
+             IO.copy_stream(f, io)
+           end
+         end
+       end
+     end
+   end
+ end
@@ -1,7 +1,5 @@
  # frozen_string_literal: true

- require "zlib"
-
  # Permits Deflate compression in independent blocks. The workflow is as follows:
  #
  # * Run every block to compress through deflate_chunk, remove the header,
@@ -17,9 +17,12 @@
  #   end
  # [200, {}, MyRackResponse.new]
  class ZipKit::BlockWrite
+   include ZipKit::WriteShovel
+
    # Creates a new BlockWrite.
    #
    # @param block The block that will be called when this object receives the `<<` message
+   # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
    def initialize(&block)
      @block = block
    end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
    # @param buf[String] the string to write. Note that a zero-length String
    # will not be forwarded to the block, as it has special meaning when used
    # with chunked encoding (it indicates the end of the stream).
-   # @return self
+   # @return [ZipKit::BlockWrite]
    def <<(buf)
      # Zero-size output has a special meaning when using chunked encoding
      return if buf.nil? || buf.bytesize.zero?
@@ -34,6 +34,15 @@ require "time" # for .httpdate
  #
  # to bypass things like `Rack::ETag` and the nginx buffering.
  class ZipKit::OutputEnumerator
+   # With HTTP output it is better to apply a small amount of buffering. While Streamer
+   # output does not buffer at all, the `OutputEnumerator` does as it is going to
+   # be used as a Rack response body. Applying some buffering helps reduce the number
+   # of syscalls for otherwise tiny writes, which relieves the app webserver from
+   # doing too much work managing those writes. While we recommend buffering, the
+   # buffer size is configurable via the constructor - so you can disable buffering
+   # if you really need to. While ZipKit aims not to buffer, in this instance this
+   # buffering is justified. See https://github.com/WeTransfer/zip_tricks/issues/78
+   # for the background on buffering.
    DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024

    # Creates a new OutputEnumerator enumerator. The enumerator can be read from using `each`,
@@ -60,14 +69,11 @@ class ZipKit::OutputEnumerator
    #     ...
    #   end
    #
-   # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-   # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-   #
    # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
    # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
    # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
    # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-   # but at block size boundaries or greater). Set it to 0 for unbuffered writes.
+   # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
    # @param blk a block that will receive the Streamer object when executing. The block will not be executed
    # immediately but only once `each` is called on the OutputEnumerator
    def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
@@ -100,9 +106,14 @@ class ZipKit::OutputEnumerator
    end

    # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+   # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+   # particular response body getting served. You might want to override the headers with your particular
+   # ones - for example, specific content types are needed for files which are, technically, ZIP files
+   # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+   # and ePubs.
    #
    # @return [Hash]
-   def streaming_http_headers
+   def self.streaming_http_headers
      _headers = {
        # We need to ensure Rack::ETag does not suddenly start buffering us, see
        # https://github.com/rack/rack/issues/1619#issuecomment-606315714
@@ -121,6 +132,15 @@ class ZipKit::OutputEnumerator
      }
    end

+   # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+   # presized responses, but is now effectively a no-op.
+   #
+   # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+   # @return [Hash]
+   def streaming_http_headers
+     self.class.streaming_http_headers
+   end
+
    # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
    # an object that can be used as a Rack response body. This method used to accept arguments
    # but will now just ignore them.
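The write-buffering described in the `DEFAULT_WRITE_BUFFER_SIZE` comment above can be illustrated with a small self-contained sketch. Class and method names below are illustrative only, not the gem's internals: small writes accumulate in a buffer and chunks are only yielded once the configured size is reached, with a final flush for any remainder.

```ruby
# Toy sketch of the OutputEnumerator buffering idea: fewer, larger chunks
# reach the consumer (the webserver) instead of many tiny writes.
class BufferedChunks
  def initialize(buffer_size:, &producer)
    @buffer_size = buffer_size
    @producer = producer
  end

  # Yields buffered chunks; a trailing partial buffer is flushed at the end.
  def each(&emit)
    buffer = +""
    sink = proc do |bytes|
      buffer << bytes
      if buffer.bytesize >= @buffer_size
        emit.call(buffer)
        buffer = +""
      end
    end
    @producer.call(sink)
    emit.call(buffer) unless buffer.empty?
  end
end

chunks = []
body = BufferedChunks.new(buffer_size: 4) { |sink| 6.times { sink.call("ab") } }
body.each { |c| chunks << c }
p chunks  # => ["abab", "abab", "abab"]
```

Setting the buffer size to 0 in such a scheme degenerates into one yield per write, which mirrors the `write_buffer_size: 0` "unbuffered" mode documented above.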
@@ -5,18 +5,29 @@ module ZipKit::RailsStreaming
    # Opens a {ZipKit::Streamer} and yields it to the caller. The output of the streamer
    # gets automatically forwarded to the Rails response stream. When the output completes,
    # the Rails response stream is going to be closed automatically.
+   #
+   # Note that there is an important difference in how this method works, depending on whether
+   # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+   # With a standard `ActionController` this method will assign a response body, but streaming
+   # will begin when your action method returns. With `ActionController::Live` the streaming
+   # will begin immediately, before the method returns. In all other aspects the method should
+   # stream correctly in both types of controllers.
+   #
+   # If you encounter buffering (streaming does not start for a very long time) you probably
+   # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+   # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+   # always possible. If you encounter buffering, examine your middleware stack and try to suss
+   # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+   # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+   #
    # @param filename[String] name of the file for the Content-Disposition header
    # @param type[String] the content type (MIME type) of the archive being output
    # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
-   # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
-   # See {ZipKit::Streamer#initialize} for the full list of options.
+   # @param output_enumerator_options[Hash] options that will be passed to the OutputEnumerator - these include
+   # options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
    # @yieldparam [ZipKit::Streamer] the streamer that can be written to
-   # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
-   def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
-     # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-     # first will also validate the Streamer options.
-     output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+   # @return [Boolean] always returns true
+   def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk)
      # We want some common headers for file sending. Rails will also set
      # self.sending_file = true for us when we call send_file_headers!
      send_file_headers!(type: type, filename: filename)
@@ -29,16 +40,46 @@ module ZipKit::RailsStreaming
        logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
      end

-     headers = output_enum.streaming_http_headers
+     headers = ZipKit::OutputEnumerator.streaming_http_headers
+
+     # Allow Rails headers to override ours. This is important if, for example, a content type gets
+     # set to something else than "application/zip"
+     response.headers.reverse_merge!(headers)
+
+     # The output enumerator yields chunks of bytes generated from the Streamer,
+     # with some buffering. See OutputEnumerator docs for more.
+     rack_zip_body = ZipKit::OutputEnumerator.new(**output_enumerator_options, &zip_streaming_blk)

-     # In rare circumstances (such as the app using Rack::ContentLength - which should normally
-     # not be used allow the user to force the use of the chunked encoding
+     # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+     # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+     # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+     # There is a recommendation to leave the chunked encoding to the app server, so that servers
+     # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+     # see https://github.com/julik/zip_kit/issues/7
+     # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+     # some especially pesky Rack middleware that just would not cooperate. Those include
+     # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
      if use_chunked_transfer_encoding
-       output_enum = ZipKit::RackChunkedBody.new(output_enum)
-       headers["Transfer-Encoding"] = "chunked"
+       response.headers["Transfer-Encoding"] = "chunked"
+       rack_zip_body = ZipKit::RackChunkedBody.new(rack_zip_body)
+     end
+
+     # Time for some branching, which mostly has to do with the 999 flavours of
+     # "how to make both Rails and Rack stream"
+     if self.class.ancestors.include?(ActionController::Live)
+       # If this controller includes Live it will not work correctly with a Rack
+       # response body assignment - the action will just hang. We need to read out the response
+       # body ourselves and write it into the Rails stream.
+       begin
+         rack_zip_body.each { |bytes| response.stream.write(bytes) }
+       ensure
+         response.stream.close
+       end
+     else
+       # Stream using a Rack body assigned to the ActionController response body
+       self.response_body = rack_zip_body
      end

-     response.headers.merge!(headers)
-     self.response_body = output_enum
+     true
    end
  end
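The `ancestors.include?(ActionController::Live)` dispatch in the hunk above can be sketched in isolation. The module and controller classes below are stand-ins, not Rails classes:

```ruby
# Sketch of the strategy selection used in zip_kit_stream above: controllers
# that mix in a "live" module get their body pumped into the response stream,
# everything else gets a Rack body assignment. Names are hypothetical.
module Live; end

class PlainController; end

class LiveController
  include Live
end

def streaming_strategy(controller_class)
  # Module inclusion shows up in the class's ancestor chain
  controller_class.ancestors.include?(Live) ? :write_to_stream : :assign_response_body
end

p streaming_strategy(PlainController) # => :assign_response_body
p streaming_strategy(LiveController)  # => :write_to_stream
```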
@@ -2,19 +2,19 @@

  require "set"

- # Is used to write streamed ZIP archives into the provided IO-ish object.
- # The output IO is never going to be rewound or seeked, so the output
- # of this object can be coupled directly to, say, a Rack output. The
- # output can also be a String, Array or anything that responds to `<<`.
+ # Is used to write ZIP archives without having to read them back or to overwrite
+ # data. It outputs into any object that supports `<<` or `write`, namely:
  #
- # Allows for splicing raw files (for "stored" entries without compression)
- # and splicing of deflated files (for "deflated" storage mode).
+ # An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+ # for the `Streamer`.
  #
- # For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront,
- # before the writing of the entry body starts.
+ # You can also combine output through the `Streamer` with direct output to the destination,
+ # all while preserving the correct offsets in the ZIP file structures. This allows usage
+ # of `sendfile()` or socket `splice()` calls for "through" proxying.
  #
- # Any object that responds to `<<` can be used as the Streamer target - you can use
- # a String, an Array, a Socket or a File, at your leisure.
+ # If you want to avoid data descriptors - or write data bypassing the Streamer -
+ # you need to know the CRC32 (as a uint) and the filesize upfront,
+ # before the writing of the entry body starts.
  #
  # ## Using the Streamer with runtime compression
  #
@@ -34,7 +34,7 @@ require "set"
  #   end
  # end
  #
- # The central directory will be written automatically at the end of the block.
+ # The central directory will be written automatically at the end of the `open` block.
  #
  # ## Using the Streamer with entries of known size and having a known CRC32 checksum
  #
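As the Streamer docs above note, avoiding data descriptors (or writing entry bodies past the Streamer) requires knowing the CRC32 (as a uint) and the byte size upfront. Ruby's bundled zlib, which this release now requires at the top of the library, computes the checksum directly:

```ruby
require "zlib"

# Pre-computing the two values the Streamer docs say you must know upfront
# for pre-declared entries: the CRC32 as an unsigned integer, and the size
# in bytes. The blob here is just illustrative.
data = "some file contents"
crc = Zlib.crc32(data)   # unsigned 32-bit integer
size = data.bytesize

puts format("crc32=%08x size=%d", crc, size)
```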
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module ZipKit
-   VERSION = "6.2.0"
+   VERSION = "6.2.2"
  end
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # @param d[String] the binary string to write (part of the uncompressed file)
-   # @return [Fixnum] the number of bytes written
+   # @param bytes[String] the binary string to write (part of the uncompressed file)
+   # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
    def write(bytes)
      self << bytes
      bytes.bytesize
data/lib/zip_kit.rb CHANGED
@@ -1,6 +1,7 @@
  # frozen_string_literal: true

  require_relative "zip_kit/version"
+ require "zlib"

  module ZipKit
    autoload :OutputEnumerator, File.dirname(__FILE__) + "/zip_kit/rack_body.rb"
data/rbi/zip_kit.rbi CHANGED
@@ -1,6 +1,6 @@
  # typed: strong
  module ZipKit
-   VERSION = T.let("6.2.0", T.untyped)
+   VERSION = T.let("6.2.2", T.untyped)

    # A ZIP archive contains a flat list of entries. These entries can implicitly
    # create directories when the archive is expanded. For example, an entry with
@@ -100,19 +100,19 @@ module ZipKit
      end
    end

-   # Is used to write streamed ZIP archives into the provided IO-ish object.
-   # The output IO is never going to be rewound or seeked, so the output
-   # of this object can be coupled directly to, say, a Rack output. The
-   # output can also be a String, Array or anything that responds to `<<`.
+   # Is used to write ZIP archives without having to read them back or to overwrite
+   # data. It outputs into any object that supports `<<` or `write`, namely:
    #
-   # Allows for splicing raw files (for "stored" entries without compression)
-   # and splicing of deflated files (for "deflated" storage mode).
+   # An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+   # for the `Streamer`.
    #
-   # For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront,
-   # before the writing of the entry body starts.
+   # You can also combine output through the `Streamer` with direct output to the destination,
+   # all while preserving the correct offsets in the ZIP file structures. This allows usage
+   # of `sendfile()` or socket `splice()` calls for "through" proxying.
    #
-   # Any object that responds to `<<` can be used as the Streamer target - you can use
-   # a String, an Array, a Socket or a File, at your leisure.
+   # If you want to avoid data descriptors - or write data bypassing the Streamer -
+   # you need to know the CRC32 (as a uint) and the filesize upfront,
+   # before the writing of the entry body starts.
    #
    # ## Using the Streamer with runtime compression
    #
@@ -132,7 +132,7 @@ module ZipKit
    #   end
    # end
    #
-   # The central directory will be written automatically at the end of the block.
+   # The central directory will be written automatically at the end of the `open` block.
    #
    # ## Using the Streamer with entries of known size and having a known CRC32 checksum
    #
@@ -563,13 +563,12 @@ module ZipKit
    sig { params(filename: T.untyped).returns(T.untyped) }
    def remove_backslash(filename); end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end

@@ -678,13 +677,12 @@
    sig { returns(T.untyped) }
    def close; end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end
  end
@@ -748,13 +746,12 @@
    sig { returns(T::Hash[T.untyped, T.untyped]) }
    def finish; end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end
  end
@@ -787,13 +784,12 @@
    sig { returns(T::Hash[T.untyped, T.untyped]) }
    def finish; end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end
  end
@@ -1107,19 +1103,28 @@ end, T.untyped)
  # end
  # [200, {}, MyRackResponse.new]
  class BlockWrite
+ include ZipKit::WriteShovel
+
  # Creates a new BlockWrite.
  #
  # _@param_ `block` — The block that will be called when this object receives the `<<` message
- sig { params(block: T.untyped).void }
+ sig { params(block: T.proc.params(bytes: String).void).void }
  def initialize(&block); end
 
  # Sends a string through to the block stored in the BlockWrite.
  #
  # _@param_ `buf` — the string to write. Note that a zero-length String will not be forwarded to the block, as it has special meaning when used with chunked encoding (it indicates the end of the stream).
- #
- # _@return_ — self
- sig { params(buf: String).returns(T.untyped) }
+ sig { params(buf: String).returns(ZipKit::BlockWrite) }
  def <<(buf); end
+
+ # Writes the given data to the output stream. Allows the object to be used as
+ # a target for `IO.copy_stream(from, to)`
+ #
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
+ #
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
+ sig { params(bytes: String).returns(Fixnum) }
+ def write(bytes); end
  end
 
  # A very barebones ZIP file reader. Is made for maximum interoperability, but at the same
@@ -1657,13 +1662,12 @@ end, T.untyped)
  sig { params(crc32: Fixnum, blob_size: Fixnum).returns(Fixnum) }
  def append(crc32, blob_size); end
 
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1728,13 +1732,12 @@ end, T.untyped)
  # "IO-ish" things to also respond to `write`? This is what this module does.
  # Jim would be proud. We miss you, Jim.
  module WriteShovel
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1960,13 +1963,12 @@ end, T.untyped)
  sig { returns(T.untyped) }
  def tell; end
 
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1977,25 +1979,39 @@ end, T.untyped)
  # gets automatically forwarded to the Rails response stream. When the output completes,
  # the Rails response stream is going to be closed automatically.
  #
+ # Note that there is an important difference in how this method works, depending whether
+ # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+ # With a standard `ActionController` this method will assign a response body, but streaming
+ # will begin when your action method returns. With `ActionController::Live` the streaming
+ # will begin immediately, before the method returns. In all other aspects the method should
+ # stream correctly in both types of controllers.
+ #
+ # If you encounter buffering (streaming does not start for a very long time) you probably
+ # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+ # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+ # always possible. If you encounter buffering, examine your middleware stack and try to suss
+ # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+ # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+ #
  # _@param_ `filename` — name of the file for the Content-Disposition header
  #
  # _@param_ `type` — the content type (MIME type) of the archive being output
  #
  # _@param_ `use_chunked_transfer_encoding` — whether to forcibly encode output as chunked. Normally you should not need this.
  #
- # _@param_ `zip_streamer_options` — options that will be passed to the Streamer. See {ZipKit::Streamer#initialize} for the full list of options.
+ # _@param_ `output_enumerator_options` — options that will be passed to the OutputEnumerator - these include options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
  #
- # _@return_ — The output enumerator assigned to the response body
+ # _@return_ — always returns true
  sig do
  params(
  filename: String,
  type: String,
  use_chunked_transfer_encoding: T::Boolean,
- zip_streamer_options: T::Hash[T.untyped, T.untyped],
+ output_enumerator_options: T::Hash[T.untyped, T.untyped],
  zip_streaming_blk: T.proc.params(the: ZipKit::Streamer).void
- ).returns(ZipKit::OutputEnumerator)
+ ).returns(T::Boolean)
  end
- def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk); end
+ def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk); end
  end
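In controller terms, the `zip_kit_stream` behavior documented above looks roughly like this. This is a sketch, not runnable on its own (it assumes a Rails app); the controller name, `reports.zip`, and the CSV contents are invented for illustration:

```ruby
class ReportsController < ApplicationController
  include ZipKit::RailsStreaming
  # Uncomment to switch to immediate streaming, as described above -
  # with a plain controller, streaming starts once the action returns:
  # include ActionController::Live

  def download
    zip_kit_stream(filename: "reports.zip", type: "application/zip") do |zip|
      # The block receives a ZipKit::Streamer; each write_file call
      # produces one entry in the streamed archive
      zip.write_file("report.csv") do |sink|
        sink << "id,name\n"
        sink << "1,example\n"
      end
    end
  end
end
```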
 
  # The output enumerator makes it possible to "pull" from a ZipKit streamer
@@ -2056,15 +2072,11 @@ end, T.untyped)
  # ...
  # end
  #
- # _@param_ `kwargs_for_new` — keyword arguments for {Streamer.new}
- #
  # _@param_ `streamer_options` — options for Streamer, see {ZipKit::Streamer.new}
  #
- # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set it to 0 for unbuffered writes.
+ # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
  #
  # _@param_ `blk` — a block that will receive the Streamer object when executing. The block will not be executed immediately but only once `each` is called on the OutputEnumerator
- #
- # _@return_ — the enumerator you can read bytestrings of the ZIP from by calling `each`
  sig { params(write_buffer_size: Integer, streamer_options: T::Hash[T.untyped, T.untyped], blk: T.untyped).void }
  def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk); end
 
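The "pull" semantics noted above - the block is only stored at construction and runs once `each` is called - can be sketched without the gem. `TinyOutputEnumerator` is hypothetical; the real OutputEnumerator yields a full `ZipKit::Streamer` and additionally coalesces writes up to `write_buffer_size`:

```ruby
# Hypothetical sketch of OutputEnumerator's deferred execution, not the gem's code
class TinyOutputEnumerator
  include Enumerable

  def initialize(&blk)
    @blk = blk # stored, not executed - execution is deferred until #each
  end

  def each(&downstream)
    return enum_for(:each) unless block_given?
    # Stand-in for the Streamer: anything pushed with << goes downstream
    sink = Object.new
    sink.define_singleton_method(:<<) do |bytes|
      downstream.call(bytes)
      self
    end
    @blk.call(sink)
  end
end

started = false
enum = TinyOutputEnumerator.new do |zip|
  started = true
  zip << "PK\x03\x04" # a ZIP local file header starts with these bytes
  zip << "...entry bytes..."
end

puts started        # => false (nothing has run yet)
chunks = enum.to_a  # now the stored block runs, yielding each chunk
puts started        # => true
puts chunks.size    # => 2
```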
@@ -2083,6 +2095,18 @@ end, T.untyped)
  def each; end
 
  # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+ # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+ # particular response body getting served. You might want to override the headers with your particular
+ # ones - for example, specific content types are needed for files which are, technically, ZIP files
+ # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+ # and ePubs.
+ sig { returns(T::Hash[T.untyped, T.untyped]) }
+ def self.streaming_http_headers; end
+
+ # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+ # presized responses, but is now effectively a no-op.
+ #
+ # _@see_ `[ZipKit::OutputEnumerator.streaming_http_headers]`
  sig { returns(T::Hash[T.untyped, T.untyped]) }
  def streaming_http_headers; end
 
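Overriding the common streaming headers for a ZIP-based format, as the comment above suggests, is a plain Hash merge. The header names and values below are assumptions for illustration - call `ZipKit::OutputEnumerator.streaming_http_headers` for the gem's actual set:

```ruby
# Assumed stand-in for ZipKit::OutputEnumerator.streaming_http_headers -
# the real method returns the canonical headers for streaming responses
streaming_headers = {
  "Content-Type" => "application/zip",
  "X-Accel-Buffering" => "no" # hint to buffering reverse proxies
}

# An ePub is technically a ZIP, so only the content type (plus a download
# filename) needs to differ from the generic ZIP headers
epub_headers = streaming_headers.merge(
  "Content-Type" => "application/epub+zip",
  "Content-Disposition" => "attachment; filename=\"book.epub\""
)

# A Rack endpoint would then respond with [200, epub_headers, zip_body]
puts epub_headers["Content-Type"] # => "application/epub+zip"
```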
data/zip_kit.gemspec CHANGED
@@ -7,7 +7,7 @@ Gem::Specification.new do |spec|
  spec.version = ZipKit::VERSION
  spec.authors = ["Julik Tarkhanov", "Noah Berman", "Dmitry Tymchuk", "David Bosveld", "Felix Bünemann"]
  spec.email = ["me@julik.nl"]
-
+ spec.license = "MIT"
  spec.summary = "Stream out ZIP files from Ruby. Successor to zip_tricks."
  spec.description = "Stream out ZIP files from Ruby. Successor to zip_tricks."
  spec.homepage = "https://github.com/julik/zip_kit"
@@ -23,9 +23,12 @@ Gem::Specification.new do |spec|
  spec.require_paths = ["lib"]
 
  spec.add_development_dependency "bundler"
- spec.add_development_dependency "rubyzip", "~> 1"
 
- spec.add_development_dependency "rack" # For tests where we spin up a server
+ # zip_kit does not use any runtime dependencies (besides zlib). However, for testing
+ # things quite a few things are used - and for a good reason.
+
+ spec.add_development_dependency "rubyzip", "~> 1" # We test our output with _another_ ZIP library, which is the way to go here
+ spec.add_development_dependency "rack" # For tests where we spin up a server. Both for streaming out and for testing reads over HTTP
  spec.add_development_dependency "rake", "~> 12.2"
  spec.add_development_dependency "rspec", "~> 3"
  spec.add_development_dependency "rspec-mocks", "~> 3.10", ">= 3.10.2" # ruby 3 compatibility
@@ -39,5 +42,6 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "puma"
  spec.add_development_dependency "actionpack", "~> 5" # For testing RailsStreaming against an actual Rails controller
  spec.add_development_dependency "nokogiri", "~> 1", ">= 1.13" # Rails 5 does by mistake use an older Nokogiri otherwise
+ spec.add_development_dependency "sinatra"
  spec.add_development_dependency "sord"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: zip_kit
  version: !ruby/object:Gem::Version
- version: 6.2.0
+ version: 6.2.2
  platform: ruby
  authors:
  - Julik Tarkhanov
@@ -12,7 +12,7 @@ authors:
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-03-11 00:00:00.000000000 Z
+ date: 2024-03-27 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -250,6 +250,20 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '1.13'
+ - !ruby/object:Gem::Dependency
+ name: sinatra
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
  - !ruby/object:Gem::Dependency
  name: sord
  requirement: !ruby/object:Gem::Requirement
@@ -292,6 +306,7 @@ files:
  - examples/parallel_compression_with_block_deflate.rb
  - examples/rack_application.rb
  - examples/s3_upload.rb
+ - examples/sinatra_application.rb
  - lib/zip_kit.rb
  - lib/zip_kit/block_deflate.rb
  - lib/zip_kit/block_write.rb
@@ -324,7 +339,8 @@ files:
  - rbi/zip_kit.rbi
  - zip_kit.gemspec
  homepage: https://github.com/julik/zip_kit
- licenses: []
+ licenses:
+ - MIT
  metadata:
  allowed_push_host: https://rubygems.org
  post_install_message:
@@ -342,7 +358,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.3.7
+ rubygems_version: 3.1.6
  signing_key:
  specification_version: 4
  summary: Stream out ZIP files from Ruby. Successor to zip_tricks.