zip_kit 6.2.0 → 6.2.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +13 -1
- data/CONTRIBUTING.md +2 -2
- data/README.md +50 -31
- data/examples/sinatra_application.rb +16 -0
- data/lib/zip_kit/block_deflate.rb +0 -2
- data/lib/zip_kit/block_write.rb +4 -1
- data/lib/zip_kit/output_enumerator.rb +25 -5
- data/lib/zip_kit/rails_streaming.rb +56 -15
- data/lib/zip_kit/streamer.rb +11 -11
- data/lib/zip_kit/version.rb +1 -1
- data/lib/zip_kit/write_shovel.rb +2 -2
- data/lib/zip_kit.rb +1 -0
- data/rbi/zip_kit.rbi +71 -47
- data/zip_kit.gemspec +7 -3
- metadata +20 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e1136ebba851638486c9e47150a8d706c49a2bbc0074f457c794582d8ce19089
+  data.tar.gz: 80de3edcb5bc748aaf855a7bf0b1f19439522c8efa4b97754f813fe9413bac2c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 20c5922a4178f2068a4f06388b201bd263f01c387d308c2c6297feba1c05385d601072cae0451d59a4a0b4e1ba1e354a6fa7f622ff1b58daf70947e6991b1e82
+  data.tar.gz: c373972ec6980000b40d1808247759b44f317a7aa3795b406e02005412cf0687e0f2311e5809f011eb1fbc19e6b2b7eb2a6a8f036cafe27a2645f6476cf0c441

data/CHANGELOG.md
CHANGED
@@ -1,3 +1,15 @@
+## 6.2.2
+
+* Make sure "zlib" gets required at the top, as it is used everywhere
+* Improve documentation
+* Make sure `zip_kit_stream` honors the custom `Content-Type` parameter
+* Add a streaming example with Sinatra (and add a Sinatra app to the test harness)
+
+## 6.2.1
+
+* Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+* Make `BlockWrite` respond to `write` in addition to `<<`
+
 ## 6.2.0

 * Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
@@ -149,7 +161,7 @@
 ## 4.4.2

 * Add 2.4 to Travis rubies
-* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/
+* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)

 ## 4.4.1

data/CONTRIBUTING.md
CHANGED
@@ -106,11 +106,11 @@ project:

 ```bash
 # Clone your fork of the repo into the current directory
-git clone git@github.com:
+git clone git@github.com:julik/zip_kit.git
 # Navigate to the newly cloned directory
 cd zip_kit
 # Assign the original repo to a remote called "upstream"
-git remote add upstream git@github.com:
+git remote add upstream git@github.com:julik/zip_kit.git
 ```

 2. If you cloned a while ago, get the latest changes from upstream:

data/README.md
CHANGED
@@ -55,11 +55,11 @@ If you want some more conveniences you can also use [zipline](https://github.com
 will automatically process and stream attachments (Carrierwave, Shrine, ActiveStorage) and remote objects
 via HTTP.

-`RailsStreaming`
-and
-
+`RailsStreaming` does *not* require [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
+and will stream without it. See {ZipKit::RailsStreaming#zip_kit_stream} for more details on this. You can use it
+together with `Live` just fine if you need to.

-## Writing into
+## Writing into streaming destinations

 Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
 is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -69,23 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
 obj = bucket.object("big.zip")
 obj.upload_stream do |write_stream|
   ZipKit::Streamer.open(write_stream) do |zip|
-    zip.write_file("
-
-
-        20_000.times do |n|
-          csv << [n, "Item number #{n}"]
-        end
+    zip.write_file("file.csv") do |sink|
+      File.open("large.csv", "rb") do |file_input|
+        IO.copy_stream(file_input, sink)
       end
     end
   end
 end
 ```

+## Writing through streaming wrappers
+
 Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
-output with [builder](https://github.com/jimweirich/builder#project-builder)
+output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+every time a complete write call is done:

 ```ruby
-zip.write_file('
+zip.write_file('employees.xml') do |sink|
   builder = Builder::XmlMarkup.new(target: sink, indent: 2)
   builder.people do
     Person.all.find_each do |person|
@@ -95,14 +95,30 @@ zip.write_file('report1.csv') do |sink|
 end
 ```

-
-Ruby code that streams its output into a destination.
+The output will be compressed and output into the ZIP file on the fly. Same for CSV:

-
+```ruby
+zip.write_file('line_items.csv') do |sink|
+  CSV(sink) do |csv|
+    csv << ["Line", "Item"]
+    20_000.times do |n|
+      csv << [n, "Item number #{n}"]
+    end
+  end
+end
+```
+
+## Automatic storage mode (stored vs. deflated)
+
+The ZIP file format allows storage in both compressed and raw storage modes. The raw ("stored")
+mode does not require decompression and unarchives faster.

-
-
-
+ZipKit will buffer a small amount of output and attempt to compress it using deflate compression.
+If this turns out to be significantly smaller than raw data, it is then going to proceed with
+all further output using deflate compression. Memory use is going to be very modest, but it allows
+you to not have to think about the appropriate storage mode.
+
+Deflate compression will work great for JSONs, CSVs and other text- or text-like formats. For example, here is how to
 output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):

 ```ruby
@@ -116,18 +132,16 @@ ZipKit::Streamer.open($stdout) do |zip|
 end
 ```

-
-
+If you want to use specific storage modes, use `write_deflated_file` and `write_stored_file` instead of
+`write_file`.

 ## Send a ZIP from a Rack response

 zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
-by piece, and apply some amount of buffering as well.
-
-
-
-start executing once your response body starts getting iterated over - when actually sending
-the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).

 ```ruby
 body = ZipKit::OutputEnumerator.new do | zip |
@@ -139,13 +153,16 @@ body = ZipKit::OutputEnumerator.new do | zip |
   end
 end

-
-[200, headers, streaming_body]
+[200, body.streaming_http_headers, body]
 ```

 ## Send a ZIP file of known size, with correct headers

-
+Sending a file with data descriptors is not always desirable - you don't really know how large your ZIP is going to be.
+If you want to present your users with proper download progress, you would need to set a `Content-Length` header - and
+know ahead of time how large your download is going to be. This can be done with ZipKit, provided you know how large
+the compressed versions of your file are going to be. Use the {ZipKit::SizeEstimator} to do the pre-calculation - it
+is not going to produce any large amounts of output, and will give you a to-the-byte value for your future archive:

 ```ruby
 bytesize = ZipKit::SizeEstimator.estimate do |z|
@@ -160,8 +177,10 @@ zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip << read_file('myfile2.bin')
 end

-
-[
+hh = zip_body.streaming_http_headers
+hh["Content-Length"] = bytesize.to_s
+
+[200, hh, zip_body]
 ```

 ## Writing ZIP files using the Streamer bypass

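The two README additions above fit together: `ZipKit::SizeEstimator` produces the exact byte count, and `OutputEnumerator#streaming_http_headers` supplies the headers to return alongside the body. A minimal `config.ru` sketch of that combination - the file name and payload are invented for illustration and the sizing must mirror what actually gets written:

```ruby
# config.ru - a sketch of the Content-Length pattern described in the README changes above.
# "greeting.txt" and its payload are made-up illustration values.
require "zip_kit"

PAYLOAD = "Hello from zip_kit!"

run ->(_env) {
  # Pre-compute the exact archive size so that Content-Length can be set.
  # The entry declared here must match what is written into the Streamer below.
  bytesize = ZipKit::SizeEstimator.estimate do |z|
    z.add_stored_entry(filename: "greeting.txt", size: PAYLOAD.bytesize, use_data_descriptor: true)
  end

  # The block only runs once the webserver starts iterating over the body
  body = ZipKit::OutputEnumerator.new do |zip|
    zip.write_stored_file("greeting.txt") { |sink| sink << PAYLOAD }
  end

  headers = body.streaming_http_headers
  headers["Content-Length"] = bytesize.to_s
  [200, headers, body]
}
```
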
data/examples/sinatra_application.rb
ADDED
@@ -0,0 +1,16 @@
+require "sinatra/base"
+
+class SinatraApp < Sinatra::Base
+  get "/" do
+    content_type :zip
+    stream do |out|
+      ZipKit::Streamer.open(out) do |z|
+        z.write_file(File.basename(__FILE__)) do |io|
+          File.open(__FILE__, "r") do |f|
+            IO.copy_stream(f, io)
+          end
+        end
+      end
+    end
+  end
+end

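Sinatra's built-in `stream` helper hands the block an object that accepts `<<`, which is all `ZipKit::Streamer.open` needs. A hypothetical `config.ru` for running the example above (it assumes the file is saved next to it as `sinatra_application.rb`):

```ruby
# config.ru - hypothetical rackup file for the Sinatra example above
require "zip_kit"
require_relative "sinatra_application"

run SinatraApp
```
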
data/lib/zip_kit/block_write.rb
CHANGED
@@ -17,9 +17,12 @@
 # end
 # [200, {}, MyRackResponse.new]
 class ZipKit::BlockWrite
+  include ZipKit::WriteShovel
+
   # Creates a new BlockWrite.
   #
   # @param block The block that will be called when this object receives the `<<` message
+  # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
   def initialize(&block)
     @block = block
   end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
   # @param buf[String] the string to write. Note that a zero-length String
   # will not be forwarded to the block, as it has special meaning when used
   # with chunked encoding (it indicates the end of the stream).
-  # @return
+  # @return [ZipKit::BlockWrite]
   def <<(buf)
     # Zero-size output has a special meaning when using chunked encoding
     return if buf.nil? || buf.bytesize.zero?

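Since `BlockWrite` now includes `ZipKit::WriteShovel`, it responds to `write` as well as `<<`, so it can be used directly as the destination of `IO.copy_stream`. A small sketch - the input path is made up:

```ruby
require "zip_kit"

# Collect the chunks pushed into the BlockWrite, just to make the flow visible
chunks = []
adapter = ZipKit::BlockWrite.new { |bytes| chunks << bytes }

# IO.copy_stream calls #write on its destination; WriteShovel forwards that to #<<
File.open("/tmp/some_input.bin", "rb") do |input|
  IO.copy_stream(input, adapter)
end
```
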
data/lib/zip_kit/output_enumerator.rb
CHANGED
@@ -34,6 +34,15 @@ require "time" # for .httpdate
 #
 # to bypass things like `Rack::ETag` and the nginx buffering.
 class ZipKit::OutputEnumerator
+  # With HTTP output it is better to apply a small amount of buffering. While Streamer
+  # output does not buffer at all, the `OutputEnumerator` does as it is going to
+  # be used as a Rack response body. Applying some buffering helps reduce the number
+  # of syscalls for otherwise tiny writes, which relieves the app webserver from
+  # doing too much work managing those writes. While we recommend buffering, the
+  # buffer size is configurable via the constructor - so you can disable buffering
+  # if you really need to. While ZipKit aims not to buffer, in this instance this
+  # buffering is justified. See https://github.com/WeTransfer/zip_tricks/issues/78
+  # for the background on buffering.
   DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024

   # Creates a new OutputEnumerator enumerator. The enumerator can be read from using `each`,
@@ -60,14 +69,11 @@ class ZipKit::OutputEnumerator
   # ...
   # end
   #
-  # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-  # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-  #
   # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
   # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
   # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
   # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-  # but at block size boundaries or greater). Set
+  # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
   # @param blk a block that will receive the Streamer object when executing. The block will not be executed
   # immediately but only once `each` is called on the OutputEnumerator
   def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
@@ -100,9 +106,14 @@ class ZipKit::OutputEnumerator
   end

   # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+  # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+  # particular response body getting served. You might want to override the headers with your particular
+  # ones - for example, specific content types are needed for files which are, technically, ZIP files
+  # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+  # and ePubs.
   #
   # @return [Hash]
-  def streaming_http_headers
+  def self.streaming_http_headers
     _headers = {
       # We need to ensure Rack::ETag does not suddenly start buffering us, see
       # https://github.com/rack/rack/issues/1619#issuecomment-606315714
@@ -121,6 +132,15 @@ class ZipKit::OutputEnumerator
     }
   end

+  # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+  # presized responses, but is now effectively a no-op.
+  #
+  # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+  # @return [Hash]
+  def streaming_http_headers
+    self.class.streaming_http_headers
+  end
+
   # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
   # an object that can be used as a Rack response body. This method used to accept arguments
   # but will now just ignore them.

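As the expanded documentation for `streaming_http_headers` notes, the defaults can be overridden for formats that are ZIP files under the hood but need their own content type. A sketch for an ePub-style response - the entry name and contents are placeholders, not a complete ePub:

```ruby
require "zip_kit"

# Start from the shared defaults and swap the content type for a ZIP-based format
headers = ZipKit::OutputEnumerator.streaming_http_headers
headers["Content-Type"] = "application/epub+zip"

body = ZipKit::OutputEnumerator.new do |zip|
  zip.write_file("chapter_one.xhtml") { |sink| sink << "<html/>" }
end

[200, headers, body]
```
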
data/lib/zip_kit/rails_streaming.rb
CHANGED
@@ -5,18 +5,29 @@ module ZipKit::RailsStreaming
   # Opens a {ZipKit::Streamer} and yields it to the caller. The output of the streamer
   # gets automatically forwarded to the Rails response stream. When the output completes,
   # the Rails response stream is going to be closed automatically.
+  #
+  # Note that there is an important difference in how this method works, depending whether
+  # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+  # With a standard `ActionController` this method will assign a response body, but streaming
+  # will begin when your action method returns. With `ActionController::Live` the streaming
+  # will begin immediately, before the method returns. In all other aspects the method should
+  # stream correctly in both types of controllers.
+  #
+  # If you encounter buffering (streaming does not start for a very long time) you probably
+  # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+  # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+  # always possible. If you encounter buffering, examine your middleware stack and try to suss
+  # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+  # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+  #
   # @param filename[String] name of the file for the Content-Disposition header
   # @param type[String] the content type (MIME type) of the archive being output
   # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
-  # @param
-  # See {ZipKit::
+  # @param output_enumerator_options[Hash] options that will be passed to the OutputEnumerator - these include
+  # options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
   # @yieldparam [ZipKit::Streamer] the streamer that can be written to
-  # @return [
-  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **
-    # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-    # first will also validate the Streamer options.
-    output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+  # @return [Boolean] always returns true
+  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk)
     # We want some common headers for file sending. Rails will also set
     # self.sending_file = true for us when we call send_file_headers!
     send_file_headers!(type: type, filename: filename)
@@ -29,16 +40,46 @@ module ZipKit::RailsStreaming
       logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
     end

-    headers =
+    headers = ZipKit::OutputEnumerator.streaming_http_headers
+
+    # Allow Rails headers to override ours. This is important if, for example, a content type gets
+    # set to something else than "application/zip"
+    response.headers.reverse_merge!(headers)
+
+    # The output enumerator yields chunks of bytes generated from the Streamer,
+    # with some buffering. See OutputEnumerator docs for more.
+    rack_zip_body = ZipKit::OutputEnumerator.new(**output_enumerator_options, &zip_streaming_blk)

-    #
-    #
+    # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+    # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+    # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+    # There is a recommendation to leave the chunked encoding to the app server, so that servers
+    # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+    # see https://github.com/julik/zip_kit/issues/7
+    # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+    # some especially pesky Rack middleware that just would not cooperate. Those include
+    # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
     if use_chunked_transfer_encoding
-
-
+      response.headers["Transfer-Encoding"] = "chunked"
+      rack_zip_body = ZipKit::RackChunkedBody.new(rack_zip_body)
+    end
+
+    # Time for some branching, which mostly has to do with the 999 flavours of
+    # "how to make both Rails and Rack stream"
+    if self.class.ancestors.include?(ActionController::Live)
+      # If this controller includes Live it will not work correctly with a Rack
+      # response body assignment - the action will just hang. We need to read out the response
+      # body ourselves and write it into the Rails stream.
+      begin
+        rack_zip_body.each { |bytes| response.stream.write(bytes) }
+      ensure
+        response.stream.close
+      end
+    else
+      # Stream using a Rack body assigned to the ActionController response body
+      self.response_body = rack_zip_body
     end

-
-    self.response_body = output_enum
+    true
   end
 end

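For reference, a sketch of how the reworked `zip_kit_stream` is called from controllers - the controller names, file names and contents are hypothetical, and `ZipKit::RailsStreaming` is included explicitly to keep the sketch self-contained:

```ruby
# Plain controller: zip_kit_stream assigns a Rack body, streaming starts after the action returns
class DownloadsController < ApplicationController
  include ZipKit::RailsStreaming

  def report
    zip_kit_stream(filename: "report.zip") do |zip|
      zip.write_file("report.txt") { |sink| sink << "Generated at #{Time.now.utc}" }
    end
  end
end

# With ActionController::Live, zip_kit_stream writes into response.stream itself,
# so streaming begins before the action returns
class LiveDownloadsController < ApplicationController
  include ActionController::Live
  include ZipKit::RailsStreaming

  def report
    zip_kit_stream(filename: "report.zip") do |zip|
      zip.write_file("report.txt") { |sink| sink << "Generated at #{Time.now.utc}" }
    end
  end
end
```
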
data/lib/zip_kit/streamer.rb
CHANGED
@@ -2,19 +2,19 @@

 require "set"

-# Is used to write
-#
-# of this object can be coupled directly to, say, a Rack output. The
-# output can also be a String, Array or anything that responds to `<<`.
+# Is used to write ZIP archives without having to read them back or to overwrite
+# data. It outputs into any object that supports `<<` or `write`, namely:
 #
-#
-#
+# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+# for the `Streamer`.
 #
-#
-#
+# You can also combine output through the `Streamer` with direct output to the destination,
+# all while preserving the correct offsets in the ZIP file structures. This allows usage
+# of `sendfile()` or socket `splice()` calls for "through" proxying.
 #
-#
-#
+# If you want to avoid data descriptors - or write data bypassing the Streamer -
+# you need to know the CRC32 (as a uint) and the filesize upfront,
+# before the writing of the entry body starts.
 #
 # ## Using the Streamer with runtime compression
 #
@@ -34,7 +34,7 @@ require "set"
 # end
 # end
 #
-# The central directory will be written automatically at the end of the block.
+# The central directory will be written automatically at the end of the `open` block.
 #
 # ## Using the Streamer with entries of known size and having a known CRC32 checksum
 #

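The reworded `Streamer` docs list the accepted output destinations. A small sketch writing straight into a `File` - the output path and entry contents are made up:

```ruby
require "zip_kit"

# Any object responding to << or write can be the destination; here it is a File
File.open("/tmp/bundle.zip", "wb") do |out|
  ZipKit::Streamer.open(out) do |zip|
    # write_file picks stored vs. deflated automatically
    zip.write_file("notes.txt") { |sink| sink << "written straight into the File" }
  end
  # The central directory gets written when the open block ends
end
```
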
data/lib/zip_kit/version.rb
CHANGED
data/lib/zip_kit/write_shovel.rb
CHANGED
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
   # Writes the given data to the output stream. Allows the object to be used as
   # a target for `IO.copy_stream(from, to)`
   #
-  # @param
-  # @return [Fixnum] the number of bytes written
+  # @param bytes[String] the binary string to write (part of the uncompressed file)
+  # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
   def write(bytes)
     self << bytes
     bytes.bytesize

data/lib/zip_kit.rb
CHANGED
data/rbi/zip_kit.rbi
CHANGED
@@ -1,6 +1,6 @@
 # typed: strong
 module ZipKit
-  VERSION = T.let("6.2.
+  VERSION = T.let("6.2.2", T.untyped)

   # A ZIP archive contains a flat list of entries. These entries can implicitly
   # create directories when the archive is expanded. For example, an entry with
@@ -100,19 +100,19 @@ module ZipKit
     end
   end

-  # Is used to write
-  #
-  # of this object can be coupled directly to, say, a Rack output. The
-  # output can also be a String, Array or anything that responds to `<<`.
+  # Is used to write ZIP archives without having to read them back or to overwrite
+  # data. It outputs into any object that supports `<<` or `write`, namely:
   #
-  #
-  #
+  # An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+  # for the `Streamer`.
   #
-  #
-  #
+  # You can also combine output through the `Streamer` with direct output to the destination,
+  # all while preserving the correct offsets in the ZIP file structures. This allows usage
+  # of `sendfile()` or socket `splice()` calls for "through" proxying.
   #
-  #
-  #
+  # If you want to avoid data descriptors - or write data bypassing the Streamer -
+  # you need to know the CRC32 (as a uint) and the filesize upfront,
+  # before the writing of the entry body starts.
   #
   # ## Using the Streamer with runtime compression
   #
@@ -132,7 +132,7 @@ module ZipKit
   # end
   # end
   #
-  # The central directory will be written automatically at the end of the block.
+  # The central directory will be written automatically at the end of the `open` block.
   #
   # ## Using the Streamer with entries of known size and having a known CRC32 checksum
   #
@@ -563,13 +563,12 @@ module ZipKit
     sig { params(filename: T.untyped).returns(T.untyped) }
     def remove_backslash(filename); end

-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end

@@ -678,13 +677,12 @@ module ZipKit
     sig { returns(T.untyped) }
     def close; end

-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end
   end
@@ -748,13 +746,12 @@ module ZipKit
     sig { returns(T::Hash[T.untyped, T.untyped]) }
     def finish; end

-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end
   end
@@ -787,13 +784,12 @@ module ZipKit
     sig { returns(T::Hash[T.untyped, T.untyped]) }
     def finish; end

-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end
   end
@@ -1107,19 +1103,28 @@ end, T.untyped)
   # end
   # [200, {}, MyRackResponse.new]
   class BlockWrite
+    include ZipKit::WriteShovel
+
     # Creates a new BlockWrite.
     #
     # _@param_ `block` — The block that will be called when this object receives the `<<` message
-    sig { params(block: T.
+    sig { params(block: T.proc.params(bytes: String).void).void }
     def initialize(&block); end

     # Sends a string through to the block stored in the BlockWrite.
     #
     # _@param_ `buf` — the string to write. Note that a zero-length String will not be forwarded to the block, as it has special meaning when used with chunked encoding (it indicates the end of the stream).
-
-    # _@return_ — self
-    sig { params(buf: String).returns(T.untyped) }
+    sig { params(buf: String).returns(ZipKit::BlockWrite) }
     def <<(buf); end
+
+    # Writes the given data to the output stream. Allows the object to be used as
+    # a target for `IO.copy_stream(from, to)`
+    #
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
+    #
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
+    sig { params(bytes: String).returns(Fixnum) }
+    def write(bytes); end
   end

   # A very barebones ZIP file reader. Is made for maximum interoperability, but at the same
@@ -1657,13 +1662,12 @@ end, T.untyped)
     sig { params(crc32: Fixnum, blob_size: Fixnum).returns(Fixnum) }
     def append(crc32, blob_size); end

-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end
   end
@@ -1728,13 +1732,12 @@ end, T.untyped)
   # "IO-ish" things to also respond to `write`? This is what this module does.
   # Jim would be proud. We miss you, Jim.
   module WriteShovel
-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end
   end
@@ -1960,13 +1963,12 @@ end, T.untyped)
     sig { returns(T.untyped) }
     def tell; end

-    # sord infer - argument name in single @param inferred as "bytes"
     # Writes the given data to the output stream. Allows the object to be used as
     # a target for `IO.copy_stream(from, to)`
     #
-    # _@param_ `
+    # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
     #
-    # _@return_ — the number of bytes written
+    # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
     sig { params(bytes: String).returns(Fixnum) }
     def write(bytes); end
   end
@@ -1977,25 +1979,39 @@ end, T.untyped)
     # gets automatically forwarded to the Rails response stream. When the output completes,
     # the Rails response stream is going to be closed automatically.
     #
+    # Note that there is an important difference in how this method works, depending whether
+    # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+    # With a standard `ActionController` this method will assign a response body, but streaming
+    # will begin when your action method returns. With `ActionController::Live` the streaming
+    # will begin immediately, before the method returns. In all other aspects the method should
+    # stream correctly in both types of controllers.
+    #
+    # If you encounter buffering (streaming does not start for a very long time) you probably
+    # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+    # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+    # always possible. If you encounter buffering, examine your middleware stack and try to suss
+    # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+    # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+    #
     # _@param_ `filename` — name of the file for the Content-Disposition header
     #
     # _@param_ `type` — the content type (MIME type) of the archive being output
     #
     # _@param_ `use_chunked_transfer_encoding` — whether to forcibly encode output as chunked. Normally you should not need this.
     #
-    # _@param_ `
+    # _@param_ `output_enumerator_options` — options that will be passed to the OutputEnumerator - these include options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
     #
-    # _@return_ —
+    # _@return_ — always returns true
     sig do
       params(
         filename: String,
         type: String,
         use_chunked_transfer_encoding: T::Boolean,
-
+        output_enumerator_options: T::Hash[T.untyped, T.untyped],
         zip_streaming_blk: T.proc.params(the: ZipKit::Streamer).void
-      ).returns(
+      ).returns(T::Boolean)
     end
-    def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **
+    def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk); end
   end

   # The output enumerator makes it possible to "pull" from a ZipKit streamer
@@ -2056,15 +2072,11 @@ end, T.untyped)
     # ...
     # end
     #
-    # _@param_ `kwargs_for_new` — keyword arguments for {Streamer.new}
-    #
     # _@param_ `streamer_options` — options for Streamer, see {ZipKit::Streamer.new}
     #
-    # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set
+    # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
     #
     # _@param_ `blk` — a block that will receive the Streamer object when executing. The block will not be executed immediately but only once `each` is called on the OutputEnumerator
-    #
-    # _@return_ — the enumerator you can read bytestrings of the ZIP from by calling `each`
     sig { params(write_buffer_size: Integer, streamer_options: T::Hash[T.untyped, T.untyped], blk: T.untyped).void }
     def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk); end

@@ -2083,6 +2095,18 @@ end, T.untyped)
     def each; end

     # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+    # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+    # particular response body getting served. You might want to override the headers with your particular
+    # ones - for example, specific content types are needed for files which are, technically, ZIP files
+    # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+    # and ePubs.
+    sig { returns(T::Hash[T.untyped, T.untyped]) }
+    def self.streaming_http_headers; end
+
+    # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+    # presized responses, but is now effectively a no-op.
+    #
+    # _@see_ `[ZipKit::OutputEnumerator.streaming_http_headers]`
     sig { returns(T::Hash[T.untyped, T.untyped]) }
     def streaming_http_headers; end

data/zip_kit.gemspec
CHANGED
@@ -7,7 +7,7 @@ Gem::Specification.new do |spec|
   spec.version = ZipKit::VERSION
   spec.authors = ["Julik Tarkhanov", "Noah Berman", "Dmitry Tymchuk", "David Bosveld", "Felix Bünemann"]
   spec.email = ["me@julik.nl"]
-
+  spec.license = "MIT"
   spec.summary = "Stream out ZIP files from Ruby. Successor to zip_tricks."
   spec.description = "Stream out ZIP files from Ruby. Successor to zip_tricks."
   spec.homepage = "https://github.com/julik/zip_kit"
@@ -23,9 +23,12 @@ Gem::Specification.new do |spec|
   spec.require_paths = ["lib"]

   spec.add_development_dependency "bundler"
-  spec.add_development_dependency "rubyzip", "~> 1"

-
+  # zip_kit does not use any runtime dependencies (besides zlib). However, for testing
+  # things quite a few things are used - and for a good reason.
+
+  spec.add_development_dependency "rubyzip", "~> 1" # We test our output with _another_ ZIP library, which is the way to go here
+  spec.add_development_dependency "rack" # For tests where we spin up a server. Both for streaming out and for testing reads over HTTP
   spec.add_development_dependency "rake", "~> 12.2"
   spec.add_development_dependency "rspec", "~> 3"
   spec.add_development_dependency "rspec-mocks", "~> 3.10", ">= 3.10.2" # ruby 3 compatibility
@@ -39,5 +42,6 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "puma"
   spec.add_development_dependency "actionpack", "~> 5" # For testing RailsStreaming against an actual Rails controller
   spec.add_development_dependency "nokogiri", "~> 1", ">= 1.13" # Rails 5 does by mistake use an older Nokogiri otherwise
+  spec.add_development_dependency "sinatra"
   spec.add_development_dependency "sord"
 end

metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: zip_kit
 version: !ruby/object:Gem::Version
-  version: 6.2.
+  version: 6.2.2
 platform: ruby
 authors:
 - Julik Tarkhanov
@@ -12,7 +12,7 @@ authors:
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2024-03-
+date: 2024-03-27 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -250,6 +250,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '1.13'
+- !ruby/object:Gem::Dependency
+  name: sinatra
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: sord
   requirement: !ruby/object:Gem::Requirement
@@ -292,6 +306,7 @@ files:
 - examples/parallel_compression_with_block_deflate.rb
 - examples/rack_application.rb
 - examples/s3_upload.rb
+- examples/sinatra_application.rb
 - lib/zip_kit.rb
 - lib/zip_kit/block_deflate.rb
 - lib/zip_kit/block_write.rb
@@ -324,7 +339,8 @@ files:
 - rbi/zip_kit.rbi
 - zip_kit.gemspec
 homepage: https://github.com/julik/zip_kit
-licenses:
+licenses:
+- MIT
 metadata:
   allowed_push_host: https://rubygems.org
 post_install_message:
@@ -342,7 +358,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.1.6
 signing_key:
 specification_version: 4
 summary: Stream out ZIP files from Ruby. Successor to zip_tricks.