zip_kit 6.0.1 → 6.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 01e5705a5a5fd365b524814f91aaca754b983882a1ac596bb0b419fcb8be78d1
- data.tar.gz: 96a01fdbe7ce804c88d8b56ab6319c2e4d67b6b656d5babb7abecfccde2634db
+ metadata.gz: 14382b872a41cb63ba80b664d0b42135b8105aabbf42f2813716d3b98c1f4ff5
+ data.tar.gz: c78ff42650fba09aa01854a91cf491f51ab307413c01b8cd7c0f0ccb0d8954cb
  SHA512:
- metadata.gz: 14f84e439aa7f1dfde017fed0209f8e7d8cb23baf3cf6025d7b0321fe289d51bbdcd8f40a159dd5dbb43eb6869fb79457de04be98e26deea2e8a04765381ef3a
- data.tar.gz: 111a3b10851f30eea6a1c9981b68a021f611b485c99cd9d69239dbe9a40645fcdfd64fc8c8f76f1ae2d607b6bb6e8e8d2ca2a233411bf330ed226c0d0493e4d3
+ metadata.gz: 41a91eda762ca8668fe2696746367ade01b9029f03056f8d9da93b6dfb1f811d4eaec7b1159015287128db1ef94382a2776bdac86872cbe642a087c46154b450
+ data.tar.gz: 676b8fd3e58f255087731cc209249bbba6e9ab8f87269cb182fc3ed62664d0c1a4ae14a51415fb4c9fc5f8674182a795d8f5103a57fb2b5a5ba28441948fa66e

data/CHANGELOG.md CHANGED
@@ -1,3 +1,11 @@
+ ## 6.2.0
+
+ * Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
+
+ ## 6.1.0
+
+ * Add Sorbet `.rbi` for type hints and resolution. This should make developing with zip_kit more pleasant, and the library - more discoverable.
+
  ## 6.0.1

  * Fix `require` for the `VERSION` constant, as Zeitwerk would try to resolve it in Rails context, bringing the entire module under its reloading.

data/README.md CHANGED
@@ -1,5 +1,8 @@
  # zip_kit

+ [![Tests](https://github.com/julik/zip_kit/actions/workflows/ci.yml/badge.svg)](https://github.com/julik/zip_kit/actions/workflows/ci.yml)
+ [![Gem Version](https://badge.fury.io/rb/zip_kit.svg)](https://badge.fury.io/rb/zip_kit)
+
  Allows streaming, non-rewinding ZIP file output from Ruby.

  `zip_kit` is a successor to and continuation of [zip_tricks](https://github.com/WeTransfer/zip_tricks), which
@@ -56,7 +59,7 @@ via HTTP.
  and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
  RSpec) should work normally with controller actions returning ZIPs.

- ## Writing into other streaming destinations
+ ## Writing into other streaming destinations and through streaming wrappers

  Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
  is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -78,8 +81,6 @@ obj.upload_stream do |write_stream|
  end
  ```

- # Writing through an intermediary object
-
  Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
  output with [builder](https://github.com/jimweirich/builder#project-builder)

@@ -101,11 +102,11 @@ Ruby code that streams its output into a destination.

  Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
  memory inflation is going to be very constrained. Data will be written to destination at fairly regular
- intervals. Deflate compression will work best for things like text files.
+ intervals. Deflate compression will work best for things like text files. For example, here is how to
+ output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):

  ```ruby
- out = my_tempfile # can also be a socket
- ZipKit::Streamer.open(out) do |zip|
+ ZipKit::Streamer.open($stdout) do |zip|
  zip.write_file('mov.mp4.txt') do |sink|
  File.open('mov.mp4', 'rb'){|source| IO.copy_stream(source, sink) }
  end
@@ -114,14 +115,16 @@ ZipKit::Streamer.open(out) do |zip|
  end
  end
  ```
+
  Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
  since you do not know how large the compressed data segments are going to be.

  ## Send a ZIP from a Rack response

  zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
- by piece, and apply some amount of buffering as well. Make sure to also wrap your `OutputEnumerator` in a chunker
- by calling `#to_chunked` on it. Return it to your webserver and you will have your ZIP streamed!
+ by piece, and apply some amount of buffering as well. Note that you might want to wrap
+ it with a chunked transfer encoder - the `to_rack_response_headers_and_body` method will do
+ that for you. Return the headers and the body to your webserver and you will have your ZIP streamed!
  The block that you give to the `OutputEnumerator` receive the {ZipKit::Streamer} object and will only
  start executing once your response body starts getting iterated over - when actually sending
  the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
@@ -145,13 +148,11 @@ headers, streaming_body = body.to_rack_response_headers_and_body(env)
  Use the `SizeEstimator` to compute the correct size of the resulting archive.

  ```ruby
- # Precompute the Content-Length ahead of time
  bytesize = ZipKit::SizeEstimator.estimate do |z|
  z.add_stored_entry(filename: 'myfile1.bin', size: 9090821)
  z.add_stored_entry(filename: 'myfile2.bin', size: 458678)
  end

- # Prepare the response body. The block will only be called when the response starts to be written.
  zip_body = ZipKit::OutputEnumerator.new do | zip |
  zip.add_stored_entry(filename: "myfile1.bin", size: 9090821, crc32: 12485)
  zip << read_file('myfile1.bin')
@@ -171,7 +172,6 @@ the metadata of the file upfront (the CRC32 of the uncompressed file and the siz
  to that socket using some accelerated writing technique, and only use the Streamer to write out the ZIP metadata.

  ```ruby
- # io has to be an object that supports #<< or #write()
  ZipKit::Streamer.open(io) do | zip |
  # raw_file is written "as is" (STORED mode).
  # Write the local file header first..

data/Rakefile CHANGED
@@ -16,6 +16,14 @@ YARD::Rake::YardocTask.new(:doc) do |t|
  # miscellaneous documentation files that contain no code
  t.files = ["lib/**/*.rb", "-", "LICENSE.txt", "IMPLEMENTATION_DETAILS.md"]
  end
-
  RSpec::Core::RakeTask.new(:spec)
- task default: [:spec, :standard]
+
+ task :generate_typedefs do
+ `bundle exec sord rbi/zip_kit.rbi`
+ end
+
+ task default: [:spec, :standard, :generate_typedefs]
+
+ # When building the gem, generate typedefs beforehand,
+ # so that they get included
+ Rake::Task["build"].enhance(["generate_typedefs"])

@@ -137,7 +137,7 @@ class ZipKit::FileReader
  # reader = entry.extractor_from(source_file)
  # outfile << reader.extract(512 * 1024) until reader.eof?
  #
- # @return [#extract(n_bytes), #eof?] the reader for the data
+ # @return [StoredReader,InflatingReader] the reader for the data
  def extractor_from(from_io)
  from_io.seek(compressed_data_offset, IO::SEEK_SET)
  case storage_mode

@@ -28,15 +28,11 @@ require "time" # for .httpdate
  # end
  # end
  #
- # Either as a `Transfer-Encoding: chunked` response (if your webserver supports it),
- # which will give you true streaming capability:
+ # You can grab the headers one usually needs for streaming from `#streaming_http_headers`:
  #
- # headers, chunked_or_presized_rack_body = iterable_zip_body.to_headers_and_rack_response_body(env)
- # [200, headers, chunked_or_presized_rack_body]
+ # [200, iterable_zip_body.streaming_http_headers, iterable_zip_body]
  #
- # or it will wrap your output in a `TempfileBody` object which buffers the ZIP before output. Buffering has
- # benefits if your webserver does not support anything beyound HTTP/1.0, and also engages automatically
- # in unit tests (since rack-test and Rails tests do not do streaming HTTP/1.1).
+ # to bypass things like `Rack::ETag` and the nginx buffering.
  class ZipKit::OutputEnumerator
  DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024

@@ -103,17 +99,11 @@ class ZipKit::OutputEnumerator
  end
  end

- # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
- # an object that can be used as a Rack response body. The method will automatically
- # switch the wrapping of the output depending on whether the response can be pre-sized,
- # and whether your downstream webserver (like nginx) is configured to support
- # the HTTP/1.1 protocol version.
+ # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
  #
- # @param rack_env[Hash] the Rack env, which the method may need to mutate (adding a Tempfile for cleanup)
- # @param content_length[Integer] the amount of bytes that the archive will contain. If given, no Chunked encoding gets applied.
- # @return [Array]
- def to_headers_and_rack_response_body(rack_env, content_length: nil)
- headers = {
+ # @return [Hash]
+ def streaming_http_headers
+ _headers = {
  # We need to ensure Rack::ETag does not suddenly start buffering us, see
  # https://github.com/rack/rack/issues/1619#issuecomment-606315714
  # Set this even when not streaming for consistency. The fact that there would be
@@ -124,27 +114,19 @@ class ZipKit::OutputEnumerator
  "Content-Encoding" => "identity",
  # Disable buffering for both nginx and Google Load Balancer, see
  # https://cloud.google.com/appengine/docs/flexible/how-requests-are-handled?tab=python#x-accel-buffering
- "X-Accel-Buffering" => "no"
+ "X-Accel-Buffering" => "no",
+ # Set the correct content type. This should be overridden if you need to
+ # serve things such as EPubs and other derived ZIP formats.
+ "Content-Type" => "application/zip"
  }
+ end

- if content_length
- # If we know the size of the body, transfer encoding is not required at all - so the enumerator itself
- # can function as the Rack body. This also would apply in HTTP/2 contexts where chunked encoding would
- # no longer be required - then the enumerator could get returned "bare".
- body = self
- headers["Content-Length"] = content_length.to_i.to_s
- elsif rack_env["HTTP_VERSION"] == "HTTP/1.0"
- # Check for the proxy configuration first. This is the first common misconfiguration which destroys streaming -
- # since HTTP 1.0 does not support chunked responses we need to revert to buffering. The issue though is that
- # this reversion happens silently and it is usually not clear at all why streaming does not work. So let's at
- # the very least print it to the Rails log.
- body = ZipKit::RackTempfileBody.new(rack_env, self)
- headers["Content-Length"] = body.size.to_s
- else
- body = ZipKit::RackChunkedBody.new(self)
- headers["Transfer-Encoding"] = "chunked"
- end
-
- [headers, body]
+ # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
+ # an object that can be used as a Rack response body. This method used to accept arguments
+ # but will now just ignore them.
+ #
+ # @return [Array]
+ def to_headers_and_rack_response_body(*, **)
+ [streaming_http_headers, self]
  end
  end
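
For illustration, the reworked `OutputEnumerator` above can be returned straight from a plain Rack app. A minimal sketch, assuming a made-up `config.ru` with an invented entry name and contents:

```ruby
# config.ru - sketch only, based on the OutputEnumerator documentation above.
require "zip_kit"

run ->(env) {
  zip_body = ZipKit::OutputEnumerator.new do |zip|
    zip.write_file("hello.txt") { |sink| sink << "Hello from zip_kit" }
  end
  # streaming_http_headers supplies the headers needed to keep Rack::ETag and
  # proxy buffering out of the way; the enumerator itself serves as the body.
  [200, zip_body.streaming_http_headers, zip_body]
}
```
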
@@ -1,7 +1,9 @@
  # frozen_string_literal: true

  # Contains a file handle which can be closed once the response finishes sending.
- # It supports `to_path` so that `Rack::Sendfile` can intercept it
+ # It supports `to_path` so that `Rack::Sendfile` can intercept it.
+ # This class is deprecated and is going to be removed in zip_kit 7.x
+ # @api deprecated
  class ZipKit::RackTempfileBody
  TEMPFILE_NAME_PREFIX = "zip-tricks-tf-body-"
  attr_reader :tempfile

@@ -7,14 +7,15 @@ module ZipKit::RailsStreaming
  # the Rails response stream is going to be closed automatically.
  # @param filename[String] name of the file for the Content-Disposition header
  # @param type[String] the content type (MIME type) of the archive being output
+ # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
  # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
  # See {ZipKit::Streamer#initialize} for the full list of options.
- # @yield [Streamer] the streamer that can be written to
+ # @yieldparam [ZipKit::Streamer] the streamer that can be written to
  # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
- def zip_kit_stream(filename: "download.zip", type: "application/zip", **zip_streamer_options, &zip_streaming_blk)
+ def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
  # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
  # first will also validate the Streamer options.
- chunk_yielder = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
+ output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)

  # We want some common headers for file sending. Rails will also set
  # self.sending_file = true for us when we call send_file_headers!
@@ -28,10 +29,16 @@ module ZipKit::RailsStreaming
  logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
  end

- headers, rack_body = chunk_yielder.to_headers_and_rack_response_body(request.env)
+ headers = output_enum.streaming_http_headers
+
+ # In rare circumstances (such as the app using Rack::ContentLength - which should normally
+ # not be used) allow the user to force the use of the chunked encoding
+ if use_chunked_transfer_encoding
+ output_enum = ZipKit::RackChunkedBody.new(output_enum)
+ headers["Transfer-Encoding"] = "chunked"
+ end

- # Set the "particular" streaming headers
  response.headers.merge!(headers)
- self.response_body = rack_body
+ self.response_body = output_enum
  end
  end
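
To put the new `use_chunked_transfer_encoding` bypass in context, here is a hedged controller sketch (controller, action and file names are invented; forcing chunked encoding should rarely be needed):

```ruby
# Illustrative only: useful when something like Rack::ContentLength would
# otherwise interfere with the streamed response.
class DownloadsController < ApplicationController
  include ZipKit::RailsStreaming

  def archive
    zip_kit_stream(filename: "archive.zip", use_chunked_transfer_encoding: true) do |zip|
      zip.write_file("entries.csv") do |sink|
        sink << "id,name\n1,example\n"
      end
    end
  end
end
```
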
@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "uri"
+
  # An object that fakes just-enough of an IO to be dangerous
  # - or, more precisely, to be useful as a source for the FileReader
  # central directory parser. Effectively we substitute an IO object

@@ -6,6 +6,8 @@ class ZipKit::SizeEstimator

  # Creates a new estimator with a Streamer object. Normally you should use
  # `estimate` instead an not use this method directly.
+ #
+ # @param streamer[ZipKit::Streamer]
  def initialize(streamer)
  @streamer = streamer
  end
@@ -22,7 +24,7 @@ class ZipKit::SizeEstimator
  #
  # @param kwargs_for_streamer_new Any options to pass to Streamer, see {Streamer#initialize}
  # @return [Integer] the size of the resulting archive, in bytes
- # @yield [SizeEstimator] the estimator
+ # @yieldparam [SizeEstimator] the estimator
  def self.estimate(**kwargs_for_streamer_new)
  streamer = ZipKit::Streamer.new(ZipKit::NullWriter, **kwargs_for_streamer_new)
  estimator = new(streamer)

@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "zlib"
+
  # Will be used to pick whether to store a file in the `stored` or
  # `deflated` mode, by compressing the first N bytes of the file and
  # comparing the stored and deflated data sizes. If deflate produces
@@ -10,9 +12,7 @@
  # Heuristic will call either `write_stored_file` or `write_deflated_file`
  # on the Streamer passed into it once it knows which compression
  # method should be applied
- class ZipKit::Streamer::Heuristic
- include ZipKit::WriteShovel
-
+ class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
  BYTES_WRITTEN_THRESHOLD = 128 * 1024
  MINIMUM_VIABLE_COMPRESSION = 0.75

@@ -39,11 +39,6 @@ class ZipKit::Streamer::Heuristic
  self
  end

- def write(bytes)
- self << bytes
- bytes.bytesize
- end
-
  def close
  decide unless @winner
  @winner.close

@@ -169,7 +169,7 @@ class ZipKit::Streamer
  # @param uncompressed_size [Integer] the size of the entry when uncompressed, in bytes
  # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
  # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
  # @return [Integer] the offset the output IO is at after writing the entry header
  def add_deflated_entry(filename:, modification_time: Time.now.utc, compressed_size: 0, uncompressed_size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
  add_file_and_write_local_header(filename: filename,
@@ -193,7 +193,7 @@ class ZipKit::Streamer
  # @param size [Integer] the size of the file when uncompressed, in bytes
  # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
  # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor. When in use
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
  # @return [Integer] the offset the output IO is at after writing the entry header
  def add_stored_entry(filename:, modification_time: Time.now.utc, size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
  add_file_and_write_local_header(filename: filename,
@@ -211,7 +211,7 @@ class ZipKit::Streamer
  #
  # @param dirname [String] the name of the directory in the archive
  # @param modification_time [Time] the modification time of the directory in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
  # @return [Integer] the offset the output IO is at after writing the entry header
  def add_empty_directory(dirname:, modification_time: Time.now.utc, unix_permissions: nil)
  add_file_and_write_local_header(filename: dirname.to_s + "/",
@@ -262,13 +262,12 @@ class ZipKit::Streamer
  #
  # @param filename[String] the name of the file in the archive
  # @param modification_time [Time] the modification time of the file in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
- # @yield
- # sink[#<<, #write]
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+ # @yieldparam sink[ZipKit::Streamer::Writable]
  # an object that the file contents must be written to.
  # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
  # output (using `IO.copy_stream` is a good approach).
- # @return [#<<, #write, #close] an object that the file contents must be written to, has to be closed manually
+ # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
  def write_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
  writable = ZipKit::Streamer::Heuristic.new(self, filename, modification_time: modification_time, unix_permissions: unix_permissions)
  yield_or_return_writable(writable, &blk)
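
As a brief illustration of the return-value documentation above, `write_file` can be used either with a block or by closing the returned `Writable` by hand; the sketch below uses placeholder destination and file names:

```ruby
ZipKit::Streamer.open(out) do |zip|
  # Block form: the sink is closed for you once the block returns.
  zip.write_file("from_block.bin") do |sink|
    File.open("source.bin", "rb") { |f| IO.copy_stream(f, sink) }
  end

  # Blockless form: the returned Writable has to be closed manually.
  sink = zip.write_file("manual.bin")
  sink << "some bytes"
  sink.close
end
```
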
@@ -313,13 +312,12 @@ class ZipKit::Streamer
  #
  # @param filename[String] the name of the file in the archive
  # @param modification_time [Time] the modification time of the file in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
- # @yield
- # sink[#<<, #write]
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+ # @yieldparam sink[ZipKit::Streamer::Writable]
  # an object that the file contents must be written to.
  # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
  # output (using `IO.copy_stream` is a good approach).
- # @return [#<<, #write, #close] an object that the file contents must be written to, has to be closed manually
+ # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
  def write_stored_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
  add_stored_entry(filename: filename,
  modification_time: modification_time,
@@ -373,13 +371,12 @@ class ZipKit::Streamer
  #
  # @param filename[String] the name of the file in the archive
  # @param modification_time [Time] the modification time of the file in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
- # @yield
- # sink[#<<, #write]
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+ # @yieldparam sink[ZipKit::Streamer::Writable]
  # an object that the file contents must be written to.
  # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
  # output (using `IO.copy_stream` is a good approach).
- # @return [#<<, #write, #close] an object that the file contents must be written to, has to be closed manually
+ # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
  def write_deflated_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
  add_deflated_entry(filename: filename,
  modification_time: modification_time,

@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module ZipKit
- VERSION = "6.0.1"
+ VERSION = "6.2.0"
  end