zip_kit 6.0.1 → 6.2.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +14 -1
- data/CONTRIBUTING.md +2 -2
- data/README.md +36 -27
- data/Rakefile +10 -2
- data/lib/zip_kit/block_write.rb +4 -1
- data/lib/zip_kit/file_reader.rb +1 -1
- data/lib/zip_kit/output_enumerator.rb +33 -40
- data/lib/zip_kit/rack_tempfile_body.rb +3 -1
- data/lib/zip_kit/rails_streaming.rb +36 -10
- data/lib/zip_kit/remote_io.rb +2 -0
- data/lib/zip_kit/size_estimator.rb +3 -1
- data/lib/zip_kit/streamer/heuristic.rb +3 -8
- data/lib/zip_kit/streamer.rb +23 -26
- data/lib/zip_kit/version.rb +1 -1
- data/lib/zip_kit/write_shovel.rb +2 -2
- data/lib/zip_kit/zip_writer.rb +182 -106
- data/rbi/zip_kit.rbi +2181 -0
- data/zip_kit.gemspec +7 -3
- metadata +20 -5
- data/.codeclimate.yml +0 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f1d33b58f4501d3ddbae7abcab3957fde0549abf734eae72ca1a7ce45601f479
+  data.tar.gz: e9126924e6fe75329237ba551a1a65218676c7d2b3757f4ad91e73eb0bce154e
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 011e57f856ebe7f625b0bfa5eeb4a240c6c38f2b07ff0434b7e89805516b2d47b6d4230ac404203d00562586909c72d7ea225a7238d47af70fb99ed97b3d50bc
+  data.tar.gz: b68fbaae2e57314c47e7971aeef2150341ba80308c7dee1536718250f5cac01b4b387fe3f73cde5f84fb1ffcd93500343ec451d0fccf9f19b2c0c4a58e74aa2f
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,16 @@
+## 6.2.1
+
+* Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+* Make `BlockWrite` respond to `write` in addition to `<<`
+
+## 6.2.0
+
+* Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
+
+## 6.1.0
+
+* Add Sorbet `.rbi` for type hints and resolution. This should make developing with zip_kit more pleasant, and the library - more discoverable.
+
 ## 6.0.1
 
 * Fix `require` for the `VERSION` constant, as Zeitwerk would try to resolve it in Rails context, bringing the entire module under its reloading.
@@ -141,7 +154,7 @@
 ## 4.4.2
 
 * Add 2.4 to Travis rubies
-* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/
+* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)
 
 ## 4.4.1
data/CONTRIBUTING.md
CHANGED
@@ -106,11 +106,11 @@ project:
 
 ```bash
 # Clone your fork of the repo into the current directory
-git clone git@github.com:
+git clone git@github.com:julik/zip_kit.git
 # Navigate to the newly cloned directory
 cd zip_kit
 # Assign the original repo to a remote called "upstream"
-git remote add upstream git@github.com:
+git remote add upstream git@github.com:julik/zip_kit.git
 ```
 
 2. If you cloned a while ago, get the latest changes from upstream:
data/README.md
CHANGED
@@ -1,5 +1,8 @@
 # zip_kit
 
+[](https://github.com/julik/zip_kit/actions/workflows/ci.yml)
+[](https://badge.fury.io/rb/zip_kit)
+
 Allows streaming, non-rewinding ZIP file output from Ruby.
 
 `zip_kit` is a successor to and continuation of [zip_tricks](https://github.com/WeTransfer/zip_tricks), which
@@ -56,7 +59,7 @@ via HTTP.
 and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
 RSpec) should work normally with controller actions returning ZIPs.
 
-## Writing into
+## Writing into streaming destinations
 
 Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
 is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -66,25 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
 obj = bucket.object("big.zip")
 obj.upload_stream do |write_stream|
   ZipKit::Streamer.open(write_stream) do |zip|
-    zip.write_file("
-
-
-      20_000.times do |n|
-        csv << [n, "Item number #{n}"]
-      end
+    zip.write_file("file.csv") do |sink|
+      File.open("large.csv", "rb") do |file_input|
+        IO.copy_stream(file_input, sink)
       end
     end
   end
 end
 ```
 
-
+## Writing through streaming wrappers
 
 Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
-output with [builder](https://github.com/jimweirich/builder#project-builder)
+output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+every time a complete write call is done:
 
 ```ruby
-zip.write_file('
+zip.write_file('employees.xml') do |sink|
   builder = Builder::XmlMarkup.new(target: sink, indent: 2)
   builder.people do
     Person.all.find_each do |person|
@@ -94,18 +95,28 @@ zip.write_file('report1.csv') do |sink|
   end
 end
 ```
 
-
-
+The output will be compressed and output into the ZIP file on the fly. Same for CSV:
+
+```ruby
+zip.write_file('line_items.csv') do |sink|
+  CSV(sink) do |csv|
+    csv << ["Line", "Item"]
+    20_000.times do |n|
+      csv << [n, "Item number #{n}"]
+    end
+  end
+end
+```
 
 ## Create a ZIP file without size estimation, compress on-the-fly during writes
 
 Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
 memory inflation is going to be very constrained. Data will be written to destination at fairly regular
-intervals. Deflate compression will work best for things like text files.
+intervals. Deflate compression will work best for things like text files. For example, here is how to
+output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):
 
 ```ruby
-
-ZipKit::Streamer.open(out) do |zip|
+ZipKit::Streamer.open($stdout) do |zip|
   zip.write_file('mov.mp4.txt') do |sink|
     File.open('mov.mp4', 'rb'){|source| IO.copy_stream(source, sink) }
   end
@@ -114,17 +125,17 @@ ZipKit::Streamer.open(out) do |zip|
   end
 end
 ```
+
 Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
 since you do not know how large the compressed data segments are going to be.
 
 ## Send a ZIP from a Rack response
 
 zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
-by piece, and apply some amount of buffering as well.
-
-
-
-the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
 
 ```ruby
 body = ZipKit::OutputEnumerator.new do | zip |
@@ -136,8 +147,7 @@ body = ZipKit::OutputEnumerator.new do | zip |
   end
 end
 
-
-[200, headers, streaming_body]
+[200, body.streaming_http_headers, body]
 ```
 
 ## Send a ZIP file of known size, with correct headers
@@ -145,13 +155,11 @@ headers, streaming_body = body.to_rack_response_headers_and_body(env)
 Use the `SizeEstimator` to compute the correct size of the resulting archive.
 
 ```ruby
-# Precompute the Content-Length ahead of time
 bytesize = ZipKit::SizeEstimator.estimate do |z|
   z.add_stored_entry(filename: 'myfile1.bin', size: 9090821)
   z.add_stored_entry(filename: 'myfile2.bin', size: 458678)
 end
 
-# Prepare the response body. The block will only be called when the response starts to be written.
 zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip.add_stored_entry(filename: "myfile1.bin", size: 9090821, crc32: 12485)
   zip << read_file('myfile1.bin')
@@ -159,8 +167,10 @@ zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip << read_file('myfile2.bin')
 end
 
-
-[
+hh = zip_body.streaming_http_headers
+hh["Content-Length"] = bytesize.to_s
+
+[200, hh, zip_body]
 ```
 
 ## Writing ZIP files using the Streamer bypass
@@ -171,7 +181,6 @@ the metadata of the file upfront (the CRC32 of the uncompressed file and the siz
 to that socket using some accelerated writing technique, and only use the Streamer to write out the ZIP metadata.
 
 ```ruby
-# io has to be an object that supports #<< or #write()
 ZipKit::Streamer.open(io) do | zip |
   # raw_file is written "as is" (STORED mode).
   # Write the local file header first..
data/Rakefile
CHANGED
@@ -16,6 +16,14 @@ YARD::Rake::YardocTask.new(:doc) do |t|
   # miscellaneous documentation files that contain no code
   t.files = ["lib/**/*.rb", "-", "LICENSE.txt", "IMPLEMENTATION_DETAILS.md"]
 end
-
 RSpec::Core::RakeTask.new(:spec)
-
+
+task :generate_typedefs do
+  `bundle exec sord rbi/zip_kit.rbi`
+end
+
+task default: [:spec, :standard, :generate_typedefs]
+
+# When building the gem, generate typedefs beforehand,
+# so that they get included
+Rake::Task["build"].enhance(["generate_typedefs"])
data/lib/zip_kit/block_write.rb
CHANGED
@@ -17,9 +17,12 @@
 # end
 # [200, {}, MyRackResponse.new]
 class ZipKit::BlockWrite
+  include ZipKit::WriteShovel
+
   # Creates a new BlockWrite.
   #
   # @param block The block that will be called when this object receives the `<<` message
+  # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
   def initialize(&block)
     @block = block
   end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
   # @param buf[String] the string to write. Note that a zero-length String
   # will not be forwarded to the block, as it has special meaning when used
   # with chunked encoding (it indicates the end of the stream).
-  # @return
+  # @return [ZipKit::BlockWrite]
   def <<(buf)
     # Zero-size output has a special meaning when using chunked encoding
     return if buf.nil? || buf.bytesize.zero?
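The `BlockWrite` change above (gaining `write` via the `WriteShovel` mixin) can be sketched without the gem. The names below are invented stand-ins for `ZipKit::BlockWrite` and `ZipKit::WriteShovel`; the point is that any object implementing `<<` gains a `write` returning the byte count, which is the destination contract `IO.copy_stream` expects:

```ruby
require "stringio"

# Stand-in for ZipKit::WriteShovel: derives `write` from `<<`
module ShovelSketch
  def write(bytes)
    self << bytes
    bytes.bytesize # IO.copy_stream expects the number of bytes written
  end
end

# Stand-in for ZipKit::BlockWrite: forwards every chunk to a block
class BlockWriteSketch
  include ShovelSketch

  def initialize(&block)
    @block = block
  end

  def <<(buf)
    # Zero-size output has special meaning with chunked encoding - skip it
    return self if buf.nil? || buf.bytesize.zero?
    @block.call(buf.b)
    self
  end
end

chunks = []
sink = BlockWriteSketch.new { |bytes| chunks << bytes }
IO.copy_stream(StringIO.new("hello zip"), sink)
```

After the copy, `chunks` holds the streamed bytes in order - which is exactly why responding to `write` (and not only `<<`) matters for interoperability.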
data/lib/zip_kit/file_reader.rb
CHANGED
@@ -137,7 +137,7 @@ class ZipKit::FileReader
   # reader = entry.extractor_from(source_file)
   # outfile << reader.extract(512 * 1024) until reader.eof?
   #
-  # @return [
+  # @return [StoredReader,InflatingReader] the reader for the data
   def extractor_from(from_io)
     from_io.seek(compressed_data_offset, IO::SEEK_SET)
     case storage_mode
data/lib/zip_kit/output_enumerator.rb
CHANGED
@@ -28,15 +28,11 @@ require "time" # for .httpdate
 # end
 # end
 #
-#
-# which will give you true streaming capability:
+# You can grab the headers one usually needs for streaming from `#streaming_http_headers`:
 #
-#
-# [200, headers, chunked_or_presized_rack_body]
+# [200, iterable_zip_body.streaming_http_headers, iterable_zip_body]
 #
-#
-# benefits if your webserver does not support anything beyound HTTP/1.0, and also engages automatically
-# in unit tests (since rack-test and Rails tests do not do streaming HTTP/1.1).
+# to bypass things like `Rack::ETag` and the nginx buffering.
 class ZipKit::OutputEnumerator
   DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024
 
@@ -64,14 +60,11 @@ class ZipKit::OutputEnumerator
   # ...
   # end
   #
-  # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-  # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-  #
   # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
   # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
   # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
   # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-  # but at block size boundaries or greater). Set
+  # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
   # @param blk a block that will receive the Streamer object when executing. The block will not be executed
   # immediately but only once `each` is called on the OutputEnumerator
   def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
@@ -103,17 +96,16 @@ class ZipKit::OutputEnumerator
     end
   end
 
-  # Returns a
-  #
-  #
-  #
-  #
+  # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+  # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+  # particular response body getting served. You might want to override the headers with your particular
+  # ones - for example, specific content types are needed for files which are, technically, ZIP files
+  # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+  # and ePubs.
   #
-  # @
-
-  def to_headers_and_rack_response_body(rack_env, content_length: nil)
-    headers = {
+  # @return [Hash]
+  def self.streaming_http_headers
+    _headers = {
       # We need to ensure Rack::ETag does not suddenly start buffering us, see
       # https://github.com/rack/rack/issues/1619#issuecomment-606315714
       # Set this even when not streaming for consistency. The fact that there would be
@@ -124,27 +116,28 @@ class ZipKit::OutputEnumerator
       "Content-Encoding" => "identity",
       # Disable buffering for both nginx and Google Load Balancer, see
       # https://cloud.google.com/appengine/docs/flexible/how-requests-are-handled?tab=python#x-accel-buffering
-      "X-Accel-Buffering" => "no"
+      "X-Accel-Buffering" => "no",
+      # Set the correct content type. This should be overridden if you need to
+      # serve things such as EPubs and other derived ZIP formats.
+      "Content-Type" => "application/zip"
     }
+  end
 
-    # since HTTP 1.0 does not support chunked responses we need to revert to buffering. The issue though is that
-    # this reversion happens silently and it is usually not clear at all why streaming does not work. So let's at
-    # the very least print it to the Rails log.
-      body = ZipKit::RackTempfileBody.new(rack_env, self)
-      headers["Content-Length"] = body.size.to_s
-    else
-      body = ZipKit::RackChunkedBody.new(self)
-      headers["Transfer-Encoding"] = "chunked"
-    end
+  # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+  # presized responses, but is now effectively a no-op.
+  #
+  # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+  # @return [Hash]
+  def streaming_http_headers
+    self.class.streaming_http_headers
+  end
 
-
+  # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
+  # an object that can be used as a Rack response body. This method used to accept arguments
+  # but will now just ignore them.
+  #
+  # @return [Array]
+  def to_headers_and_rack_response_body(*, **)
+    [streaming_http_headers, self]
   end
 end
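The two properties the `OutputEnumerator` docs above rely on - the block only runs once the body gets iterated, and writes get coalesced into buffer-sized chunks - can be sketched in plain Ruby. The class below is an invented stand-in, not zip_kit code:

```ruby
# Stand-in for the OutputEnumerator contract: a lazily-evaluated,
# buffering Rack-style body. The block runs only when `each` is iterated.
class LazyZipBodySketch
  include Enumerable

  def initialize(write_buffer_size: 8, &blk)
    @write_buffer_size = write_buffer_size
    @blk = blk
  end

  def each
    buffer = +""
    sink = proc do |bytes|
      buffer << bytes
      if buffer.bytesize >= @write_buffer_size
        yield buffer # flush a full chunk to the caller (the webserver)
        buffer = +""
      end
    end
    @blk.call(sink)
    yield buffer unless buffer.empty? # flush the trailing partial chunk
  end
end

side_effects = []
body = LazyZipBodySketch.new(write_buffer_size: 4) do |sink|
  side_effects << :ran
  sink.call("ab")
  sink.call("cd") # buffer reaches 4 bytes and is flushed
  sink.call("e")  # flushed as a trailing chunk when the block finishes
end

before_iteration = side_effects.dup # still empty - nothing ran yet
chunks = body.to_a                  # iteration triggers the block
```

This deferral is what lets a Rack webserver start the ZIP generation only at response-send time, and the chunk coalescing is what keeps writes roughly socket-buffer sized.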
data/lib/zip_kit/rack_tempfile_body.rb
CHANGED
@@ -1,7 +1,9 @@
 # frozen_string_literal: true
 
 # Contains a file handle which can be closed once the response finishes sending.
-# It supports `to_path` so that `Rack::Sendfile` can intercept it
+# It supports `to_path` so that `Rack::Sendfile` can intercept it.
+# This class is deprecated and is going to be removed in zip_kit 7.x
+# @api deprecated
 class ZipKit::RackTempfileBody
   TEMPFILE_NAME_PREFIX = "zip-tricks-tf-body-"
   attr_reader :tempfile
data/lib/zip_kit/rails_streaming.rb
CHANGED
@@ -7,15 +7,12 @@ module ZipKit::RailsStreaming
   # the Rails response stream is going to be closed automatically.
   # @param filename[String] name of the file for the Content-Disposition header
   # @param type[String] the content type (MIME type) of the archive being output
+  # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
   # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
   # See {ZipKit::Streamer#initialize} for the full list of options.
-  # @
+  # @yieldparam [ZipKit::Streamer] the streamer that can be written to
   # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
-  def zip_kit_stream(filename: "download.zip", type: "application/zip", **zip_streamer_options, &zip_streaming_blk)
-    # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-    # first will also validate the Streamer options.
-    chunk_yielder = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
     # We want some common headers for file sending. Rails will also set
     # self.sending_file = true for us when we call send_file_headers!
     send_file_headers!(type: type, filename: filename)
@@ -28,10 +25,39 @@ module ZipKit::RailsStreaming
       logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
     end
 
-    headers
-
-    # Set the "particular" streaming headers
+    headers = ZipKit::OutputEnumerator.streaming_http_headers
     response.headers.merge!(headers)
-
+
+    # The output enumerator yields chunks of bytes generated from the Streamer,
+    # with some buffering
+    output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
+
+    # Time for some branching, which mostly has to do with the 999 flavours of
+    # "how to make both Rails and Rack stream"
+    if self.class.ancestors.include?(ActionController::Live)
+      # If this controller includes Live it will not work correctly with a Rack
+      # response body assignment - we need to write into the Live output stream instead
+      begin
+        output_enum.each { |bytes| response.stream.write(bytes) }
+      ensure
+        response.stream.close
+      end
+    elsif use_chunked_transfer_encoding
+      # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+      # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+      # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+      # There is a recommendation to leave the chunked encoding to the app server, so that servers
+      # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+      # see https://github.com/julik/zip_kit/issues/7
+      # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+      # some especially pesky Rack middleware that just would not cooperate. Those include
+      # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
+      response.headers["Transfer-Encoding"] = "chunked"
+      self.response_body = ZipKit::RackChunkedBody.new(output_enum)
+    else
+      # Stream using a Rack body assigned to the ActionController response body, without
+      # doing explicit chunked encoding. See above for the reasoning.
+      self.response_body = output_enum
+    end
  end
end
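The `ActionController::Live` branch in this diff drains the enumerator into the live stream and closes the stream in an `ensure`, so it is closed even when ZIP generation raises mid-response. Here is a dependency-free sketch of that behaviour; `FakeLiveStream` and `drain_zip_chunks` are invented stand-ins for `response.stream` and the loop above:

```ruby
# Stand-in for an ActionController::Live response stream
class FakeLiveStream
  attr_reader :data

  def initialize
    @data = +""
    @closed = false
  end

  def write(bytes)
    raise IOError, "stream closed" if @closed
    @data << bytes
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Mirrors the shape of the Live branch: write every chunk, then always close
def drain_zip_chunks(chunk_enum, stream)
  chunk_enum.each { |bytes| stream.write(bytes) }
ensure
  # Close even when the enumerator raises, just like the `ensure` in the diff
  stream.close
end

stream = FakeLiveStream.new
drain_zip_chunks(["PK\x03\x04", "rest-of-zip"], stream)
```

Closing the stream is what signals end-of-response to the client; leaving it open is exactly the "response would hang" symptom the 6.2.1 changelog entry describes.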
data/lib/zip_kit/remote_io.rb
CHANGED
data/lib/zip_kit/size_estimator.rb
CHANGED
@@ -6,6 +6,8 @@ class ZipKit::SizeEstimator
 
   # Creates a new estimator with a Streamer object. Normally you should use
   # `estimate` instead an not use this method directly.
+  #
+  # @param streamer[ZipKit::Streamer]
   def initialize(streamer)
     @streamer = streamer
   end
@@ -22,7 +24,7 @@ class ZipKit::SizeEstimator
   #
   # @param kwargs_for_streamer_new Any options to pass to Streamer, see {Streamer#initialize}
   # @return [Integer] the size of the resulting archive, in bytes
-  # @
+  # @yieldparam [SizeEstimator] the estimator
   def self.estimate(**kwargs_for_streamer_new)
     streamer = ZipKit::Streamer.new(ZipKit::NullWriter, **kwargs_for_streamer_new)
     estimator = new(streamer)
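Note that `estimate` runs the real `Streamer` against `ZipKit::NullWriter`, so the size comes out of the same header/footer arithmetic as real output, just without writing anything. A tiny stand-in for that idea - a destination that only counts bytes - looks like this (the class name is invented):

```ruby
# Stand-in for a null writer: swallows bytes, remembers only the total size
class CountingNullWriter
  attr_reader :tell

  def initialize
    @tell = 0
  end

  def <<(bytes)
    @tell += bytes.bytesize
    self # returning self allows << chaining, like IO does
  end
end

writer = CountingNullWriter.new
writer << "local file header" << "file body" << "central directory"
```

Driving the real serialization code against such a destination gives an exact byte count (usable as `Content-Length`) at near-zero cost, since no compression or I/O happens.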
data/lib/zip_kit/streamer/heuristic.rb
CHANGED
@@ -1,5 +1,7 @@
 # frozen_string_literal: true
 
+require "zlib"
+
 # Will be used to pick whether to store a file in the `stored` or
 # `deflated` mode, by compressing the first N bytes of the file and
 # comparing the stored and deflated data sizes. If deflate produces
@@ -10,9 +12,7 @@
 # Heuristic will call either `write_stored_file` or `write_deflated_file`
 # on the Streamer passed into it once it knows which compression
 # method should be applied
-class ZipKit::Streamer::Heuristic
-  include ZipKit::WriteShovel
-
+class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
   BYTES_WRITTEN_THRESHOLD = 128 * 1024
   MINIMUM_VIABLE_COMPRESSION = 0.75
 
@@ -39,11 +39,6 @@ class ZipKit::Streamer::Heuristic
     self
   end
 
-  def write(bytes)
-    self << bytes
-    bytes.bytesize
-  end
-
   def close
     decide unless @winner
     @winner.close
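The stored-vs-deflated decision this class makes - deflate a sample of the input and keep compression only if it pays off - can be sketched with stdlib Zlib alone. The function name is invented; the ratio constant mirrors `MINIMUM_VIABLE_COMPRESSION` above:

```ruby
require "zlib"

# If deflating a sample shrinks it to <= 75% of the original,
# compression is worth it; otherwise store the file as-is.
MINIMUM_VIABLE_COMPRESSION = 0.75

def pick_storage_mode(sample_bytes)
  deflated = Zlib.deflate(sample_bytes)
  ratio = deflated.bytesize / sample_bytes.bytesize.to_f
  (ratio <= MINIMUM_VIABLE_COMPRESSION) ? :deflated : :stored
end

text_mode = pick_storage_mode("repetitive text " * 1_000) # compresses well
random_mode = pick_storage_mode(Random.new.bytes(128 * 1024)) # barely compresses
```

Already-compressed payloads (JPEGs, MP4s, other ZIPs) fall in the second bucket, where deflating again only wastes CPU and can even grow the output slightly.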
data/lib/zip_kit/streamer.rb
CHANGED
@@ -2,19 +2,19 @@
 
 require "set"
 
-# Is used to write
-#
-# of this object can be coupled directly to, say, a Rack output. The
-# output can also be a String, Array or anything that responds to `<<`.
+# Is used to write ZIP archives without having to read them back or to overwrite
+# data. It outputs into any object that supports `<<` or `write`, namely:
 #
-#
-#
+# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+# for the `Streamer`.
 #
-#
-#
+# You can also combine output through the `Streamer` with direct output to the destination,
+# all while preserving the correct offsets in the ZIP file structures. This allows usage
+# of `sendfile()` or socket `splice()` calls for "through" proxying.
 #
-#
-#
+# If you want to avoid data descriptors - or write data bypassing the Streamer -
+# you need to know the CRC32 (as a uint) and the filesize upfront,
+# before the writing of the entry body starts.
 #
 # ## Using the Streamer with runtime compression
 #
@@ -34,7 +34,7 @@ require "set"
 # end
 # end
 #
-# The central directory will be written automatically at the end of the block.
+# The central directory will be written automatically at the end of the `open` block.
 #
 # ## Using the Streamer with entries of known size and having a known CRC32 checksum
 #
@@ -169,7 +169,7 @@ class ZipKit::Streamer
   # @param uncompressed_size [Integer] the size of the entry when uncompressed, in bytes
   # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
   # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_deflated_entry(filename:, modification_time: Time.now.utc, compressed_size: 0, uncompressed_size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
     add_file_and_write_local_header(filename: filename,
@@ -193,7 +193,7 @@ class ZipKit::Streamer
   # @param size [Integer] the size of the file when uncompressed, in bytes
   # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
   # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor. When in use
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_stored_entry(filename:, modification_time: Time.now.utc, size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
     add_file_and_write_local_header(filename: filename,
@@ -211,7 +211,7 @@ class ZipKit::Streamer
   #
   # @param dirname [String] the name of the directory in the archive
   # @param modification_time [Time] the modification time of the directory in the archive
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_empty_directory(dirname:, modification_time: Time.now.utc, unix_permissions: nil)
     add_file_and_write_local_header(filename: dirname.to_s + "/",
@@ -262,13 +262,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     writable = ZipKit::Streamer::Heuristic.new(self, filename, modification_time: modification_time, unix_permissions: unix_permissions)
     yield_or_return_writable(writable, &blk)
@@ -313,13 +312,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_stored_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     add_stored_entry(filename: filename,
                      modification_time: modification_time,
@@ -373,13 +371,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_deflated_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     add_deflated_entry(filename: filename,
                        modification_time: modification_time,
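The `write_file`/`write_stored_file`/`write_deflated_file` docs above all share one contract, implemented by `yield_or_return_writable`: with a block the sink is yielded and closed for you; without a block the sink is returned and the caller must close it manually. A dependency-free sketch of that pattern (the class is an invented stand-in for a Streamer writable):

```ruby
# Stand-in for a Streamer writable sink that must be closed exactly once
class SinkSketch
  def initialize
    @closed = false
  end

  def closed?
    @closed
  end

  def close
    @closed = true
  end
end

# With a block: yield the sink, close it even if the block raises.
# Without a block: hand the open sink back to the caller.
def yield_or_return_writable(writable, &blk)
  if blk
    begin
      yield(writable)
      writable
    ensure
      writable.close
    end
  else
    writable
  end
end

auto = yield_or_return_writable(SinkSketch.new) { |sink| } # closed for us
manual = yield_or_return_writable(SinkSketch.new)          # still open
```

Closing the sink is what finalizes the ZIP entry (flushes the deflater, writes the data descriptor), which is why the docs warn against calling `#close` yourself in the block form.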
data/lib/zip_kit/version.rb
CHANGED
data/lib/zip_kit/write_shovel.rb
CHANGED
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
   # Writes the given data to the output stream. Allows the object to be used as
   # a target for `IO.copy_stream(from, to)`
   #
-  # @param
-  # @return [Fixnum] the number of bytes written
+  # @param bytes[String] the binary string to write (part of the uncompressed file)
+  # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
   def write(bytes)
     self << bytes
     bytes.bytesize