zip_kit 6.0.1 → 6.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 01e5705a5a5fd365b524814f91aaca754b983882a1ac596bb0b419fcb8be78d1
- data.tar.gz: 96a01fdbe7ce804c88d8b56ab6319c2e4d67b6b656d5babb7abecfccde2634db
+ metadata.gz: 14382b872a41cb63ba80b664d0b42135b8105aabbf42f2813716d3b98c1f4ff5
+ data.tar.gz: c78ff42650fba09aa01854a91cf491f51ab307413c01b8cd7c0f0ccb0d8954cb
  SHA512:
- metadata.gz: 14f84e439aa7f1dfde017fed0209f8e7d8cb23baf3cf6025d7b0321fe289d51bbdcd8f40a159dd5dbb43eb6869fb79457de04be98e26deea2e8a04765381ef3a
- data.tar.gz: 111a3b10851f30eea6a1c9981b68a021f611b485c99cd9d69239dbe9a40645fcdfd64fc8c8f76f1ae2d607b6bb6e8e8d2ca2a233411bf330ed226c0d0493e4d3
+ metadata.gz: 41a91eda762ca8668fe2696746367ade01b9029f03056f8d9da93b6dfb1f811d4eaec7b1159015287128db1ef94382a2776bdac86872cbe642a087c46154b450
+ data.tar.gz: 676b8fd3e58f255087731cc209249bbba6e9ab8f87269cb182fc3ed62664d0c1a4ae14a51415fb4c9fc5f8674182a795d8f5103a57fb2b5a5ba28441948fa66e

data/CHANGELOG.md CHANGED
@@ -1,3 +1,11 @@
+ ## 6.2.0
+
+ * Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
+
+ ## 6.1.0
+
+ * Add Sorbet `.rbi` for type hints and resolution. This should make developing with zip_kit more pleasant, and the library - more discoverable.
+
  ## 6.0.1

  * Fix `require` for the `VERSION` constant, as Zeitwerk would try to resolve it in Rails context, bringing the entire module under its reloading.

data/README.md CHANGED
@@ -1,5 +1,8 @@
  # zip_kit

+ [![Tests](https://github.com/julik/zip_kit/actions/workflows/ci.yml/badge.svg)](https://github.com/julik/zip_kit/actions/workflows/ci.yml)
+ [![Gem Version](https://badge.fury.io/rb/zip_kit.svg)](https://badge.fury.io/rb/zip_kit)
+
  Allows streaming, non-rewinding ZIP file output from Ruby.

  `zip_kit` is a successor to and continuation of [zip_tricks](https://github.com/WeTransfer/zip_tricks), which
@@ -56,7 +59,7 @@ via HTTP.
  and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
  RSpec) should work normally with controller actions returning ZIPs.

- ## Writing into other streaming destinations
+ ## Writing into other streaming destinations and through streaming wrappers

  Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
  is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -78,8 +81,6 @@ obj.upload_stream do |write_stream|
  end
  ```

- # Writing through an intermediary object
-
  Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
  output with [builder](https://github.com/jimweirich/builder#project-builder)

@@ -101,11 +102,11 @@ Ruby code that streams its output into a destination.

  Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
  memory inflation is going to be very constrained. Data will be written to destination at fairly regular
- intervals. Deflate compression will work best for things like text files.
+ intervals. Deflate compression will work best for things like text files. For example, here is how to
+ output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):

  ```ruby
- out = my_tempfile # can also be a socket
- ZipKit::Streamer.open(out) do |zip|
+ ZipKit::Streamer.open($stdout) do |zip|
  zip.write_file('mov.mp4.txt') do |sink|
  File.open('mov.mp4', 'rb'){|source| IO.copy_stream(source, sink) }
  end
@@ -114,14 +115,16 @@ ZipKit::Streamer.open(out) do |zip|
  end
  end
  ```
+
  Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
  since you do not know how large the compressed data segments are going to be.

  ## Send a ZIP from a Rack response

  zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
- by piece, and apply some amount of buffering as well. Make sure to also wrap your `OutputEnumerator` in a chunker
- by calling `#to_chunked` on it. Return it to your webserver and you will have your ZIP streamed!
+ by piece, and apply some amount of buffering as well. Note that you might want to wrap
+ it with a chunked transfer encoder - the `to_rack_response_headers_and_body` method will do
+ that for you. Return the headers and the body to your webserver and you will have your ZIP streamed!
  The block that you give to the `OutputEnumerator` receive the {ZipKit::Streamer} object and will only
  start executing once your response body starts getting iterated over - when actually sending
  the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
@@ -145,13 +148,11 @@ headers, streaming_body = body.to_rack_response_headers_and_body(env)
  Use the `SizeEstimator` to compute the correct size of the resulting archive.

  ```ruby
- # Precompute the Content-Length ahead of time
  bytesize = ZipKit::SizeEstimator.estimate do |z|
  z.add_stored_entry(filename: 'myfile1.bin', size: 9090821)
  z.add_stored_entry(filename: 'myfile2.bin', size: 458678)
  end

- # Prepare the response body. The block will only be called when the response starts to be written.
  zip_body = ZipKit::OutputEnumerator.new do | zip |
  zip.add_stored_entry(filename: "myfile1.bin", size: 9090821, crc32: 12485)
  zip << read_file('myfile1.bin')
@@ -171,7 +172,6 @@ the metadata of the file upfront (the CRC32 of the uncompressed file and the siz
  to that socket using some accelerated writing technique, and only use the Streamer to write out the ZIP metadata.

  ```ruby
- # io has to be an object that supports #<< or #write()
  ZipKit::Streamer.open(io) do | zip |
  # raw_file is written "as is" (STORED mode).
  # Write the local file header first..

data/Rakefile CHANGED
@@ -16,6 +16,14 @@ YARD::Rake::YardocTask.new(:doc) do |t|
  # miscellaneous documentation files that contain no code
  t.files = ["lib/**/*.rb", "-", "LICENSE.txt", "IMPLEMENTATION_DETAILS.md"]
  end
-
  RSpec::Core::RakeTask.new(:spec)
- task default: [:spec, :standard]
+
+ task :generate_typedefs do
+ `bundle exec sord rbi/zip_kit.rbi`
+ end
+
+ task default: [:spec, :standard, :generate_typedefs]
+
+ # When building the gem, generate typedefs beforehand,
+ # so that they get included
+ Rake::Task["build"].enhance(["generate_typedefs"])

@@ -137,7 +137,7 @@ class ZipKit::FileReader
  # reader = entry.extractor_from(source_file)
  # outfile << reader.extract(512 * 1024) until reader.eof?
  #
- # @return [#extract(n_bytes), #eof?] the reader for the data
+ # @return [StoredReader,InflatingReader] the reader for the data
  def extractor_from(from_io)
  from_io.seek(compressed_data_offset, IO::SEEK_SET)
  case storage_mode

@@ -28,15 +28,11 @@ require "time" # for .httpdate
  # end
  # end
  #
- # Either as a `Transfer-Encoding: chunked` response (if your webserver supports it),
- # which will give you true streaming capability:
+ # You can grab the headers one usually needs for streaming from `#streaming_http_headers`:
  #
- # headers, chunked_or_presized_rack_body = iterable_zip_body.to_headers_and_rack_response_body(env)
- # [200, headers, chunked_or_presized_rack_body]
+ # [200, iterable_zip_body.streaming_http_headers, iterable_zip_body]
  #
- # or it will wrap your output in a `TempfileBody` object which buffers the ZIP before output. Buffering has
- # benefits if your webserver does not support anything beyound HTTP/1.0, and also engages automatically
- # in unit tests (since rack-test and Rails tests do not do streaming HTTP/1.1).
+ # to bypass things like `Rack::ETag` and the nginx buffering.
  class ZipKit::OutputEnumerator
  DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024

@@ -103,17 +99,11 @@ class ZipKit::OutputEnumerator
  end
  end

- # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
- # an object that can be used as a Rack response body. The method will automatically
- # switch the wrapping of the output depending on whether the response can be pre-sized,
- # and whether your downstream webserver (like nginx) is configured to support
- # the HTTP/1.1 protocol version.
+ # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
  #
- # @param rack_env[Hash] the Rack env, which the method may need to mutate (adding a Tempfile for cleanup)
- # @param content_length[Integer] the amount of bytes that the archive will contain. If given, no Chunked encoding gets applied.
- # @return [Array]
- def to_headers_and_rack_response_body(rack_env, content_length: nil)
- headers = {
+ # @return [Hash]
+ def streaming_http_headers
+ _headers = {
  # We need to ensure Rack::ETag does not suddenly start buffering us, see
  # https://github.com/rack/rack/issues/1619#issuecomment-606315714
  # Set this even when not streaming for consistency. The fact that there would be
@@ -124,27 +114,19 @@ class ZipKit::OutputEnumerator
  "Content-Encoding" => "identity",
  # Disable buffering for both nginx and Google Load Balancer, see
  # https://cloud.google.com/appengine/docs/flexible/how-requests-are-handled?tab=python#x-accel-buffering
- "X-Accel-Buffering" => "no"
+ "X-Accel-Buffering" => "no",
+ # Set the correct content type. This should be overridden if you need to
+ # serve things such as EPubs and other derived ZIP formats.
+ "Content-Type" => "application/zip"
  }
+ end

- if content_length
- # If we know the size of the body, transfer encoding is not required at all - so the enumerator itself
- # can function as the Rack body. This also would apply in HTTP/2 contexts where chunked encoding would
- # no longer be required - then the enumerator could get returned "bare".
- body = self
- headers["Content-Length"] = content_length.to_i.to_s
- elsif rack_env["HTTP_VERSION"] == "HTTP/1.0"
- # Check for the proxy configuration first. This is the first common misconfiguration which destroys streaming -
- # since HTTP 1.0 does not support chunked responses we need to revert to buffering. The issue though is that
- # this reversion happens silently and it is usually not clear at all why streaming does not work. So let's at
- # the very least print it to the Rails log.
- body = ZipKit::RackTempfileBody.new(rack_env, self)
- headers["Content-Length"] = body.size.to_s
- else
- body = ZipKit::RackChunkedBody.new(self)
- headers["Transfer-Encoding"] = "chunked"
- end
-
- [headers, body]
+ # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
+ # an object that can be used as a Rack response body. This method used to accept arguments
+ # but will now just ignore them.
+ #
+ # @return [Array]
+ def to_headers_and_rack_response_body(*, **)
+ [streaming_http_headers, self]
  end
  end
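
For illustration, the reworked `OutputEnumerator` above can be returned straight from a plain Rack app. A minimal sketch, assuming a made-up `config.ru` with an invented entry name and contents:

```ruby
# config.ru - sketch only, based on the OutputEnumerator documentation above.
require "zip_kit"

run ->(env) {
  zip_body = ZipKit::OutputEnumerator.new do |zip|
    zip.write_file("hello.txt") { |sink| sink << "Hello from zip_kit" }
  end
  # streaming_http_headers supplies the headers needed to keep Rack::ETag and
  # proxy buffering out of the way; the enumerator itself serves as the body.
  [200, zip_body.streaming_http_headers, zip_body]
}
```
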
@@ -1,7 +1,9 @@
  # frozen_string_literal: true

  # Contains a file handle which can be closed once the response finishes sending.
- # It supports `to_path` so that `Rack::Sendfile` can intercept it
+ # It supports `to_path` so that `Rack::Sendfile` can intercept it.
+ # This class is deprecated and is going to be removed in zip_kit 7.x
+ # @api deprecated
  class ZipKit::RackTempfileBody
  TEMPFILE_NAME_PREFIX = "zip-tricks-tf-body-"
  attr_reader :tempfile

@@ -7,14 +7,15 @@ module ZipKit::RailsStreaming
  # the Rails response stream is going to be closed automatically.
  # @param filename[String] name of the file for the Content-Disposition header
  # @param type[String] the content type (MIME type) of the archive being output
+ # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
  # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
  # See {ZipKit::Streamer#initialize} for the full list of options.
- # @yield [Streamer] the streamer that can be written to
+ # @yieldparam [ZipKit::Streamer] the streamer that can be written to
  # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
- def zip_kit_stream(filename: "download.zip", type: "application/zip", **zip_streamer_options, &zip_streaming_blk)
+ def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
  # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
  # first will also validate the Streamer options.
- chunk_yielder = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
+ output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)

  # We want some common headers for file sending. Rails will also set
  # self.sending_file = true for us when we call send_file_headers!
@@ -28,10 +29,16 @@ module ZipKit::RailsStreaming
  logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
  end

- headers, rack_body = chunk_yielder.to_headers_and_rack_response_body(request.env)
+ headers = output_enum.streaming_http_headers
+
+ # In rare circumstances (such as the app using Rack::ContentLength - which should normally
+ # not be used) allow the user to force the use of the chunked encoding
+ if use_chunked_transfer_encoding
+ output_enum = ZipKit::RackChunkedBody.new(output_enum)
+ headers["Transfer-Encoding"] = "chunked"
+ end

- # Set the "particular" streaming headers
  response.headers.merge!(headers)
- self.response_body = rack_body
+ self.response_body = output_enum
  end
  end
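
To put the new `use_chunked_transfer_encoding` bypass in context, here is a hedged controller sketch (controller, action and file names are invented; forcing chunked encoding should rarely be needed):

```ruby
# Illustrative only: useful when something like Rack::ContentLength would
# otherwise interfere with the streamed response.
class DownloadsController < ApplicationController
  include ZipKit::RailsStreaming

  def archive
    zip_kit_stream(filename: "archive.zip", use_chunked_transfer_encoding: true) do |zip|
      zip.write_file("entries.csv") do |sink|
        sink << "id,name\n1,example\n"
      end
    end
  end
end
```
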
@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "uri"
+
  # An object that fakes just-enough of an IO to be dangerous
  # - or, more precisely, to be useful as a source for the FileReader
  # central directory parser. Effectively we substitute an IO object

@@ -6,6 +6,8 @@ class ZipKit::SizeEstimator

  # Creates a new estimator with a Streamer object. Normally you should use
  # `estimate` instead an not use this method directly.
+ #
+ # @param streamer[ZipKit::Streamer]
  def initialize(streamer)
  @streamer = streamer
  end
@@ -22,7 +24,7 @@ class ZipKit::SizeEstimator
  #
  # @param kwargs_for_streamer_new Any options to pass to Streamer, see {Streamer#initialize}
  # @return [Integer] the size of the resulting archive, in bytes
- # @yield [SizeEstimator] the estimator
+ # @yieldparam [SizeEstimator] the estimator
  def self.estimate(**kwargs_for_streamer_new)
  streamer = ZipKit::Streamer.new(ZipKit::NullWriter, **kwargs_for_streamer_new)
  estimator = new(streamer)

@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "zlib"
+
  # Will be used to pick whether to store a file in the `stored` or
  # `deflated` mode, by compressing the first N bytes of the file and
  # comparing the stored and deflated data sizes. If deflate produces
@@ -10,9 +12,7 @@
  # Heuristic will call either `write_stored_file` or `write_deflated_file`
  # on the Streamer passed into it once it knows which compression
  # method should be applied
- class ZipKit::Streamer::Heuristic
- include ZipKit::WriteShovel
-
+ class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
  BYTES_WRITTEN_THRESHOLD = 128 * 1024
  MINIMUM_VIABLE_COMPRESSION = 0.75

@@ -39,11 +39,6 @@ class ZipKit::Streamer::Heuristic
  self
  end

- def write(bytes)
- self << bytes
- bytes.bytesize
- end
-
  def close
  decide unless @winner
  @winner.close

@@ -169,7 +169,7 @@ class ZipKit::Streamer
  # @param uncompressed_size [Integer] the size of the entry when uncompressed, in bytes
  # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
  # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
  # @return [Integer] the offset the output IO is at after writing the entry header
  def add_deflated_entry(filename:, modification_time: Time.now.utc, compressed_size: 0, uncompressed_size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
  add_file_and_write_local_header(filename: filename,
@@ -193,7 +193,7 @@ class ZipKit::Streamer
  # @param size [Integer] the size of the file when uncompressed, in bytes
  # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
  # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor. When in use
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
  # @return [Integer] the offset the output IO is at after writing the entry header
  def add_stored_entry(filename:, modification_time: Time.now.utc, size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
  add_file_and_write_local_header(filename: filename,
@@ -211,7 +211,7 @@ class ZipKit::Streamer
  #
  # @param dirname [String] the name of the directory in the archive
  # @param modification_time [Time] the modification time of the directory in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
  # @return [Integer] the offset the output IO is at after writing the entry header
  def add_empty_directory(dirname:, modification_time: Time.now.utc, unix_permissions: nil)
  add_file_and_write_local_header(filename: dirname.to_s + "/",
@@ -262,13 +262,12 @@ class ZipKit::Streamer
  #
  # @param filename[String] the name of the file in the archive
  # @param modification_time [Time] the modification time of the file in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
- # @yield
- # sink[#<<, #write]
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+ # @yieldparam sink[ZipKit::Streamer::Writable]
  # an object that the file contents must be written to.
  # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
  # output (using `IO.copy_stream` is a good approach).
- # @return [#<<, #write, #close] an object that the file contents must be written to, has to be closed manually
+ # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
  def write_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
  writable = ZipKit::Streamer::Heuristic.new(self, filename, modification_time: modification_time, unix_permissions: unix_permissions)
  yield_or_return_writable(writable, &blk)
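
As a brief illustration of the return-value documentation above, `write_file` can be used either with a block or by closing the returned `Writable` by hand; the sketch below uses placeholder destination and file names:

```ruby
ZipKit::Streamer.open(out) do |zip|
  # Block form: the sink is closed for you once the block returns.
  zip.write_file("from_block.bin") do |sink|
    File.open("source.bin", "rb") { |f| IO.copy_stream(f, sink) }
  end

  # Blockless form: the returned Writable has to be closed manually.
  sink = zip.write_file("manual.bin")
  sink << "some bytes"
  sink.close
end
```
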
@@ -313,13 +312,12 @@ class ZipKit::Streamer
  #
  # @param filename[String] the name of the file in the archive
  # @param modification_time [Time] the modification time of the file in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
- # @yield
- # sink[#<<, #write]
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+ # @yieldparam sink[ZipKit::Streamer::Writable]
  # an object that the file contents must be written to.
  # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
  # output (using `IO.copy_stream` is a good approach).
- # @return [#<<, #write, #close] an object that the file contents must be written to, has to be closed manually
+ # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
  def write_stored_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
  add_stored_entry(filename: filename,
  modification_time: modification_time,
@@ -373,13 +371,12 @@ class ZipKit::Streamer
  #
  # @param filename[String] the name of the file in the archive
  # @param modification_time [Time] the modification time of the file in the archive
- # @param unix_permissions[Fixnum?] which UNIX permissions to set, normally the default should be used
- # @yield
- # sink[#<<, #write]
+ # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+ # @yieldparam sink[ZipKit::Streamer::Writable]
  # an object that the file contents must be written to.
  # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
  # output (using `IO.copy_stream` is a good approach).
- # @return [#<<, #write, #close] an object that the file contents must be written to, has to be closed manually
+ # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
  def write_deflated_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
  add_deflated_entry(filename: filename,
  modification_time: modification_time,

@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module ZipKit
- VERSION = "6.0.1"
+ VERSION = "6.2.0"
  end