zip_kit 6.2.0 → 6.2.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 14382b872a41cb63ba80b664d0b42135b8105aabbf42f2813716d3b98c1f4ff5
-  data.tar.gz: c78ff42650fba09aa01854a91cf491f51ab307413c01b8cd7c0f0ccb0d8954cb
+  metadata.gz: e1136ebba851638486c9e47150a8d706c49a2bbc0074f457c794582d8ce19089
+  data.tar.gz: 80de3edcb5bc748aaf855a7bf0b1f19439522c8efa4b97754f813fe9413bac2c
 SHA512:
-  metadata.gz: 41a91eda762ca8668fe2696746367ade01b9029f03056f8d9da93b6dfb1f811d4eaec7b1159015287128db1ef94382a2776bdac86872cbe642a087c46154b450
-  data.tar.gz: 676b8fd3e58f255087731cc209249bbba6e9ab8f87269cb182fc3ed62664d0c1a4ae14a51415fb4c9fc5f8674182a795d8f5103a57fb2b5a5ba28441948fa66e
+  metadata.gz: 20c5922a4178f2068a4f06388b201bd263f01c387d308c2c6297feba1c05385d601072cae0451d59a4a0b4e1ba1e354a6fa7f622ff1b58daf70947e6991b1e82
+  data.tar.gz: c373972ec6980000b40d1808247759b44f317a7aa3795b406e02005412cf0687e0f2311e5809f011eb1fbc19e6b2b7eb2a6a8f036cafe27a2645f6476cf0c441
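The checksums above are plain SHA-256/SHA-512 digests of the gem's packaged files. As a hedged illustration (the payload string here is a placeholder, not the actual gem contents), the same kind of values can be reproduced with Ruby's stdlib `Digest`:

```ruby
require "digest"

# Compute the same kind of digests RubyGems stores in checksums.yaml.
# Normally the input would be File.binread("metadata.gz") or similar.
data = "example gem payload" # placeholder bytes for illustration

sha256 = Digest::SHA256.hexdigest(data)
sha512 = Digest::SHA512.hexdigest(data)

puts "SHA256: #{sha256}" # 64 hex characters
puts "SHA512: #{sha512}" # 128 hex characters
```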
data/CHANGELOG.md CHANGED
@@ -1,3 +1,15 @@
+## 6.2.2
+
+* Make sure "zlib" gets required at the top, as it is used everywhere
+* Improve documentation
+* Make sure `zip_kit_stream` honors the custom `Content-Type` parameter
+* Add a streaming example with Sinatra (and add a Sinatra app to the test harness)
+
+## 6.2.1
+
+* Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+* Make `BlockWrite` respond to `write` in addition to `<<`
+
 ## 6.2.0
 
 * Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
@@ -149,7 +161,7 @@
 ## 4.4.2
 
 * Add 2.4 to Travis rubies
-* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_kit/pull/14)
+* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)
 
 ## 4.4.1
 
data/CONTRIBUTING.md CHANGED
@@ -106,11 +106,11 @@ project:
 
 ```bash
 # Clone your fork of the repo into the current directory
-git clone git@github.com:WeTransfer/zip_kit.git
+git clone git@github.com:julik/zip_kit.git
 # Navigate to the newly cloned directory
 cd zip_kit
 # Assign the original repo to a remote called "upstream"
-git remote add upstream git@github.com:WeTransfer/zip_kit.git
+git remote add upstream git@github.com:julik/zip_kit.git
 ```
 
 2. If you cloned a while ago, get the latest changes from upstream:
data/README.md CHANGED
@@ -55,11 +55,11 @@ If you want some more conveniences you can also use [zipline](https://github.com
 will automatically process and stream attachments (Carrierwave, Shrine, ActiveStorage) and remote objects
 via HTTP.
 
-`RailsStreaming` will *not* use [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
-and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
-RSpec) should work normally with controller actions returning ZIPs.
+`RailsStreaming` does *not* require [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
+and will stream without it. See {ZipKit::RailsStreaming#zip_kit_stream} for more details on this. You can use it
+together with `Live` just fine if you need to.
 
-## Writing into other streaming destinations and through streaming wrappers
+## Writing into streaming destinations
 
 Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
 is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -69,23 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
 obj = bucket.object("big.zip")
 obj.upload_stream do |write_stream|
   ZipKit::Streamer.open(write_stream) do |zip|
-    zip.write_file("large.csv") do |sink|
-      CSV(sink) do |csv|
-        csv << ["Line", "Item"]
-        20_000.times do |n|
-          csv << [n, "Item number #{n}"]
-        end
+    zip.write_file("file.csv") do |sink|
+      File.open("large.csv", "rb") do |file_input|
+        IO.copy_stream(file_input, sink)
       end
     end
   end
 end
 ```
 
+## Writing through streaming wrappers
+
 Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
-output with [builder](https://github.com/jimweirich/builder#project-builder)
+output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+every time a complete write call is done:
 
 ```ruby
-zip.write_file('report1.csv') do |sink|
+zip.write_file('employees.xml') do |sink|
   builder = Builder::XmlMarkup.new(target: sink, indent: 2)
   builder.people do
     Person.all.find_each do |person|
@@ -95,14 +95,30 @@ zip.write_file('report1.csv') do |sink|
 end
 ```
 
-and this output will be compressed and output into the ZIP file on the fly. zip_kit composes with any
-Ruby code that streams its output into a destination.
+The output will be compressed and output into the ZIP file on the fly. Same for CSV:
 
-## Create a ZIP file without size estimation, compress on-the-fly during writes
+```ruby
+zip.write_file('line_items.csv') do |sink|
+  CSV(sink) do |csv|
+    csv << ["Line", "Item"]
+    20_000.times do |n|
+      csv << [n, "Item number #{n}"]
+    end
+  end
+end
+```
+
+## Automatic storage mode (stored vs. deflated)
+
+The ZIP file format allows storage in both compressed and raw storage modes. The raw ("stored")
+mode does not require decompression and unarchives faster.
 
-Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
-memory inflation is going to be very constrained. Data will be written to destination at fairly regular
-intervals. Deflate compression will work best for things like text files. For example, here is how to
+ZipKit will buffer a small amount of output and attempt to compress it using deflate compression.
+If this turns out to be significantly smaller than raw data, it is then going to proceed with
+all further output using deflate compression. Memory use is going to be very modest, but it allows
+you to not have to think about the appropriate storage mode.
+
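The heuristic described here can be sketched with stdlib `zlib` alone. This is an illustration of the idea, not ZipKit's actual implementation; `pick_storage_mode` and the 0.75 threshold are made up for the example:

```ruby
require "zlib"

# Sketch of an automatic storage-mode decision: deflate a sample of the
# input and keep deflate only if it is significantly smaller than the raw bytes.
def pick_storage_mode(sample_bytes, ratio_threshold: 0.75)
  deflated = Zlib::Deflate.deflate(sample_bytes)
  deflated.bytesize < (sample_bytes.bytesize * ratio_threshold) ? :deflated : :stored
end

text_like = "Line,Item\n" * 5_000          # repetitive text compresses very well
random_like = Random.new(42).bytes(50_000) # high-entropy data does not

puts pick_storage_mode(text_like)   # deflated
puts pick_storage_mode(random_like) # stored
```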
+Deflate compression will work great for JSONs, CSVs and other text- or text-like formats. For example, here is how to
 output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):
 
 ```ruby
@@ -116,18 +132,16 @@ ZipKit::Streamer.open($stdout) do |zip|
 end
 ```
 
-Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
-since you do not know how large the compressed data segments are going to be.
+If you want to use specific storage modes, use `write_deflated_file` and `write_stored_file` instead of
+`write_file`.
 
 ## Send a ZIP from a Rack response
 
 zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
-by piece, and apply some amount of buffering as well. Note that you might want to wrap
-it with a chunked transfer encoder - the `to_rack_response_headers_and_body` method will do
-that for you. Return the headers and the body to your webserver and you will have your ZIP streamed!
-The block that you give to the `OutputEnumerator` receive the {ZipKit::Streamer} object and will only
-start executing once your response body starts getting iterated over - when actually sending
-the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
 
 ```ruby
 body = ZipKit::OutputEnumerator.new do | zip |
@@ -139,13 +153,16 @@ body = ZipKit::OutputEnumerator.new do | zip |
   end
 end
 
-headers, streaming_body = body.to_rack_response_headers_and_body(env)
-[200, headers, streaming_body]
+[200, body.streaming_http_headers, body]
 ```
 
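The "block only starts executing once the body is iterated" behavior is plain Ruby lazy-enumeration semantics. A minimal, self-contained sketch (the `LazyBody` class is illustrative, not ZipKit's `OutputEnumerator`):

```ruby
# A Rack-style body that generates its chunks lazily: nothing inside the
# generator block runs until a consumer starts iterating with #each.
class LazyBody
  include Enumerable

  def initialize(&generator)
    @generator = generator
  end

  def each(&chunk_handler)
    @generator.call(chunk_handler) # work starts here, not in #initialize
  end
end

started = false
body = LazyBody.new do |emit|
  started = true
  emit.call("PK\x03\x04") # pretend these are the first bytes of a ZIP
  emit.call("...")
end

puts started          # false - nothing has run yet
body.each { |chunk| } # iteration triggers the generator
puts started          # true
```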
 ## Send a ZIP file of known size, with correct headers
 
-Use the `SizeEstimator` to compute the correct size of the resulting archive.
+Sending a file with data descriptors is not always desirable - you don't really know how large your ZIP is going to be.
+If you want to present your users with proper download progress, you would need to set a `Content-Length` header - and
+know ahead of time how large your download is going to be. This can be done with ZipKit, provided you know how large
+the compressed versions of your files are going to be. Use the {ZipKit::SizeEstimator} to do the pre-calculation - it
+is not going to produce any large amounts of output, and will give you a to-the-byte value for your future archive:
 
 ```ruby
 bytesize = ZipKit::SizeEstimator.estimate do |z|
@@ -160,8 +177,10 @@ zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip << read_file('myfile2.bin')
 end
 
-headers, streaming_body = body.to_rack_response_headers_and_body(env, content_length: bytesize)
-[200, headers, streaming_body]
+hh = zip_body.streaming_http_headers
+hh["Content-Length"] = bytesize.to_s
+
+[200, hh, zip_body]
 ```
 
 ## Writing ZIP files using the Streamer bypass
@@ -0,0 +1,16 @@
+require "sinatra/base"
+
+class SinatraApp < Sinatra::Base
+  get "/" do
+    content_type :zip
+    stream do |out|
+      ZipKit::Streamer.open(out) do |z|
+        z.write_file(File.basename(__FILE__)) do |io|
+          File.open(__FILE__, "r") do |f|
+            IO.copy_stream(f, io)
+          end
+        end
+      end
+    end
+  end
+end
@@ -1,7 +1,5 @@
 # frozen_string_literal: true
 
-require "zlib"
-
 # Permits Deflate compression in independent blocks. The workflow is as follows:
 #
 # * Run every block to compress through deflate_chunk, remove the header,
@@ -17,9 +17,12 @@
 #   end
 # [200, {}, MyRackResponse.new]
 class ZipKit::BlockWrite
+  include ZipKit::WriteShovel
+
   # Creates a new BlockWrite.
   #
   # @param block The block that will be called when this object receives the `<<` message
+  # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
   def initialize(&block)
     @block = block
   end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
   # @param buf[String] the string to write. Note that a zero-length String
   # will not be forwarded to the block, as it has special meaning when used
   # with chunked encoding (it indicates the end of the stream).
-  # @return self
+  # @return [ZipKit::BlockWrite]
   def <<(buf)
     # Zero-size output has a special meaning when using chunked encoding
     return if buf.nil? || buf.bytesize.zero?
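The `<<`-plus-`write` pairing that `BlockWrite` and `WriteShovel` provide can be sketched in plain Ruby. This is an illustration of the pattern (the `BlockSink` class is made up), not the gem's exact code:

```ruby
# A minimal writable object in the style of ZipKit::BlockWrite:
# every non-empty string pushed via << is handed to a block, and
# `write` is derived from << so IO.copy_stream can target it.
class BlockSink
  def initialize(&block)
    @block = block
  end

  # Zero-length strings are swallowed on purpose: with chunked transfer
  # encoding an empty chunk would signal end-of-stream.
  def <<(buf)
    @block.call(buf) unless buf.nil? || buf.bytesize.zero?
    self
  end

  # The "write shovel": express write in terms of <<, return bytes written.
  def write(bytes)
    self << bytes
    bytes.bytesize
  end
end

chunks = []
sink = BlockSink.new { |bytes| chunks << bytes }
sink << "hello"
sink << ""          # ignored on purpose
sink.write(" world")
puts chunks.join    # hello world
```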
@@ -34,6 +34,15 @@ require "time" # for .httpdate
 #
 # to bypass things like `Rack::ETag` and the nginx buffering.
 class ZipKit::OutputEnumerator
+  # With HTTP output it is better to apply a small amount of buffering. While Streamer
+  # output does not buffer at all, the `OutputEnumerator` does as it is going to
+  # be used as a Rack response body. Applying some buffering helps reduce the number
+  # of syscalls for otherwise tiny writes, which relieves the app webserver from
+  # doing too much work managing those writes. While we recommend buffering, the
+  # buffer size is configurable via the constructor - so you can disable buffering
+  # if you really need to. While ZipKit aims not to buffer, in this instance this
+  # buffering is justified. See https://github.com/WeTransfer/zip_tricks/issues/78
+  # for the background on buffering.
   DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024
 
   # Creates a new OutputEnumerator enumerator. The enumerator can be read from using `each`,
@@ -60,14 +69,11 @@ class ZipKit::OutputEnumerator
   #   ...
   # end
   #
-  # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-  # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-  #
   # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
   # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
   # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
   # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-  # but at block size boundaries or greater). Set it to 0 for unbuffered writes.
+  # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
   # @param blk a block that will receive the Streamer object when executing. The block will not be executed
   # immediately but only once `each` is called on the OutputEnumerator
   def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
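Write bulkifying of this kind can be sketched with a small stdlib-only buffer. The `WriteBuffer` class below is illustrative (not the gem's internal buffering code): writes accumulate until they reach the buffer size, then flush downstream in one piece.

```ruby
# Coalesces many small writes into fewer, larger downstream writes -
# the same idea as OutputEnumerator's write_buffer_size option.
class WriteBuffer
  def initialize(downstream, buffer_size)
    @downstream = downstream
    @buffer_size = buffer_size
    @buf = +"" # unfrozen string accumulator
  end

  def <<(bytes)
    @buf << bytes
    flush if @buffer_size.zero? || @buf.bytesize >= @buffer_size
    self
  end

  def flush
    @downstream << @buf unless @buf.empty?
    @buf = +""
    self
  end
end

flushes = []
buffered = WriteBuffer.new(flushes, 10)
100.times { buffered << "ab" } # 200 bytes arrive in 100 tiny writes...
buffered.flush
puts flushes.length # 20 ...but reach the downstream in only 20 writes
```

Setting the buffer size to 0 degenerates into a flush on every write, which mirrors "Set the parameter to 0 for unbuffered writes" above.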
@@ -100,9 +106,14 @@ class ZipKit::OutputEnumerator
   end
 
   # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+  # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+  # particular response body getting served. You might want to override the headers with your particular
+  # ones - for example, specific content types are needed for files which are, technically, ZIP files
+  # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+  # and ePubs.
   #
   # @return [Hash]
-  def streaming_http_headers
+  def self.streaming_http_headers
     _headers = {
       # We need to ensure Rack::ETag does not suddenly start buffering us, see
       # https://github.com/rack/rack/issues/1619#issuecomment-606315714
@@ -121,6 +132,15 @@ class ZipKit::OutputEnumerator
     }
   end
 
+  # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+  # presized responses, but is now effectively a no-op.
+  #
+  # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+  # @return [Hash]
+  def streaming_http_headers
+    self.class.streaming_http_headers
+  end
+
   # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
   # an object that can be used as a Rack response body. This method used to accept arguments
   # but will now just ignore them.
@@ -5,18 +5,29 @@ module ZipKit::RailsStreaming
   # Opens a {ZipKit::Streamer} and yields it to the caller. The output of the streamer
   # gets automatically forwarded to the Rails response stream. When the output completes,
   # the Rails response stream is going to be closed automatically.
+  #
+  # Note that there is an important difference in how this method works, depending whether
+  # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+  # With a standard `ActionController` this method will assign a response body, but streaming
+  # will begin when your action method returns. With `ActionController::Live` the streaming
+  # will begin immediately, before the method returns. In all other aspects the method should
+  # stream correctly in both types of controllers.
+  #
+  # If you encounter buffering (streaming does not start for a very long time) you probably
+  # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+  # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+  # always possible. If you encounter buffering, examine your middleware stack and try to suss
+  # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+  # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+  #
   # @param filename[String] name of the file for the Content-Disposition header
   # @param type[String] the content type (MIME type) of the archive being output
   # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
-  # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
-  # See {ZipKit::Streamer#initialize} for the full list of options.
+  # @param output_enumerator_options[Hash] options that will be passed to the OutputEnumerator - these include
+  # options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
   # @yieldparam [ZipKit::Streamer] the streamer that can be written to
-  # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
-  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
-    # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-    # first will also validate the Streamer options.
-    output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+  # @return [Boolean] always returns true
+  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk)
     # We want some common headers for file sending. Rails will also set
     # self.sending_file = true for us when we call send_file_headers!
     send_file_headers!(type: type, filename: filename)
@@ -29,16 +40,46 @@
       logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
     end
 
-    headers = output_enum.streaming_http_headers
+    headers = ZipKit::OutputEnumerator.streaming_http_headers
+
+    # Allow Rails headers to override ours. This is important if, for example, a content type gets
+    # set to something else than "application/zip"
+    response.headers.reverse_merge!(headers)
+
+    # The output enumerator yields chunks of bytes generated from the Streamer,
+    # with some buffering. See OutputEnumerator docs for more.
+    rack_zip_body = ZipKit::OutputEnumerator.new(**output_enumerator_options, &zip_streaming_blk)
 
-    # In rare circumstances (such as the app using Rack::ContentLength - which should normally
-    # not be used allow the user to force the use of the chunked encoding
+    # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+    # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+    # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+    # There is a recommendation to leave the chunked encoding to the app server, so that servers
+    # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+    # see https://github.com/julik/zip_kit/issues/7
+    # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+    # some especially pesky Rack middleware that just would not cooperate. Those include
+    # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
     if use_chunked_transfer_encoding
-      output_enum = ZipKit::RackChunkedBody.new(output_enum)
-      headers["Transfer-Encoding"] = "chunked"
+      response.headers["Transfer-Encoding"] = "chunked"
+      rack_zip_body = ZipKit::RackChunkedBody.new(rack_zip_body)
+    end
+
+    # Time for some branching, which mostly has to do with the 999 flavours of
+    # "how to make both Rails and Rack stream"
+    if self.class.ancestors.include?(ActionController::Live)
+      # If this controller includes Live it will not work correctly with a Rack
+      # response body assignment - the action will just hang. We need to read out the response
+      # body ourselves and write it into the Rails stream.
+      begin
+        rack_zip_body.each { |bytes| response.stream.write(bytes) }
+      ensure
+        response.stream.close
+      end
+    else
+      # Stream using a Rack body assigned to the ActionController response body
+      self.response_body = rack_zip_body
     end
 
-    response.headers.merge!(headers)
-    self.response_body = output_enum
+    true
   end
 end
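The `ancestors.include?` dispatch in the hunk above is plain Ruby reflection. A self-contained sketch of the same check, with made-up module and class names standing in for `ActionController::Live` and real controllers:

```ruby
# Detecting whether a controller-like class mixed in a "live streaming"
# module, the way zip_kit_stream checks for ActionController::Live.
module FakeLive; end

class BufferedController; end

class LiveController
  include FakeLive
end

def streaming_strategy_for(controller_class)
  # `ancestors` lists the class, its superclasses and every included module
  controller_class.ancestors.include?(FakeLive) ? :write_to_stream : :assign_response_body
end

puts streaming_strategy_for(LiveController)     # write_to_stream
puts streaming_strategy_for(BufferedController) # assign_response_body
```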
@@ -2,19 +2,19 @@
 
 require "set"
 
-# Is used to write streamed ZIP archives into the provided IO-ish object.
-# The output IO is never going to be rewound or seeked, so the output
-# of this object can be coupled directly to, say, a Rack output. The
-# output can also be a String, Array or anything that responds to `<<`.
+# Is used to write ZIP archives without having to read them back or to overwrite
+# data. It outputs into any object that supports `<<` or `write`, namely:
 #
-# Allows for splicing raw files (for "stored" entries without compression)
-# and splicing of deflated files (for "deflated" storage mode).
+# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+# for the `Streamer`.
 #
-# For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront,
-# before the writing of the entry body starts.
+# You can also combine output through the `Streamer` with direct output to the destination,
+# all while preserving the correct offsets in the ZIP file structures. This allows usage
+# of `sendfile()` or socket `splice()` calls for "through" proxying.
 #
-# Any object that responds to `<<` can be used as the Streamer target - you can use
-# a String, an Array, a Socket or a File, at your leisure.
+# If you want to avoid data descriptors - or write data bypassing the Streamer -
+# you need to know the CRC32 (as a uint) and the filesize upfront,
+# before the writing of the entry body starts.
 #
 # ## Using the Streamer with runtime compression
 #
@@ -34,7 +34,7 @@ require "set"
 #   end
 # end
 #
-# The central directory will be written automatically at the end of the block.
+# The central directory will be written automatically at the end of the `open` block.
 #
 # ## Using the Streamer with entries of known size and having a known CRC32 checksum
 #
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module ZipKit
-  VERSION = "6.2.0"
+  VERSION = "6.2.2"
 end
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
   # Writes the given data to the output stream. Allows the object to be used as
   # a target for `IO.copy_stream(from, to)`
   #
-  # @param d[String] the binary string to write (part of the uncompressed file)
-  # @return [Fixnum] the number of bytes written
+  # @param bytes[String] the binary string to write (part of the uncompressed file)
+  # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
   def write(bytes)
     self << bytes
     bytes.bytesize
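The shovel idea is small enough to demonstrate end-to-end with the stdlib. Here is a sketch of the same pattern wired to `IO.copy_stream` (the `Shovel` module and `CountingSink` class are illustrative, not the gem's code):

```ruby
require "stringio"

# A WriteShovel-style mixin: any object that implements << gains a
# `write` that IO.copy_stream knows how to call.
module Shovel
  def write(bytes)
    self << bytes
    bytes.bytesize # IO.copy_stream expects the number of bytes consumed
  end
end

# A sink that only knows <<, like many zip_kit write destinations
class CountingSink
  include Shovel
  attr_reader :received

  def initialize
    @received = +""
  end

  def <<(bytes)
    @received << bytes
    self
  end
end

sink = CountingSink.new
copied = IO.copy_stream(StringIO.new("hello, shovel"), sink)
puts copied        # 13
puts sink.received # hello, shovel
```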
data/lib/zip_kit.rb CHANGED
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 
 require_relative "zip_kit/version"
+require "zlib"
 
 module ZipKit
   autoload :OutputEnumerator, File.dirname(__FILE__) + "/zip_kit/rack_body.rb"
data/rbi/zip_kit.rbi CHANGED
@@ -1,6 +1,6 @@
1
1
  # typed: strong
2
2
  module ZipKit
3
- VERSION = T.let("6.2.0", T.untyped)
3
+ VERSION = T.let("6.2.2", T.untyped)
4
4
 
5
5
  # A ZIP archive contains a flat list of entries. These entries can implicitly
6
6
  # create directories when the archive is expanded. For example, an entry with
@@ -100,19 +100,19 @@ module ZipKit
100
100
  end
101
101
  end
102
102
 
103
- # Is used to write streamed ZIP archives into the provided IO-ish object.
104
- # The output IO is never going to be rewound or seeked, so the output
105
- # of this object can be coupled directly to, say, a Rack output. The
106
- # output can also be a String, Array or anything that responds to `<<`.
103
+ # Is used to write ZIP archives without having to read them back or to overwrite
104
+ # data. It outputs into any object that supports `<<` or `write`, namely:
107
105
  #
108
- # Allows for splicing raw files (for "stored" entries without compression)
109
- # and splicing of deflated files (for "deflated" storage mode).
106
+ # An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
107
+ # for the `Streamer`.
110
108
  #
111
- # For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront,
112
- # before the writing of the entry body starts.
109
+ # You can also combine output through the `Streamer` with direct output to the destination,
110
+ # all while preserving the correct offsets in the ZIP file structures. This allows usage
111
+ # of `sendfile()` or socket `splice()` calls for "through" proxying.
113
112
  #
114
- # Any object that responds to `<<` can be used as the Streamer target - you can use
115
- # a String, an Array, a Socket or a File, at your leisure.
113
+ # If you want to avoid data descriptors - or write data bypassing the Streamer -
114
+ # you need to know the CRC32 (as a uint) and the filesize upfront,
115
+ # before the writing of the entry body starts.
116
116
  #
117
117
  # ## Using the Streamer with runtime compression
118
118
  #
@@ -132,7 +132,7 @@ module ZipKit
132
132
  # end
133
133
  # end
134
134
  #
135
- # The central directory will be written automatically at the end of the block.
135
+ # The central directory will be written automatically at the end of the `open` block.
136
136
  #
137
137
  # ## Using the Streamer with entries of known size and having a known CRC32 checksum
138
138
  #
@@ -563,13 +563,12 @@ module ZipKit
563
563
  sig { params(filename: T.untyped).returns(T.untyped) }
564
564
  def remove_backslash(filename); end
565
565
 
566
- # sord infer - argument name in single @param inferred as "bytes"
567
566
  # Writes the given data to the output stream. Allows the object to be used as
568
567
  # a target for `IO.copy_stream(from, to)`
569
568
  #
570
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
569
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
571
570
  #
572
- # _@return_ — the number of bytes written
571
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
573
572
  sig { params(bytes: String).returns(Fixnum) }
574
573
  def write(bytes); end
575
574
 
@@ -678,13 +677,12 @@ module ZipKit
678
677
  sig { returns(T.untyped) }
679
678
  def close; end
680
679
 
681
- # sord infer - argument name in single @param inferred as "bytes"
682
680
  # Writes the given data to the output stream. Allows the object to be used as
683
681
  # a target for `IO.copy_stream(from, to)`
684
682
  #
685
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
683
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
686
684
  #
687
- # _@return_ — the number of bytes written
685
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
688
686
  sig { params(bytes: String).returns(Fixnum) }
689
687
  def write(bytes); end
690
688
  end
@@ -748,13 +746,12 @@ module ZipKit
748
746
  sig { returns(T::Hash[T.untyped, T.untyped]) }
749
747
  def finish; end
750
748
 
751
- # sord infer - argument name in single @param inferred as "bytes"
752
749
  # Writes the given data to the output stream. Allows the object to be used as
753
750
  # a target for `IO.copy_stream(from, to)`
754
751
  #
755
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
752
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
756
753
  #
757
- # _@return_ — the number of bytes written
754
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
758
755
  sig { params(bytes: String).returns(Fixnum) }
759
756
  def write(bytes); end
760
757
  end
@@ -787,13 +784,12 @@ module ZipKit
787
784
  sig { returns(T::Hash[T.untyped, T.untyped]) }
788
785
  def finish; end
789
786
 
790
- # sord infer - argument name in single @param inferred as "bytes"
791
787
  # Writes the given data to the output stream. Allows the object to be used as
792
788
  # a target for `IO.copy_stream(from, to)`
793
789
  #
794
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
790
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
795
791
  #
796
- # _@return_ — the number of bytes written
792
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
797
793
  sig { params(bytes: String).returns(Fixnum) }
798
794
  def write(bytes); end
799
795
  end
@@ -1107,19 +1103,28 @@ end, T.untyped)
  # end
  # [200, {}, MyRackResponse.new]
  class BlockWrite
+ include ZipKit::WriteShovel
+
  # Creates a new BlockWrite.
  #
  # _@param_ `block` — The block that will be called when this object receives the `<<` message
- sig { params(block: T.untyped).void }
+ sig { params(block: T.proc.params(bytes: String).void).void }
  def initialize(&block); end

  # Sends a string through to the block stored in the BlockWrite.
  #
  # _@param_ `buf` — the string to write. Note that a zero-length String will not be forwarded to the block, as it has special meaning when used with chunked encoding (it indicates the end of the stream).
- #
- # _@return_ — self
- sig { params(buf: String).returns(T.untyped) }
+ sig { params(buf: String).returns(ZipKit::BlockWrite) }
  def <<(buf); end
+
+ # Writes the given data to the output stream. Allows the object to be used as
+ # a target for `IO.copy_stream(from, to)`
+ #
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
+ #
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
+ sig { params(bytes: String).returns(Fixnum) }
+ def write(bytes); end
  end

  # A very barebones ZIP file reader. Is made for maximum interoperability, but at the same
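The `BlockWrite` class above adapts a plain Ruby block into a writable, IO-ish object, which is how ZipKit feeds bytes into Rack response bodies. A self-contained sketch of the same adapter pattern (the class name `BlockWriteSketch` is illustrative, not the gem's source):

```ruby
# Adapter that forwards every written chunk to a block. Zero-length strings
# are skipped, because an empty chunk terminates a chunked-encoded HTTP
# response stream and must never be emitted mid-stream.
class BlockWriteSketch
  def initialize(&block)
    @block = block
  end

  # Returns self so writes can be chained, mirroring IO#<<
  def <<(buf)
    @block.call(buf) unless buf.empty?
    self
  end

  # IO.copy_stream-compatible write, returning the byte count
  def write(bytes)
    self << bytes
    bytes.bytesize
  end
end

out = []
writer = BlockWriteSketch.new { |chunk| out << chunk }
writer << "hello" << "" << "world"
puts out.inspect # => ["hello", "world"] - the empty string was dropped
```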
@@ -1657,13 +1662,12 @@ end, T.untyped)
  sig { params(crc32: Fixnum, blob_size: Fixnum).returns(Fixnum) }
  def append(crc32, blob_size); end

- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
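The `append(crc32, blob_size)` signature above reflects a useful zlib property: CRC32 checksums of separately processed parts can be merged into the checksum of the concatenation without re-reading any data, needing only the second part's length. Ruby's standard library exposes this directly:

```ruby
require "zlib"

part_a = "hello "
part_b = "world"

crc_a = Zlib.crc32(part_a)
crc_b = Zlib.crc32(part_b)

# Merge the two running checksums. Only the byte length of the second
# part is required - the data itself is not touched again.
combined = Zlib.crc32_combine(crc_a, crc_b, part_b.bytesize)

puts combined == Zlib.crc32(part_a + part_b) # => true
```

This is what makes it practical to compute the CRC32 of a large archive entry from checksums of its pieces, for example when parts are compressed in parallel.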
@@ -1728,13 +1732,12 @@ end, T.untyped)
  # "IO-ish" things to also respond to `write`? This is what this module does.
  # Jim would be proud. We miss you, Jim.
  module WriteShovel
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1960,13 +1963,12 @@ end, T.untyped)
  sig { returns(T.untyped) }
  def tell; end

- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1977,25 +1979,39 @@ end, T.untyped)
  # gets automatically forwarded to the Rails response stream. When the output completes,
  # the Rails response stream is going to be closed automatically.
  #
+ # Note that there is an important difference in how this method works, depending whether
+ # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+ # With a standard `ActionController` this method will assign a response body, but streaming
+ # will begin when your action method returns. With `ActionController::Live` the streaming
+ # will begin immediately, before the method returns. In all other aspects the method should
+ # stream correctly in both types of controllers.
+ #
+ # If you encounter buffering (streaming does not start for a very long time) you probably
+ # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+ # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+ # always possible. If you encounter buffering, examine your middleware stack and try to suss
+ # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+ # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+ #
  # _@param_ `filename` — name of the file for the Content-Disposition header
  #
  # _@param_ `type` — the content type (MIME type) of the archive being output
  #
  # _@param_ `use_chunked_transfer_encoding` — whether to forcibly encode output as chunked. Normally you should not need this.
  #
- # _@param_ `zip_streamer_options` — options that will be passed to the Streamer. See {ZipKit::Streamer#initialize} for the full list of options.
+ # _@param_ `output_enumerator_options` — options that will be passed to the OutputEnumerator - these include options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
  #
- # _@return_ — The output enumerator assigned to the response body
+ # _@return_ — always returns true
  sig do
    params(
      filename: String,
      type: String,
      use_chunked_transfer_encoding: T::Boolean,
-     zip_streamer_options: T::Hash[T.untyped, T.untyped],
+     output_enumerator_options: T::Hash[T.untyped, T.untyped],
      zip_streaming_blk: T.proc.params(the: ZipKit::Streamer).void
-   ).returns(ZipKit::OutputEnumerator)
+   ).returns(T::Boolean)
  end
- def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk); end
+ def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk); end
  end

  # The output enumerator makes it possible to "pull" from a ZipKit streamer
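The `use_chunked_transfer_encoding` escape hatch discussed above refers to HTTP/1.1 chunked framing, where every body chunk is prefixed with its byte size in hex and a zero-size chunk terminates the stream (which is also why a zero-length string must never be forwarded mid-stream). A minimal sketch of that framing in plain Ruby, independent of the gem:

```ruby
# Frame a sequence of body chunks using HTTP/1.1 chunked transfer encoding:
# each chunk is prefixed with its byte size in hexadecimal followed by CRLF,
# and a zero-size chunk marks the end of the stream.
def chunked_encode(chunks)
  framed = chunks.reject(&:empty?).map do |chunk|
    format("%x\r\n%s\r\n", chunk.bytesize, chunk)
  end
  framed << "0\r\n\r\n" # terminating zero-length chunk
  framed.join
end

puts chunked_encode(["hello", " ", "world"])
# framed stream: "5\r\nhello\r\n1\r\n \r\n5\r\nworld\r\n0\r\n\r\n"
```

Normally the application server applies this framing itself, which is exactly why forcing it from the application layer is discouraged unless a middleware such as `Rack::ContentLength` must be bypassed.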
@@ -2056,15 +2072,11 @@ end, T.untyped)
  # ...
  # end
  #
- # _@param_ `kwargs_for_new` — keyword arguments for {Streamer.new}
- #
  # _@param_ `streamer_options` — options for Streamer, see {ZipKit::Streamer.new}
  #
- # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set it to 0 for unbuffered writes.
+ # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
  #
  # _@param_ `blk` — a block that will receive the Streamer object when executing. The block will not be executed immediately but only once `each` is called on the OutputEnumerator
- #
- # _@return_ — the enumerator you can read bytestrings of the ZIP from by calling `each`
  sig { params(write_buffer_size: Integer, streamer_options: T::Hash[T.untyped, T.untyped], blk: T.untyped).void }
  def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk); end

@@ -2083,6 +2095,18 @@ end, T.untyped)
  def each; end

  # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+ # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+ # particular response body getting served. You might want to override the headers with your particular
+ # ones - for example, specific content types are needed for files which are, technically, ZIP files
+ # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+ # and ePubs.
+ sig { returns(T::Hash[T.untyped, T.untyped]) }
+ def self.streaming_http_headers; end
+
+ # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+ # presized responses, but is now effectively a no-op.
+ #
+ # _@see_ `[ZipKit::OutputEnumerator.streaming_http_headers]`
  sig { returns(T::Hash[T.untyped, T.untyped]) }
  def streaming_http_headers; end

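The `write_buffer_size` parameter documented above coalesces many small `<<` calls into socket-friendly chunks, so `each` yields at buffer-size boundaries rather than on every write. A sketch of that buffering behavior (the class name `WriteBufferSketch` is hypothetical, not the gem's implementation):

```ruby
# Collects writes and flushes them to a sink once the accumulated size
# reaches the buffer threshold - roughly what OutputEnumerator's write
# buffering does to produce socket-sized chunks.
class WriteBufferSketch
  def initialize(sink, buffer_size)
    @sink = sink
    @buffer_size = buffer_size
    @buffer = +"" # unfrozen, mutable accumulator
  end

  def <<(bytes)
    @buffer << bytes
    flush if @buffer.bytesize >= @buffer_size
    self
  end

  # Pushes out whatever has accumulated, even if under the threshold
  def flush
    @sink << @buffer unless @buffer.empty?
    @buffer = +""
    self
  end
end

chunks = []
buf = WriteBufferSketch.new(chunks, 8)
%w[ab cd ef gh ij].each { |s| buf << s }
buf.flush # drain the tail that never reached the threshold
puts chunks.inspect # => ["abcdefgh", "ij"]
```

Setting the threshold to 0 degenerates into unbuffered behavior, which matches the documented semantics of `write_buffer_size: 0`.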
data/zip_kit.gemspec CHANGED
@@ -7,7 +7,7 @@ Gem::Specification.new do |spec|
  spec.version = ZipKit::VERSION
  spec.authors = ["Julik Tarkhanov", "Noah Berman", "Dmitry Tymchuk", "David Bosveld", "Felix Bünemann"]
  spec.email = ["me@julik.nl"]
-
+ spec.license = "MIT"
  spec.summary = "Stream out ZIP files from Ruby. Successor to zip_tricks."
  spec.description = "Stream out ZIP files from Ruby. Successor to zip_tricks."
  spec.homepage = "https://github.com/julik/zip_kit"
@@ -23,9 +23,12 @@ Gem::Specification.new do |spec|
  spec.require_paths = ["lib"]

  spec.add_development_dependency "bundler"
- spec.add_development_dependency "rubyzip", "~> 1"

- spec.add_development_dependency "rack" # For tests where we spin up a server
+ # zip_kit does not use any runtime dependencies (besides zlib). However, for testing
+ # quite a few things are used - and for a good reason.
+
+ spec.add_development_dependency "rubyzip", "~> 1" # We test our output with _another_ ZIP library, which is the way to go here
+ spec.add_development_dependency "rack" # For tests where we spin up a server. Both for streaming out and for testing reads over HTTP
  spec.add_development_dependency "rake", "~> 12.2"
  spec.add_development_dependency "rspec", "~> 3"
  spec.add_development_dependency "rspec-mocks", "~> 3.10", ">= 3.10.2" # ruby 3 compatibility
@@ -39,5 +42,6 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "puma"
  spec.add_development_dependency "actionpack", "~> 5" # For testing RailsStreaming against an actual Rails controller
  spec.add_development_dependency "nokogiri", "~> 1", ">= 1.13" # Rails 5 does by mistake use an older Nokogiri otherwise
+ spec.add_development_dependency "sinatra"
  spec.add_development_dependency "sord"
end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: zip_kit
 version: !ruby/object:Gem::Version
-  version: 6.2.0
+  version: 6.2.2
 platform: ruby
 authors:
 - Julik Tarkhanov
@@ -12,7 +12,7 @@ authors:
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2024-03-11 00:00:00.000000000 Z
+date: 2024-03-27 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -250,6 +250,20 @@ dependencies:
     - - ">="
     - !ruby/object:Gem::Version
       version: '1.13'
+- !ruby/object:Gem::Dependency
+  name: sinatra
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
 - !ruby/object:Gem::Dependency
   name: sord
   requirement: !ruby/object:Gem::Requirement
@@ -292,6 +306,7 @@ files:
 - examples/parallel_compression_with_block_deflate.rb
 - examples/rack_application.rb
 - examples/s3_upload.rb
+- examples/sinatra_application.rb
 - lib/zip_kit.rb
 - lib/zip_kit/block_deflate.rb
 - lib/zip_kit/block_write.rb
@@ -324,7 +339,8 @@ files:
 - rbi/zip_kit.rbi
 - zip_kit.gemspec
 homepage: https://github.com/julik/zip_kit
-licenses: []
+licenses:
+- MIT
 metadata:
   allowed_push_host: https://rubygems.org
 post_install_message:
@@ -342,7 +358,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.3.7
+rubygems_version: 3.1.6
 signing_key:
 specification_version: 4
 summary: Stream out ZIP files from Ruby. Successor to zip_tricks.