zip_kit 6.2.0 → 6.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 14382b872a41cb63ba80b664d0b42135b8105aabbf42f2813716d3b98c1f4ff5
- data.tar.gz: c78ff42650fba09aa01854a91cf491f51ab307413c01b8cd7c0f0ccb0d8954cb
+ metadata.gz: e1136ebba851638486c9e47150a8d706c49a2bbc0074f457c794582d8ce19089
+ data.tar.gz: 80de3edcb5bc748aaf855a7bf0b1f19439522c8efa4b97754f813fe9413bac2c
  SHA512:
- metadata.gz: 41a91eda762ca8668fe2696746367ade01b9029f03056f8d9da93b6dfb1f811d4eaec7b1159015287128db1ef94382a2776bdac86872cbe642a087c46154b450
- data.tar.gz: 676b8fd3e58f255087731cc209249bbba6e9ab8f87269cb182fc3ed62664d0c1a4ae14a51415fb4c9fc5f8674182a795d8f5103a57fb2b5a5ba28441948fa66e
+ metadata.gz: 20c5922a4178f2068a4f06388b201bd263f01c387d308c2c6297feba1c05385d601072cae0451d59a4a0b4e1ba1e354a6fa7f622ff1b58daf70947e6991b1e82
+ data.tar.gz: c373972ec6980000b40d1808247759b44f317a7aa3795b406e02005412cf0687e0f2311e5809f011eb1fbc19e6b2b7eb2a6a8f036cafe27a2645f6476cf0c441
data/CHANGELOG.md CHANGED
@@ -1,3 +1,15 @@
+ ## 6.2.2
+
+ * Make sure "zlib" gets required at the top, as it is used everywhere
+ * Improve documentation
+ * Make sure `zip_kit_stream` honors the custom `Content-Type` parameter
+ * Add a streaming example with Sinatra (and add a Sinatra app to the test harness)
+
+ ## 6.2.1
+
+ * Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+ * Make `BlockWrite` respond to `write` in addition to `<<`
+
  ## 6.2.0

  * Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
@@ -149,7 +161,7 @@
  ## 4.4.2

  * Add 2.4 to Travis rubies
- * Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_kit/pull/14)
+ * Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)

  ## 4.4.1

data/CONTRIBUTING.md CHANGED
@@ -106,11 +106,11 @@ project:

  ```bash
  # Clone your fork of the repo into the current directory
- git clone git@github.com:WeTransfer/zip_kit.git
+ git clone git@github.com:julik/zip_kit.git
  # Navigate to the newly cloned directory
  cd zip_kit
  # Assign the original repo to a remote called "upstream"
- git remote add upstream git@github.com:WeTransfer/zip_kit.git
+ git remote add upstream git@github.com:julik/zip_kit.git
  ```

  2. If you cloned a while ago, get the latest changes from upstream:
data/README.md CHANGED
@@ -55,11 +55,11 @@ If you want some more conveniences you can also use [zipline](https://github.com
  will automatically process and stream attachments (Carrierwave, Shrine, ActiveStorage) and remote objects
  via HTTP.

- `RailsStreaming` will *not* use [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
- and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
- RSpec) should work normally with controller actions returning ZIPs.
+ `RailsStreaming` does *not* require [ActionController::Live](https://api.rubyonrails.org/classes/ActionController/Live.html)
+ and will stream without it. See {ZipKit::RailsStreaming#zip_kit_stream} for more details on this. You can use it
+ together with `Live` just fine if you need to.

- ## Writing into other streaming destinations and through streaming wrappers
+ ## Writing into streaming destinations

  Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
  is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -69,23 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
  obj = bucket.object("big.zip")
  obj.upload_stream do |write_stream|
    ZipKit::Streamer.open(write_stream) do |zip|
-     zip.write_file("large.csv") do |sink|
-       CSV(sink) do |csv|
-         csv << ["Line", "Item"]
-         20_000.times do |n|
-           csv << [n, "Item number #{n}"]
-         end
+     zip.write_file("file.csv") do |sink|
+       File.open("large.csv", "rb") do |file_input|
+         IO.copy_stream(file_input, sink)
        end
      end
    end
  end
  ```

+ ## Writing through streaming wrappers
+
  Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
- output with [builder](https://github.com/jimweirich/builder#project-builder)
+ output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+ every time a complete write call is done:

  ```ruby
- zip.write_file('report1.csv') do |sink|
+ zip.write_file('employees.xml') do |sink|
    builder = Builder::XmlMarkup.new(target: sink, indent: 2)
    builder.people do
      Person.all.find_each do |person|
@@ -95,14 +95,30 @@ zip.write_file('report1.csv') do |sink|
  end
  ```

- and this output will be compressed and output into the ZIP file on the fly. zip_kit composes with any
- Ruby code that streams its output into a destination.
+ The output will be compressed and output into the ZIP file on the fly. Same for CSV:

- ## Create a ZIP file without size estimation, compress on-the-fly during writes
+ ```ruby
+ zip.write_file('line_items.csv') do |sink|
+   CSV(sink) do |csv|
+     csv << ["Line", "Item"]
+     20_000.times do |n|
+       csv << [n, "Item number #{n}"]
+     end
+   end
+ end
+ ```
+
+ ## Automatic storage mode (stored vs. deflated)
+
+ The ZIP file format allows storage in both compressed and raw storage modes. The raw ("stored")
+ mode does not require decompression and unarchives faster.

- Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
- memory inflation is going to be very constrained. Data will be written to destination at fairly regular
- intervals. Deflate compression will work best for things like text files. For example, here is how to
+ ZipKit will buffer a small amount of output and attempt to compress it using deflate compression.
+ If this turns out to be significantly smaller than raw data, it is then going to proceed with
+ all further output using deflate compression. Memory use is going to be very modest, but it allows
+ you to not have to think about the appropriate storage mode.
+
+ Deflate compression will work great for JSONs, CSVs and other text- or text-like formats. For example, here is how to
  output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):

  ```ruby
@@ -116,18 +132,16 @@ ZipKit::Streamer.open($stdout) do |zip|
  end
  ```

- Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
- since you do not know how large the compressed data segments are going to be.
+ If you want to use specific storage modes, use `write_deflated_file` and `write_stored_file` instead of
+ `write_file`.

  ## Send a ZIP from a Rack response

  zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
- by piece, and apply some amount of buffering as well. Note that you might want to wrap
- it with a chunked transfer encoder - the `to_rack_response_headers_and_body` method will do
- that for you. Return the headers and the body to your webserver and you will have your ZIP streamed!
- The block that you give to the `OutputEnumerator` receive the {ZipKit::Streamer} object and will only
- start executing once your response body starts getting iterated over - when actually sending
- the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+ by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+ and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+ the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+ over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).

  ```ruby
  body = ZipKit::OutputEnumerator.new do | zip |
@@ -139,13 +153,16 @@ body = ZipKit::OutputEnumerator.new do | zip |
    end
  end

- headers, streaming_body = body.to_rack_response_headers_and_body(env)
- [200, headers, streaming_body]
+ [200, body.streaming_http_headers, body]
  ```

  ## Send a ZIP file of known size, with correct headers

- Use the `SizeEstimator` to compute the correct size of the resulting archive.
+ Sending a file with data descriptors is not always desirable - you don't really know how large your ZIP is going to be.
+ If you want to present your users with proper download progress, you would need to set a `Content-Length` header - and
+ know ahead of time how large your download is going to be. This can be done with ZipKit, provided you know how large
+ the compressed versions of your file are going to be. Use the {ZipKit::SizeEstimator} to do the pre-calculation - it
+ is not going to produce any large amounts of output, and will give you a to-the-byte value for your future archive:

  ```ruby
  bytesize = ZipKit::SizeEstimator.estimate do |z|
@@ -160,8 +177,10 @@
    zip << read_file('myfile2.bin')
  end

- headers, streaming_body = body.to_rack_response_headers_and_body(env, content_length: bytesize)
- [200, headers, streaming_body]
+ hh = zip_body.streaming_http_headers
+ hh["Content-Length"] = bytesize.to_s
+
+ [200, hh, zip_body]
  ```

  ## Writing ZIP files using the Streamer bypass
@@ -0,0 +1,16 @@
+ require "sinatra/base"
+
+ class SinatraApp < Sinatra::Base
+   get "/" do
+     content_type :zip
+     stream do |out|
+       ZipKit::Streamer.open(out) do |z|
+         z.write_file(File.basename(__FILE__)) do |io|
+           File.open(__FILE__, "r") do |f|
+             IO.copy_stream(f, io)
+           end
+         end
+       end
+     end
+   end
+ end
@@ -1,7 +1,5 @@
  # frozen_string_literal: true

- require "zlib"
-
  # Permits Deflate compression in independent blocks. The workflow is as follows:
  #
  # * Run every block to compress through deflate_chunk, remove the header,
@@ -17,9 +17,12 @@
  #   end
  # [200, {}, MyRackResponse.new]
  class ZipKit::BlockWrite
+   include ZipKit::WriteShovel
+
    # Creates a new BlockWrite.
    #
    # @param block The block that will be called when this object receives the `<<` message
+   # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
    def initialize(&block)
      @block = block
    end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
    # @param buf[String] the string to write. Note that a zero-length String
    # will not be forwarded to the block, as it has special meaning when used
    # with chunked encoding (it indicates the end of the stream).
-   # @return self
+   # @return [ZipKit::BlockWrite]
    def <<(buf)
      # Zero-size output has a special meaning when using chunked encoding
      return if buf.nil? || buf.bytesize.zero?
@@ -34,6 +34,15 @@ require "time" # for .httpdate
  #
  # to bypass things like `Rack::ETag` and the nginx buffering.
  class ZipKit::OutputEnumerator
+   # With HTTP output it is better to apply a small amount of buffering. While Streamer
+   # output does not buffer at all, the `OutputEnumerator` does as it is going to
+   # be used as a Rack response body. Applying some buffering helps reduce the number
+   # of syscalls for otherwise tiny writes, which relieves the app webserver from
+   # doing too much work managing those writes. While we recommend buffering, the
+   # buffer size is configurable via the constructor - so you can disable buffering
+   # if you really need to. While ZipKit aims not to buffer, in this instance this
+   # buffering is justified. See https://github.com/WeTransfer/zip_tricks/issues/78
+   # for the background on buffering.
    DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024

    # Creates a new OutputEnumerator enumerator. The enumerator can be read from using `each`,
@@ -60,14 +69,11 @@ class ZipKit::OutputEnumerator
    #     ...
    #   end
    #
-   # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-   # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-   #
    # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
    # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
    # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
    # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-   # but at block size boundaries or greater). Set it to 0 for unbuffered writes.
+   # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
    # @param blk a block that will receive the Streamer object when executing. The block will not be executed
    # immediately but only once `each` is called on the OutputEnumerator
    def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
@@ -100,9 +106,14 @@ class ZipKit::OutputEnumerator
    end

    # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+   # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+   # particular response body getting served. You might want to override the headers with your particular
+   # ones - for example, specific content types are needed for files which are, technically, ZIP files
+   # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+   # and ePubs.
    #
    # @return [Hash]
-   def streaming_http_headers
+   def self.streaming_http_headers
      _headers = {
        # We need to ensure Rack::ETag does not suddenly start buffering us, see
        # https://github.com/rack/rack/issues/1619#issuecomment-606315714
@@ -121,6 +132,15 @@ class ZipKit::OutputEnumerator
      }
    end

+   # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+   # presized responses, but is now effectively a no-op.
+   #
+   # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+   # @return [Hash]
+   def streaming_http_headers
+     self.class.streaming_http_headers
+   end
+
    # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
    # an object that can be used as a Rack response body. This method used to accept arguments
    # but will now just ignore them.
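The write-buffering described in the `DEFAULT_WRITE_BUFFER_SIZE` comment above can be illustrated with a small self-contained sketch. Class and method names below are illustrative only, not the gem's internals: small writes accumulate in a buffer and chunks are only yielded once the configured size is reached, with a final flush for any remainder.

```ruby
# Toy sketch of the OutputEnumerator buffering idea: fewer, larger chunks
# reach the consumer (the webserver) instead of many tiny writes.
class BufferedChunks
  def initialize(buffer_size:, &producer)
    @buffer_size = buffer_size
    @producer = producer
  end

  # Yields buffered chunks; a trailing partial buffer is flushed at the end.
  def each(&emit)
    buffer = +""
    sink = proc do |bytes|
      buffer << bytes
      if buffer.bytesize >= @buffer_size
        emit.call(buffer)
        buffer = +""
      end
    end
    @producer.call(sink)
    emit.call(buffer) unless buffer.empty?
  end
end

chunks = []
body = BufferedChunks.new(buffer_size: 4) { |sink| 6.times { sink.call("ab") } }
body.each { |c| chunks << c }
p chunks  # => ["abab", "abab", "abab"]
```

Setting the buffer size to 0 in such a scheme degenerates into one yield per write, which mirrors the `write_buffer_size: 0` "unbuffered" mode documented above.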
@@ -5,18 +5,29 @@ module ZipKit::RailsStreaming
    # Opens a {ZipKit::Streamer} and yields it to the caller. The output of the streamer
    # gets automatically forwarded to the Rails response stream. When the output completes,
    # the Rails response stream is going to be closed automatically.
+   #
+   # Note that there is an important difference in how this method works, depending on whether
+   # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+   # With a standard `ActionController` this method will assign a response body, but streaming
+   # will begin when your action method returns. With `ActionController::Live` the streaming
+   # will begin immediately, before the method returns. In all other aspects the method should
+   # stream correctly in both types of controllers.
+   #
+   # If you encounter buffering (streaming does not start for a very long time) you probably
+   # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+   # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+   # always possible. If you encounter buffering, examine your middleware stack and try to suss
+   # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+   # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+   #
    # @param filename[String] name of the file for the Content-Disposition header
    # @param type[String] the content type (MIME type) of the archive being output
    # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
-   # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
-   # See {ZipKit::Streamer#initialize} for the full list of options.
+   # @param output_enumerator_options[Hash] options that will be passed to the OutputEnumerator - these include
+   # options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
    # @yieldparam [ZipKit::Streamer] the streamer that can be written to
-   # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
-   def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
-     # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-     # first will also validate the Streamer options.
-     output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+   # @return [Boolean] always returns true
+   def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk)
      # We want some common headers for file sending. Rails will also set
      # self.sending_file = true for us when we call send_file_headers!
      send_file_headers!(type: type, filename: filename)
@@ -29,16 +40,46 @@ module ZipKit::RailsStreaming
        logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
      end

-     headers = output_enum.streaming_http_headers
+     headers = ZipKit::OutputEnumerator.streaming_http_headers
+
+     # Allow Rails headers to override ours. This is important if, for example, a content type gets
+     # set to something else than "application/zip"
+     response.headers.reverse_merge!(headers)
+
+     # The output enumerator yields chunks of bytes generated from the Streamer,
+     # with some buffering. See OutputEnumerator docs for more.
+     rack_zip_body = ZipKit::OutputEnumerator.new(**output_enumerator_options, &zip_streaming_blk)

-     # In rare circumstances (such as the app using Rack::ContentLength - which should normally
-     # not be used allow the user to force the use of the chunked encoding
+     # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+     # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+     # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+     # There is a recommendation to leave the chunked encoding to the app server, so that servers
+     # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+     # see https://github.com/julik/zip_kit/issues/7
+     # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+     # some especially pesky Rack middleware that just would not cooperate. Those include
+     # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
      if use_chunked_transfer_encoding
-       output_enum = ZipKit::RackChunkedBody.new(output_enum)
-       headers["Transfer-Encoding"] = "chunked"
+       response.headers["Transfer-Encoding"] = "chunked"
+       rack_zip_body = ZipKit::RackChunkedBody.new(rack_zip_body)
+     end
+
+     # Time for some branching, which mostly has to do with the 999 flavours of
+     # "how to make both Rails and Rack stream"
+     if self.class.ancestors.include?(ActionController::Live)
+       # If this controller includes Live it will not work correctly with a Rack
+       # response body assignment - the action will just hang. We need to read out the response
+       # body ourselves and write it into the Rails stream.
+       begin
+         rack_zip_body.each { |bytes| response.stream.write(bytes) }
+       ensure
+         response.stream.close
+       end
+     else
+       # Stream using a Rack body assigned to the ActionController response body
+       self.response_body = rack_zip_body
      end

-     response.headers.merge!(headers)
-     self.response_body = output_enum
+     true
    end
  end
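The `ancestors.include?(ActionController::Live)` dispatch in the hunk above can be sketched in isolation. The module and controller classes below are stand-ins, not Rails classes:

```ruby
# Sketch of the strategy selection used in zip_kit_stream above: controllers
# that mix in a "live" module get their body pumped into the response stream,
# everything else gets a Rack body assignment. Names are hypothetical.
module Live; end

class PlainController; end

class LiveController
  include Live
end

def streaming_strategy(controller_class)
  # Module inclusion shows up in the class's ancestor chain
  controller_class.ancestors.include?(Live) ? :write_to_stream : :assign_response_body
end

p streaming_strategy(PlainController) # => :assign_response_body
p streaming_strategy(LiveController)  # => :write_to_stream
```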
@@ -2,19 +2,19 @@

  require "set"

- # Is used to write streamed ZIP archives into the provided IO-ish object.
- # The output IO is never going to be rewound or seeked, so the output
- # of this object can be coupled directly to, say, a Rack output. The
- # output can also be a String, Array or anything that responds to `<<`.
+ # Is used to write ZIP archives without having to read them back or to overwrite
+ # data. It outputs into any object that supports `<<` or `write`, namely:
  #
- # Allows for splicing raw files (for "stored" entries without compression)
- # and splicing of deflated files (for "deflated" storage mode).
+ # An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+ # for the `Streamer`.
  #
- # For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront,
- # before the writing of the entry body starts.
+ # You can also combine output through the `Streamer` with direct output to the destination,
+ # all while preserving the correct offsets in the ZIP file structures. This allows usage
+ # of `sendfile()` or socket `splice()` calls for "through" proxying.
  #
- # Any object that responds to `<<` can be used as the Streamer target - you can use
- # a String, an Array, a Socket or a File, at your leisure.
+ # If you want to avoid data descriptors - or write data bypassing the Streamer -
+ # you need to know the CRC32 (as a uint) and the filesize upfront,
+ # before the writing of the entry body starts.
  #
  # ## Using the Streamer with runtime compression
  #
@@ -34,7 +34,7 @@ require "set"
  #   end
  # end
  #
- # The central directory will be written automatically at the end of the block.
+ # The central directory will be written automatically at the end of the `open` block.
  #
  # ## Using the Streamer with entries of known size and having a known CRC32 checksum
  #
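As the Streamer docs above note, avoiding data descriptors (or writing entry bodies past the Streamer) requires knowing the CRC32 (as a uint) and the byte size upfront. Ruby's bundled zlib, which this release now requires at the top of the library, computes the checksum directly:

```ruby
require "zlib"

# Pre-computing the two values the Streamer docs say you must know upfront
# for pre-declared entries: the CRC32 as an unsigned integer, and the size
# in bytes. The blob here is just illustrative.
data = "some file contents"
crc = Zlib.crc32(data)   # unsigned 32-bit integer
size = data.bytesize

puts format("crc32=%08x size=%d", crc, size)
```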
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module ZipKit
-   VERSION = "6.2.0"
+   VERSION = "6.2.2"
  end
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # @param d[String] the binary string to write (part of the uncompressed file)
-   # @return [Fixnum] the number of bytes written
+   # @param bytes[String] the binary string to write (part of the uncompressed file)
+   # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
    def write(bytes)
      self << bytes
      bytes.bytesize
data/lib/zip_kit.rb CHANGED
@@ -1,6 +1,7 @@
  # frozen_string_literal: true

  require_relative "zip_kit/version"
+ require "zlib"

  module ZipKit
    autoload :OutputEnumerator, File.dirname(__FILE__) + "/zip_kit/rack_body.rb"
data/rbi/zip_kit.rbi CHANGED
@@ -1,6 +1,6 @@
  # typed: strong
  module ZipKit
-   VERSION = T.let("6.2.0", T.untyped)
+   VERSION = T.let("6.2.2", T.untyped)

    # A ZIP archive contains a flat list of entries. These entries can implicitly
    # create directories when the archive is expanded. For example, an entry with
@@ -100,19 +100,19 @@ module ZipKit
      end
    end

-   # Is used to write streamed ZIP archives into the provided IO-ish object.
-   # The output IO is never going to be rewound or seeked, so the output
-   # of this object can be coupled directly to, say, a Rack output. The
-   # output can also be a String, Array or anything that responds to `<<`.
+   # Is used to write ZIP archives without having to read them back or to overwrite
+   # data. It outputs into any object that supports `<<` or `write`, namely:
    #
-   # Allows for splicing raw files (for "stored" entries without compression)
-   # and splicing of deflated files (for "deflated" storage mode).
+   # An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+   # for the `Streamer`.
    #
-   # For stored entries, you need to know the CRC32 (as a uint) and the filesize upfront,
-   # before the writing of the entry body starts.
+   # You can also combine output through the `Streamer` with direct output to the destination,
+   # all while preserving the correct offsets in the ZIP file structures. This allows usage
+   # of `sendfile()` or socket `splice()` calls for "through" proxying.
    #
-   # Any object that responds to `<<` can be used as the Streamer target - you can use
-   # a String, an Array, a Socket or a File, at your leisure.
+   # If you want to avoid data descriptors - or write data bypassing the Streamer -
+   # you need to know the CRC32 (as a uint) and the filesize upfront,
+   # before the writing of the entry body starts.
    #
    # ## Using the Streamer with runtime compression
    #
@@ -132,7 +132,7 @@ module ZipKit
    #   end
    # end
    #
-   # The central directory will be written automatically at the end of the block.
+   # The central directory will be written automatically at the end of the `open` block.
    #
    # ## Using the Streamer with entries of known size and having a known CRC32 checksum
    #
@@ -563,13 +563,12 @@ module ZipKit
    sig { params(filename: T.untyped).returns(T.untyped) }
    def remove_backslash(filename); end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end

@@ -678,13 +677,12 @@
    sig { returns(T.untyped) }
    def close; end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end
  end
@@ -748,13 +746,12 @@
    sig { returns(T::Hash[T.untyped, T.untyped]) }
    def finish; end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end
  end
@@ -787,13 +784,12 @@
    sig { returns(T::Hash[T.untyped, T.untyped]) }
    def finish; end

-   # sord infer - argument name in single @param inferred as "bytes"
    # Writes the given data to the output stream. Allows the object to be used as
    # a target for `IO.copy_stream(from, to)`
    #
-   # _@param_ `d` — the binary string to write (part of the uncompressed file)
+   # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
    #
-   # _@return_ — the number of bytes written
+   # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
    sig { params(bytes: String).returns(Fixnum) }
    def write(bytes); end
  end
@@ -1107,19 +1103,28 @@ end, T.untyped)
  # end
  # [200, {}, MyRackResponse.new]
  class BlockWrite
+ include ZipKit::WriteShovel
+
  # Creates a new BlockWrite.
  #
  # _@param_ `block` — The block that will be called when this object receives the `<<` message
- sig { params(block: T.untyped).void }
+ sig { params(block: T.proc.params(bytes: String).void).void }
  def initialize(&block); end
 
  # Sends a string through to the block stored in the BlockWrite.
  #
  # _@param_ `buf` — the string to write. Note that a zero-length String will not be forwarded to the block, as it has special meaning when used with chunked encoding (it indicates the end of the stream).
- #
- # _@return_ — self
- sig { params(buf: String).returns(T.untyped) }
+ sig { params(buf: String).returns(ZipKit::BlockWrite) }
  def <<(buf); end
+
+ # Writes the given data to the output stream. Allows the object to be used as
+ # a target for `IO.copy_stream(from, to)`
+ #
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
+ #
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
+ sig { params(bytes: String).returns(Fixnum) }
+ def write(bytes); end
  end
 
  # A very barebones ZIP file reader. Is made for maximum interoperability, but at the same
@@ -1657,13 +1662,12 @@ end, T.untyped)
  sig { params(crc32: Fixnum, blob_size: Fixnum).returns(Fixnum) }
  def append(crc32, blob_size); end
 
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1728,13 +1732,12 @@ end, T.untyped)
  # "IO-ish" things to also respond to `write`? This is what this module does.
  # Jim would be proud. We miss you, Jim.
  module WriteShovel
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1960,13 +1963,12 @@ end, T.untyped)
  sig { returns(T.untyped) }
  def tell; end
 
- # sord infer - argument name in single @param inferred as "bytes"
  # Writes the given data to the output stream. Allows the object to be used as
  # a target for `IO.copy_stream(from, to)`
  #
- # _@param_ `d` — the binary string to write (part of the uncompressed file)
+ # _@param_ `bytes` — the binary string to write (part of the uncompressed file)
  #
- # _@return_ — the number of bytes written
+ # _@return_ — the number of bytes written (will always be the bytesize of `bytes`)
  sig { params(bytes: String).returns(Fixnum) }
  def write(bytes); end
  end
@@ -1977,25 +1979,39 @@ end, T.untyped)
  # gets automatically forwarded to the Rails response stream. When the output completes,
  # the Rails response stream is going to be closed automatically.
  #
+ # Note that there is an important difference in how this method works, depending whether
+ # you use it in a controller which includes `ActionController::Live` vs. one that does not.
+ # With a standard `ActionController` this method will assign a response body, but streaming
+ # will begin when your action method returns. With `ActionController::Live` the streaming
+ # will begin immediately, before the method returns. In all other aspects the method should
+ # stream correctly in both types of controllers.
+ #
+ # If you encounter buffering (streaming does not start for a very long time) you probably
+ # have a piece of Rack middleware in your stack which buffers. Known offenders are `Rack::ContentLength`,
+ # `Rack::MiniProfiler` and `Rack::ETag`. ZipKit will try to work around these but it is not
+ # always possible. If you encounter buffering, examine your middleware stack and try to suss
+ # out whether any middleware might be buffering. You can also try setting `use_chunked_transfer_encoding`
+ # to `true` - this is not recommended but sometimes necessary, for example to bypass `Rack::ContentLength`.
+ #
  # _@param_ `filename` — name of the file for the Content-Disposition header
  #
  # _@param_ `type` — the content type (MIME type) of the archive being output
  #
  # _@param_ `use_chunked_transfer_encoding` — whether to forcibly encode output as chunked. Normally you should not need this.
  #
- # _@param_ `zip_streamer_options` — options that will be passed to the Streamer. See {ZipKit::Streamer#initialize} for the full list of options.
+ # _@param_ `output_enumerator_options` — options that will be passed to the OutputEnumerator - these include options for the Streamer. See {ZipKit::OutputEnumerator#initialize} for the full list of options.
  #
- # _@return_ — The output enumerator assigned to the response body
+ # _@return_ — always returns true
  sig do
  params(
  filename: String,
  type: String,
  use_chunked_transfer_encoding: T::Boolean,
- zip_streamer_options: T::Hash[T.untyped, T.untyped],
+ output_enumerator_options: T::Hash[T.untyped, T.untyped],
  zip_streaming_blk: T.proc.params(the: ZipKit::Streamer).void
- ).returns(ZipKit::OutputEnumerator)
+ ).returns(T::Boolean)
  end
- def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk); end
+ def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **output_enumerator_options, &zip_streaming_blk); end
  end
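In controller terms, the `zip_kit_stream` behavior documented above looks roughly like this. This is a sketch, not runnable on its own (it assumes a Rails app); the controller name, `reports.zip`, and the CSV contents are invented for illustration:

```ruby
class ReportsController < ApplicationController
  include ZipKit::RailsStreaming
  # Uncomment to switch to immediate streaming, as described above -
  # with a plain controller, streaming starts once the action returns:
  # include ActionController::Live

  def download
    zip_kit_stream(filename: "reports.zip", type: "application/zip") do |zip|
      # The block receives a ZipKit::Streamer; each write_file call
      # produces one entry in the streamed archive
      zip.write_file("report.csv") do |sink|
        sink << "id,name\n"
        sink << "1,example\n"
      end
    end
  end
end
```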
 
  # The output enumerator makes it possible to "pull" from a ZipKit streamer
@@ -2056,15 +2072,11 @@ end, T.untyped)
  # ...
  # end
  #
- # _@param_ `kwargs_for_new` — keyword arguments for {Streamer.new}
- #
  # _@param_ `streamer_options` — options for Streamer, see {ZipKit::Streamer.new}
  #
- # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set it to 0 for unbuffered writes.
+ # _@param_ `write_buffer_size` — By default all ZipKit writes are unbuffered. For output to sockets it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
  #
  # _@param_ `blk` — a block that will receive the Streamer object when executing. The block will not be executed immediately but only once `each` is called on the OutputEnumerator
- #
- # _@return_ — the enumerator you can read bytestrings of the ZIP from by calling `each`
  sig { params(write_buffer_size: Integer, streamer_options: T::Hash[T.untyped, T.untyped], blk: T.untyped).void }
  def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk); end
 
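The "pull" semantics noted above - the block is only stored at construction and runs once `each` is called - can be sketched without the gem. `TinyOutputEnumerator` is hypothetical; the real OutputEnumerator yields a full `ZipKit::Streamer` and additionally coalesces writes up to `write_buffer_size`:

```ruby
# Hypothetical sketch of OutputEnumerator's deferred execution, not the gem's code
class TinyOutputEnumerator
  include Enumerable

  def initialize(&blk)
    @blk = blk # stored, not executed - execution is deferred until #each
  end

  def each(&downstream)
    return enum_for(:each) unless block_given?
    # Stand-in for the Streamer: anything pushed with << goes downstream
    sink = Object.new
    sink.define_singleton_method(:<<) do |bytes|
      downstream.call(bytes)
      self
    end
    @blk.call(sink)
  end
end

started = false
enum = TinyOutputEnumerator.new do |zip|
  started = true
  zip << "PK\x03\x04" # a ZIP local file header starts with these bytes
  zip << "...entry bytes..."
end

puts started        # => false (nothing has run yet)
chunks = enum.to_a  # now the stored block runs, yielding each chunk
puts started        # => true
puts chunks.size    # => 2
```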
@@ -2083,6 +2095,18 @@ end, T.untyped)
  def each; end
 
  # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+ # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+ # particular response body getting served. You might want to override the headers with your particular
+ # ones - for example, specific content types are needed for files which are, technically, ZIP files
+ # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+ # and ePubs.
+ sig { returns(T::Hash[T.untyped, T.untyped]) }
+ def self.streaming_http_headers; end
+
+ # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+ # presized responses, but is now effectively a no-op.
+ #
+ # _@see_ `[ZipKit::OutputEnumerator.streaming_http_headers]`
  sig { returns(T::Hash[T.untyped, T.untyped]) }
  def streaming_http_headers; end
 
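Overriding the common streaming headers for a ZIP-based format, as the comment above suggests, is a plain Hash merge. The header names and values below are assumptions for illustration - call `ZipKit::OutputEnumerator.streaming_http_headers` for the gem's actual set:

```ruby
# Assumed stand-in for ZipKit::OutputEnumerator.streaming_http_headers -
# the real method returns the canonical headers for streaming responses
streaming_headers = {
  "Content-Type" => "application/zip",
  "X-Accel-Buffering" => "no" # hint to buffering reverse proxies
}

# An ePub is technically a ZIP, so only the content type (plus a download
# filename) needs to differ from the generic ZIP headers
epub_headers = streaming_headers.merge(
  "Content-Type" => "application/epub+zip",
  "Content-Disposition" => "attachment; filename=\"book.epub\""
)

# A Rack endpoint would then respond with [200, epub_headers, zip_body]
puts epub_headers["Content-Type"] # => "application/epub+zip"
```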
data/zip_kit.gemspec CHANGED
@@ -7,7 +7,7 @@ Gem::Specification.new do |spec|
  spec.version = ZipKit::VERSION
  spec.authors = ["Julik Tarkhanov", "Noah Berman", "Dmitry Tymchuk", "David Bosveld", "Felix Bünemann"]
  spec.email = ["me@julik.nl"]
-
+ spec.license = "MIT"
  spec.summary = "Stream out ZIP files from Ruby. Successor to zip_tricks."
  spec.description = "Stream out ZIP files from Ruby. Successor to zip_tricks."
  spec.homepage = "https://github.com/julik/zip_kit"
@@ -23,9 +23,12 @@ Gem::Specification.new do |spec|
  spec.require_paths = ["lib"]
 
  spec.add_development_dependency "bundler"
- spec.add_development_dependency "rubyzip", "~> 1"
 
- spec.add_development_dependency "rack" # For tests where we spin up a server
+ # zip_kit does not use any runtime dependencies (besides zlib). However, for testing
+ # things quite a few things are used - and for a good reason.
+
+ spec.add_development_dependency "rubyzip", "~> 1" # We test our output with _another_ ZIP library, which is the way to go here
+ spec.add_development_dependency "rack" # For tests where we spin up a server. Both for streaming out and for testing reads over HTTP
  spec.add_development_dependency "rake", "~> 12.2"
  spec.add_development_dependency "rspec", "~> 3"
  spec.add_development_dependency "rspec-mocks", "~> 3.10", ">= 3.10.2" # ruby 3 compatibility
@@ -39,5 +42,6 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "puma"
  spec.add_development_dependency "actionpack", "~> 5" # For testing RailsStreaming against an actual Rails controller
  spec.add_development_dependency "nokogiri", "~> 1", ">= 1.13" # Rails 5 does by mistake use an older Nokogiri otherwise
+ spec.add_development_dependency "sinatra"
  spec.add_development_dependency "sord"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: zip_kit
  version: !ruby/object:Gem::Version
- version: 6.2.0
+ version: 6.2.2
  platform: ruby
  authors:
  - Julik Tarkhanov
@@ -12,7 +12,7 @@ authors:
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-03-11 00:00:00.000000000 Z
+ date: 2024-03-27 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -250,6 +250,20 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '1.13'
+ - !ruby/object:Gem::Dependency
+ name: sinatra
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
  - !ruby/object:Gem::Dependency
  name: sord
  requirement: !ruby/object:Gem::Requirement
@@ -292,6 +306,7 @@ files:
  - examples/parallel_compression_with_block_deflate.rb
  - examples/rack_application.rb
  - examples/s3_upload.rb
+ - examples/sinatra_application.rb
  - lib/zip_kit.rb
  - lib/zip_kit/block_deflate.rb
  - lib/zip_kit/block_write.rb
@@ -324,7 +339,8 @@ files:
  - rbi/zip_kit.rbi
  - zip_kit.gemspec
  homepage: https://github.com/julik/zip_kit
- licenses: []
+ licenses:
+ - MIT
  metadata:
  allowed_push_host: https://rubygems.org
  post_install_message:
@@ -342,7 +358,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.3.7
+ rubygems_version: 3.1.6
  signing_key:
  specification_version: 4
  summary: Stream out ZIP files from Ruby. Successor to zip_tricks.