zip_kit 6.0.1 → 6.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +14 -1
- data/CONTRIBUTING.md +2 -2
- data/README.md +36 -27
- data/Rakefile +10 -2
- data/lib/zip_kit/block_write.rb +4 -1
- data/lib/zip_kit/file_reader.rb +1 -1
- data/lib/zip_kit/output_enumerator.rb +33 -40
- data/lib/zip_kit/rack_tempfile_body.rb +3 -1
- data/lib/zip_kit/rails_streaming.rb +36 -10
- data/lib/zip_kit/remote_io.rb +2 -0
- data/lib/zip_kit/size_estimator.rb +3 -1
- data/lib/zip_kit/streamer/heuristic.rb +3 -8
- data/lib/zip_kit/streamer.rb +23 -26
- data/lib/zip_kit/version.rb +1 -1
- data/lib/zip_kit/write_shovel.rb +2 -2
- data/lib/zip_kit/zip_writer.rb +182 -106
- data/rbi/zip_kit.rbi +2181 -0
- data/zip_kit.gemspec +7 -3
- metadata +20 -5
- data/.codeclimate.yml +0 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f1d33b58f4501d3ddbae7abcab3957fde0549abf734eae72ca1a7ce45601f479
+  data.tar.gz: e9126924e6fe75329237ba551a1a65218676c7d2b3757f4ad91e73eb0bce154e
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 011e57f856ebe7f625b0bfa5eeb4a240c6c38f2b07ff0434b7e89805516b2d47b6d4230ac404203d00562586909c72d7ea225a7238d47af70fb99ed97b3d50bc
+  data.tar.gz: b68fbaae2e57314c47e7971aeef2150341ba80308c7dee1536718250f5cac01b4b387fe3f73cde5f84fb1ffcd93500343ec451d0fccf9f19b2c0c4a58e74aa2f
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,16 @@
+## 6.2.1
+
+* Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+* Make `BlockWrite` respond to `write` in addition to `<<`
+
+## 6.2.0
+
+* Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
+
+## 6.1.0
+
+* Add Sorbet `.rbi` for type hints and resolution. This should make developing with zip_kit more pleasant, and the library - more discoverable.
+
 ## 6.0.1
 
 * Fix `require` for the `VERSION` constant, as Zeitwerk would try to resolve it in Rails context, bringing the entire module under its reloading.
@@ -141,7 +154,7 @@
 ## 4.4.2
 
 * Add 2.4 to Travis rubies
-* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/
+* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)
 
 ## 4.4.1
 
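The 6.2.0 entry above removes the forced `Transfer-Encoding: chunked` wrapper in favour of letting the webserver frame the response. As a hedged illustration (not zip_kit's actual implementation), this is what chunked framing does to a stream of byte chunks - each chunk gets a hex byte-count prefix and a zero-length chunk terminates the stream:

```ruby
# Sketch of HTTP/1.1 chunked transfer encoding framing. This is a plain-Ruby
# illustration of what a chunking body wrapper produces, not zip_kit code.
def to_chunked(chunks)
  framed = chunks.map do |bytes|
    # Each chunk: byte count in hex, CRLF, the bytes themselves, CRLF
    "#{bytes.bytesize.to_s(16)}\r\n#{bytes}\r\n"
  end
  # The terminating zero-length chunk
  framed.join + "0\r\n\r\n"
end

to_chunked(["PK\x03\x04", "hello"])
```

Supporting HTTP/2-capable servers natively (without this framing) is exactly why the forced wrapper was dropped.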
data/CONTRIBUTING.md
CHANGED
@@ -106,11 +106,11 @@ project:
 
 ```bash
 # Clone your fork of the repo into the current directory
-git clone git@github.com:
+git clone git@github.com:julik/zip_kit.git
 # Navigate to the newly cloned directory
 cd zip_kit
 # Assign the original repo to a remote called "upstream"
-git remote add upstream git@github.com:
+git remote add upstream git@github.com:julik/zip_kit.git
 ```
 
 2. If you cloned a while ago, get the latest changes from upstream:
data/README.md
CHANGED
@@ -1,5 +1,8 @@
 # zip_kit
 
+[![Tests](https://github.com/julik/zip_kit/actions/workflows/ci.yml/badge.svg)](https://github.com/julik/zip_kit/actions/workflows/ci.yml)
+[![Gem Version](https://badge.fury.io/rb/zip_kit.svg)](https://badge.fury.io/rb/zip_kit)
+
 Allows streaming, non-rewinding ZIP file output from Ruby.
 
 `zip_kit` is a successor to and continuation of [zip_tricks](https://github.com/WeTransfer/zip_tricks), which
@@ -56,7 +59,7 @@ via HTTP.
 and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
 RSpec) should work normally with controller actions returning ZIPs.
 
-## Writing into
+## Writing into streaming destinations
 
 Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
 is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
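The README passage above says any object accepting bytes via `<<` or `write` can be a destination. A minimal stdlib-only sketch of such a sink (the class name is made up for illustration) - the `write` method returning the byte count is what makes it usable with `IO.copy_stream`, mirroring what zip_kit's `WriteShovel` mixin provides on top of `<<`:

```ruby
require "stringio"

# A hypothetical minimal write destination: collects bytes and counts them.
class CountingSink
  attr_reader :bytes_received, :buffer

  def initialize
    @bytes_received = 0
    @buffer = +""
  end

  # The only method a Streamer destination strictly needs
  def <<(bytes)
    @bytes_received += bytes.bytesize
    @buffer << bytes
    self # returning self allows chaining: sink << "a" << "b"
  end

  # IO.copy_stream calls #write and expects the number of bytes written
  def write(bytes)
    self << bytes
    bytes.bytesize
  end
end

sink = CountingSink.new
IO.copy_stream(StringIO.new("hello world"), sink)
```

An object like this could be handed to `ZipKit::Streamer.open` in place of the S3 `write_stream` above.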
@@ -66,25 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
 obj = bucket.object("big.zip")
 obj.upload_stream do |write_stream|
   ZipKit::Streamer.open(write_stream) do |zip|
-    zip.write_file("
-
-
-    20_000.times do |n|
-      csv << [n, "Item number #{n}"]
-    end
+    zip.write_file("file.csv") do |sink|
+      File.open("large.csv", "rb") do |file_input|
+        IO.copy_stream(file_input, sink)
+      end
     end
   end
 end
 ```
 
-
+## Writing through streaming wrappers
 
 Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
-output with [builder](https://github.com/jimweirich/builder#project-builder)
+output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+every time a complete write call is done:
 
 ```ruby
-zip.write_file('
+zip.write_file('employees.xml') do |sink|
   builder = Builder::XmlMarkup.new(target: sink, indent: 2)
   builder.people do
     Person.all.find_each do |person|
@@ -94,18 +95,28 @@ zip.write_file('report1.csv') do |sink|
   end
 end
 ```
 
-
-
+The output will be compressed and output into the ZIP file on the fly. Same for CSV:
+
+```ruby
+zip.write_file('line_items.csv') do |sink|
+  CSV(sink) do |csv|
+    csv << ["Line", "Item"]
+    20_000.times do |n|
+      csv << [n, "Item number #{n}"]
+    end
+  end
+end
+```
 
 ## Create a ZIP file without size estimation, compress on-the-fly during writes
 
 Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
 memory inflation is going to be very constrained. Data will be written to destination at fairly regular
-intervals. Deflate compression will work best for things like text files.
+intervals. Deflate compression will work best for things like text files. For example, here is how to
+output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):
 
 ```ruby
-
-ZipKit::Streamer.open(out) do |zip|
+ZipKit::Streamer.open($stdout) do |zip|
   zip.write_file('mov.mp4.txt') do |sink|
     File.open('mov.mp4', 'rb'){|source| IO.copy_stream(source, sink) }
   end
@@ -114,17 +125,17 @@ ZipKit::Streamer.open(out) do |zip|
   end
 end
 ```
+
 Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
 since you do not know how large the compressed data segments are going to be.
 
 ## Send a ZIP from a Rack response
 
 zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
-by piece, and apply some amount of buffering as well.
-
-
-
-the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
 
 ```ruby
 body = ZipKit::OutputEnumerator.new do | zip |
@@ -136,8 +147,7 @@ body = ZipKit::OutputEnumerator.new do | zip |
   end
 end
 
-
-[200, headers, streaming_body]
+[200, body.streaming_http_headers, body]
 ```
 
 ## Send a ZIP file of known size, with correct headers
@@ -145,13 +155,11 @@ headers, streaming_body = body.to_rack_response_headers_and_body(env)
 Use the `SizeEstimator` to compute the correct size of the resulting archive.
 
 ```ruby
-# Precompute the Content-Length ahead of time
 bytesize = ZipKit::SizeEstimator.estimate do |z|
   z.add_stored_entry(filename: 'myfile1.bin', size: 9090821)
   z.add_stored_entry(filename: 'myfile2.bin', size: 458678)
 end
 
-# Prepare the response body. The block will only be called when the response starts to be written.
 zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip.add_stored_entry(filename: "myfile1.bin", size: 9090821, crc32: 12485)
   zip << read_file('myfile1.bin')
@@ -159,8 +167,10 @@ zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip << read_file('myfile2.bin')
 end
 
-
-[
+hh = zip_body.streaming_http_headers
+hh["Content-Length"] = bytesize.to_s
+
+[200, hh, zip_body]
 ```
 
 ## Writing ZIP files using the Streamer bypass
@@ -171,7 +181,6 @@ the metadata of the file upfront (the CRC32 of the uncompressed file and the siz
 to that socket using some accelerated writing technique, and only use the Streamer to write out the ZIP metadata.
 
 ```ruby
-# io has to be an object that supports #<< or #write()
 ZipKit::Streamer.open(io) do | zip |
   # raw_file is written "as is" (STORED mode).
   # Write the local file header first..
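The bypass section above requires knowing the CRC32 (as a uint) and the size of the file before writing its body. Both can be computed upfront with Ruby's stdlib `Zlib`, streaming chunk by chunk so large files never have to be held in memory. The helper name below is made up for illustration:

```ruby
require "zlib"
require "stringio"

# Compute the CRC32 and byte size of an IO upfront, as add_stored_entry
# needs them when using the Streamer bypass. The checksum is updated
# incrementally, so the input is never loaded into memory at once.
def crc32_and_size(io, chunk_size: 64 * 1024)
  crc = Zlib.crc32 # the initial CRC value
  size = 0
  while (chunk = io.read(chunk_size))
    crc = Zlib.crc32(chunk, chunk_size && crc)
    size += chunk.bytesize
  end
  [crc, size]
end

crc, size = crc32_and_size(StringIO.new("hello"))
```

The resulting pair could then be passed as `crc32:` and `size:` to `add_stored_entry`.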
data/Rakefile
CHANGED
@@ -16,6 +16,14 @@ YARD::Rake::YardocTask.new(:doc) do |t|
   # miscellaneous documentation files that contain no code
   t.files = ["lib/**/*.rb", "-", "LICENSE.txt", "IMPLEMENTATION_DETAILS.md"]
 end
-
 RSpec::Core::RakeTask.new(:spec)
-
+
+task :generate_typedefs do
+  `bundle exec sord rbi/zip_kit.rbi`
+end
+
+task default: [:spec, :standard, :generate_typedefs]
+
+# When building the gem, generate typedefs beforehand,
+# so that they get included
+Rake::Task["build"].enhance(["generate_typedefs"])
data/lib/zip_kit/block_write.rb
CHANGED
@@ -17,9 +17,12 @@
 # end
 # [200, {}, MyRackResponse.new]
 class ZipKit::BlockWrite
+  include ZipKit::WriteShovel
+
   # Creates a new BlockWrite.
   #
   # @param block The block that will be called when this object receives the `<<` message
+  # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
   def initialize(&block)
     @block = block
   end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
   # @param buf[String] the string to write. Note that a zero-length String
   # will not be forwarded to the block, as it has special meaning when used
   # with chunked encoding (it indicates the end of the stream).
-  # @return
+  # @return [ZipKit::BlockWrite]
   def <<(buf)
     # Zero-size output has a special meaning when using chunked encoding
     return if buf.nil? || buf.bytesize.zero?
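The change above makes `BlockWrite` respond to `write` on top of `<<` by mixing in `WriteShovel`. A simplified stdlib-only model of that shape (class name invented for illustration, and this is not the gem's actual class):

```ruby
# Simplified model of a BlockWrite-style object: every chunk is forwarded
# to a block, and #write is layered on top of #<< the way WriteShovel does.
class MiniBlockWrite
  def initialize(&block)
    @block = block
  end

  def <<(bytes)
    # Zero-length output has special meaning in chunked encoding, so skip it
    return self if bytes.nil? || bytes.bytesize.zero?
    @block.call(bytes)
    self
  end

  # What WriteShovel adds: #write returning the number of bytes written,
  # which is the contract IO.copy_stream expects from a destination
  def write(bytes)
    self << bytes
    bytes.bytesize
  end
end

received = []
w = MiniBlockWrite.new { |b| received << b }
w << "PK"
w.write("data")
```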
data/lib/zip_kit/file_reader.rb
CHANGED
@@ -137,7 +137,7 @@ class ZipKit::FileReader
   # reader = entry.extractor_from(source_file)
   # outfile << reader.extract(512 * 1024) until reader.eof?
   #
-  # @return [
+  # @return [StoredReader,InflatingReader] the reader for the data
   def extractor_from(from_io)
     from_io.seek(compressed_data_offset, IO::SEEK_SET)
     case storage_mode
data/lib/zip_kit/output_enumerator.rb
CHANGED
@@ -28,15 +28,11 @@ require "time" # for .httpdate
 # end
 # end
 #
-#
-# which will give you true streaming capability:
+# You can grab the headers one usually needs for streaming from `#streaming_http_headers`:
 #
-#
-# [200, headers, chunked_or_presized_rack_body]
+# [200, iterable_zip_body.streaming_http_headers, iterable_zip_body]
 #
-#
-# benefits if your webserver does not support anything beyound HTTP/1.0, and also engages automatically
-# in unit tests (since rack-test and Rails tests do not do streaming HTTP/1.1).
+# to bypass things like `Rack::ETag` and the nginx buffering.
 class ZipKit::OutputEnumerator
   DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024
 
@@ -64,14 +60,11 @@ class ZipKit::OutputEnumerator
   # ...
   # end
   #
-  # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-  # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-  #
   # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
   # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
   # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
   # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-  # but at block size boundaries or greater). Set
+  # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
   # @param blk a block that will receive the Streamer object when executing. The block will not be executed
   # immediately but only once `each` is called on the OutputEnumerator
   def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
@@ -103,17 +96,16 @@ class ZipKit::OutputEnumerator
     end
   end
 
-  # Returns a
-  #
-  #
-  #
-  #
+  # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+  # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+  # particular response body getting served. You might want to override the headers with your particular
+  # ones - for example, specific content types are needed for files which are, technically, ZIP files
+  # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+  # and ePubs.
   #
-  # @
-
-
-  def to_headers_and_rack_response_body(rack_env, content_length: nil)
-    headers = {
+  # @return [Hash]
+  def self.streaming_http_headers
+    _headers = {
       # We need to ensure Rack::ETag does not suddenly start buffering us, see
       # https://github.com/rack/rack/issues/1619#issuecomment-606315714
       # Set this even when not streaming for consistency. The fact that there would be
@@ -124,27 +116,28 @@ class ZipKit::OutputEnumerator
       "Content-Encoding" => "identity",
       # Disable buffering for both nginx and Google Load Balancer, see
       # https://cloud.google.com/appengine/docs/flexible/how-requests-are-handled?tab=python#x-accel-buffering
-      "X-Accel-Buffering" => "no"
+      "X-Accel-Buffering" => "no",
+      # Set the correct content type. This should be overridden if you need to
+      # serve things such as EPubs and other derived ZIP formats.
+      "Content-Type" => "application/zip"
     }
+  end
 
-
-
-
-
-
-
-
-
-  # since HTTP 1.0 does not support chunked responses we need to revert to buffering. The issue though is that
-  # this reversion happens silently and it is usually not clear at all why streaming does not work. So let's at
-  # the very least print it to the Rails log.
-    body = ZipKit::RackTempfileBody.new(rack_env, self)
-    headers["Content-Length"] = body.size.to_s
-  else
-    body = ZipKit::RackChunkedBody.new(self)
-    headers["Transfer-Encoding"] = "chunked"
-  end
+  # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+  # presized responses, but is now effectively a no-op.
+  #
+  # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+  # @return [Hash]
+  def streaming_http_headers
+    self.class.streaming_http_headers
+  end
 
-
+  # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
+  # an object that can be used as a Rack response body. This method used to accept arguments
+  # but will now just ignore them.
+  #
+  # @return [Array]
+  def to_headers_and_rack_response_body(*, **)
+    [streaming_http_headers, self]
   end
 end
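The new `streaming_http_headers` above returns a common Hash that callers may override per response, for example with a different content type for ZIP-derived formats. A sketch using only the header values visible in the diff (the constant name is invented, and `application/epub+zip` is the standard EPub media type, not something zip_kit sets):

```ruby
# A subset of the streaming headers from the diff above, and how a caller
# might override Content-Type for a derived ZIP format such as EPub.
BASE_STREAMING_HEADERS = {
  "Content-Encoding" => "identity",   # keep Rack::ETag from buffering the body
  "X-Accel-Buffering" => "no",        # disable nginx / Google LB buffering
  "Content-Type" => "application/zip" # override for ODT, pkpass, ePub...
}.freeze

# Hash#merge returns a new Hash, so the shared defaults stay untouched
epub_headers = BASE_STREAMING_HEADERS.merge("Content-Type" => "application/epub+zip")
```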
data/lib/zip_kit/rack_tempfile_body.rb
CHANGED
@@ -1,7 +1,9 @@
 # frozen_string_literal: true
 
 # Contains a file handle which can be closed once the response finishes sending.
-# It supports `to_path` so that `Rack::Sendfile` can intercept it
+# It supports `to_path` so that `Rack::Sendfile` can intercept it.
+# This class is deprecated and is going to be removed in zip_kit 7.x
+# @api deprecated
 class ZipKit::RackTempfileBody
   TEMPFILE_NAME_PREFIX = "zip-tricks-tf-body-"
   attr_reader :tempfile
data/lib/zip_kit/rails_streaming.rb
CHANGED
@@ -7,15 +7,12 @@ module ZipKit::RailsStreaming
   # the Rails response stream is going to be closed automatically.
   # @param filename[String] name of the file for the Content-Disposition header
   # @param type[String] the content type (MIME type) of the archive being output
+  # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
   # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
   # See {ZipKit::Streamer#initialize} for the full list of options.
-  # @
+  # @yieldparam [ZipKit::Streamer] the streamer that can be written to
   # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
-  def zip_kit_stream(filename: "download.zip", type: "application/zip", **zip_streamer_options, &zip_streaming_blk)
-    # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-    # first will also validate the Streamer options.
-    chunk_yielder = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
     # We want some common headers for file sending. Rails will also set
     # self.sending_file = true for us when we call send_file_headers!
     send_file_headers!(type: type, filename: filename)
@@ -28,10 +25,39 @@ module ZipKit::RailsStreaming
       logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
     end
 
-    headers
-
-    # Set the "particular" streaming headers
+    headers = ZipKit::OutputEnumerator.streaming_http_headers
     response.headers.merge!(headers)
-
+
+    # The output enumerator yields chunks of bytes generated from the Streamer,
+    # with some buffering
+    output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
+
+    # Time for some branching, which mostly has to do with the 999 flavours of
+    # "how to make both Rails and Rack stream"
+    if self.class.ancestors.include?(ActionController::Live)
+      # If this controller includes Live it will not work correctly with a Rack
+      # response body assignment - we need to write into the Live output stream instead
+      begin
+        output_enum.each { |bytes| response.stream.write(bytes) }
+      ensure
+        response.stream.close
+      end
+    elsif use_chunked_transfer_encoding
+      # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+      # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+      # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+      # There is a recommendation to leave the chunked encoding to the app server, so that servers
+      # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+      # see https://github.com/julik/zip_kit/issues/7
+      # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+      # some especially pesky Rack middleware that just would not cooperate. Those include
+      # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
+      response.headers["Transfer-Encoding"] = "chunked"
+      self.response_body = ZipKit::RackChunkedBody.new(output_enum)
+    else
+      # Stream using a Rack body assigned to the ActionController response body, without
+      # doing explicit chunked encoding. See above for the reasoning.
+      self.response_body = output_enum
+    end
   end
 end
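The branching above picks a delivery strategy by checking whether the controller's ancestry includes `ActionController::Live`. The mechanism is plain Ruby `Module#ancestors`; here is a self-contained illustration where the module and class names are stand-ins, not Rails itself:

```ruby
# Stand-in for ActionController::Live - inclusion is what gets detected
module Live; end

class BufferedController; end

class LiveController
  include Live
end

# Mirrors the dispatch logic above: Live controllers must write into the
# response stream themselves; others can hand Rack an iterable body.
def streaming_strategy(controller_class)
  if controller_class.ancestors.include?(Live)
    :write_to_live_stream # write each chunk into response.stream, then close it
  else
    :assign_rack_body # assign the enumerator as the Rack response body
  end
end
```

`ancestors` covers both inheritance and module inclusion, which is why the real check catches `include ActionController::Live` in any superclass as well.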
data/lib/zip_kit/remote_io.rb
CHANGED
data/lib/zip_kit/size_estimator.rb
CHANGED
@@ -6,6 +6,8 @@ class ZipKit::SizeEstimator
 
   # Creates a new estimator with a Streamer object. Normally you should use
   # `estimate` instead an not use this method directly.
+  #
+  # @param streamer[ZipKit::Streamer]
   def initialize(streamer)
     @streamer = streamer
   end
@@ -22,7 +24,7 @@ class ZipKit::SizeEstimator
   #
   # @param kwargs_for_streamer_new Any options to pass to Streamer, see {Streamer#initialize}
   # @return [Integer] the size of the resulting archive, in bytes
-  # @
+  # @yieldparam [SizeEstimator] the estimator
   def self.estimate(**kwargs_for_streamer_new)
     streamer = ZipKit::Streamer.new(ZipKit::NullWriter, **kwargs_for_streamer_new)
     estimator = new(streamer)
data/lib/zip_kit/streamer/heuristic.rb
CHANGED
@@ -1,5 +1,7 @@
 # frozen_string_literal: true
 
+require "zlib"
+
 # Will be used to pick whether to store a file in the `stored` or
 # `deflated` mode, by compressing the first N bytes of the file and
 # comparing the stored and deflated data sizes. If deflate produces
@@ -10,9 +12,7 @@
 # Heuristic will call either `write_stored_file` or `write_deflated_file`
 # on the Streamer passed into it once it knows which compression
 # method should be applied
-class ZipKit::Streamer::Heuristic
-  include ZipKit::WriteShovel
-
+class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
   BYTES_WRITTEN_THRESHOLD = 128 * 1024
   MINIMUM_VIABLE_COMPRESSION = 0.75
 
@@ -39,11 +39,6 @@ class ZipKit::Streamer::Heuristic
     self
   end
 
-  def write(bytes)
-    self << bytes
-    bytes.bytesize
-  end
-
   def close
     decide unless @winner
     @winner.close
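The Heuristic above deflates a sample of the incoming bytes and keeps deflate only when it pays off. A stdlib-only sketch of that decision, reusing the threshold values visible in the diff (`128 * 1024` sample budget, `0.75` minimum ratio) - this is an illustration of the idea, not the gem's exact algorithm:

```ruby
require "zlib"

BYTES_WRITTEN_THRESHOLD = 128 * 1024 # how much to sample before deciding
MINIMUM_VIABLE_COMPRESSION = 0.75    # deflate must shrink to <= 75% to win

# Deflate a sample and compare sizes: if the compressed sample is small
# enough relative to the original, pick :deflated, otherwise :stored.
def pick_storage_mode(sample)
  deflated = Zlib::Deflate.deflate(sample)
  ratio = deflated.bytesize / sample.bytesize.to_f
  (ratio <= MINIMUM_VIABLE_COMPRESSION) ? :deflated : :stored
end

pick_storage_mode("A" * 1024)         # repetitive text deflates extremely well
pick_storage_mode(Random.bytes(1024)) # already-random bytes do not shrink
```

This is why storing already-compressed formats (JPEG, MP4) uncompressed is both faster and no larger.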
data/lib/zip_kit/streamer.rb
CHANGED
@@ -2,19 +2,19 @@
 
 require "set"
 
-# Is used to write
-#
-# of this object can be coupled directly to, say, a Rack output. The
-# output can also be a String, Array or anything that responds to `<<`.
+# Is used to write ZIP archives without having to read them back or to overwrite
+# data. It outputs into any object that supports `<<` or `write`, namely:
 #
-#
-#
+# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+# for the `Streamer`.
 #
-#
-#
+# You can also combine output through the `Streamer` with direct output to the destination,
+# all while preserving the correct offsets in the ZIP file structures. This allows usage
+# of `sendfile()` or socket `splice()` calls for "through" proxying.
 #
-#
-#
+# If you want to avoid data descriptors - or write data bypassing the Streamer -
+# you need to know the CRC32 (as a uint) and the filesize upfront,
+# before the writing of the entry body starts.
 #
 # ## Using the Streamer with runtime compression
 #
@@ -34,7 +34,7 @@ require "set"
 # end
 # end
 #
-# The central directory will be written automatically at the end of the block.
+# The central directory will be written automatically at the end of the `open` block.
 #
 # ## Using the Streamer with entries of known size and having a known CRC32 checksum
 #
@@ -169,7 +169,7 @@ class ZipKit::Streamer
   # @param uncompressed_size [Integer] the size of the entry when uncompressed, in bytes
   # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
   # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_deflated_entry(filename:, modification_time: Time.now.utc, compressed_size: 0, uncompressed_size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
     add_file_and_write_local_header(filename: filename,
@@ -193,7 +193,7 @@ class ZipKit::Streamer
   # @param size [Integer] the size of the file when uncompressed, in bytes
   # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
   # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor. When in use
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_stored_entry(filename:, modification_time: Time.now.utc, size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
     add_file_and_write_local_header(filename: filename,
@@ -211,7 +211,7 @@ class ZipKit::Streamer
   #
   # @param dirname [String] the name of the directory in the archive
   # @param modification_time [Time] the modification time of the directory in the archive
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_empty_directory(dirname:, modification_time: Time.now.utc, unix_permissions: nil)
     add_file_and_write_local_header(filename: dirname.to_s + "/",
@@ -262,13 +262,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     writable = ZipKit::Streamer::Heuristic.new(self, filename, modification_time: modification_time, unix_permissions: unix_permissions)
     yield_or_return_writable(writable, &blk)
@@ -313,13 +312,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_stored_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     add_stored_entry(filename: filename,
       modification_time: modification_time,
@@ -373,13 +371,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_deflated_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     add_deflated_entry(filename: filename,
       modification_time: modification_time,
data/lib/zip_kit/version.rb
CHANGED
data/lib/zip_kit/write_shovel.rb
CHANGED
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
   # Writes the given data to the output stream. Allows the object to be used as
   # a target for `IO.copy_stream(from, to)`
   #
-  # @param
-  # @return [Fixnum] the number of bytes written
+  # @param bytes[String] the binary string to write (part of the uncompressed file)
+  # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
   def write(bytes)
     self << bytes
     bytes.bytesize