zip_kit 6.0.1 → 6.2.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +14 -1
- data/CONTRIBUTING.md +2 -2
- data/README.md +36 -27
- data/Rakefile +10 -2
- data/lib/zip_kit/block_write.rb +4 -1
- data/lib/zip_kit/file_reader.rb +1 -1
- data/lib/zip_kit/output_enumerator.rb +33 -40
- data/lib/zip_kit/rack_tempfile_body.rb +3 -1
- data/lib/zip_kit/rails_streaming.rb +36 -10
- data/lib/zip_kit/remote_io.rb +2 -0
- data/lib/zip_kit/size_estimator.rb +3 -1
- data/lib/zip_kit/streamer/heuristic.rb +3 -8
- data/lib/zip_kit/streamer.rb +23 -26
- data/lib/zip_kit/version.rb +1 -1
- data/lib/zip_kit/write_shovel.rb +2 -2
- data/lib/zip_kit/zip_writer.rb +182 -106
- data/rbi/zip_kit.rbi +2181 -0
- data/zip_kit.gemspec +7 -3
- metadata +20 -5
- data/.codeclimate.yml +0 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f1d33b58f4501d3ddbae7abcab3957fde0549abf734eae72ca1a7ce45601f479
+  data.tar.gz: e9126924e6fe75329237ba551a1a65218676c7d2b3757f4ad91e73eb0bce154e
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 011e57f856ebe7f625b0bfa5eeb4a240c6c38f2b07ff0434b7e89805516b2d47b6d4230ac404203d00562586909c72d7ea225a7238d47af70fb99ed97b3d50bc
+  data.tar.gz: b68fbaae2e57314c47e7971aeef2150341ba80308c7dee1536718250f5cac01b4b387fe3f73cde5f84fb1ffcd93500343ec451d0fccf9f19b2c0c4a58e74aa2f
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,16 @@
+## 6.2.1
+
+* Make `RailsStreaming` compatible with `ActionController::Live` (previously the response would hang)
+* Make `BlockWrite` respond to `write` in addition to `<<`
+
+## 6.2.0
+
+* Remove forced `Transfer-Encoding: chunked` and the chunking body wrapper. It is actually a good idea to trust the app webserver to apply the transfer encoding as is appropriate. For the case when "you really have to", add a bypass in `RailsStreaming#zip_kit_stream` for forcing the chunking manually.
+
+## 6.1.0
+
+* Add Sorbet `.rbi` for type hints and resolution. This should make developing with zip_kit more pleasant, and the library - more discoverable.
+
 ## 6.0.1
 
 * Fix `require` for the `VERSION` constant, as Zeitwerk would try to resolve it in Rails context, bringing the entire module under its reloading.
@@ -141,7 +154,7 @@
 ## 4.4.2
 
 * Add 2.4 to Travis rubies
-* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/
+* Fix a severe performance degradation in Streamer with large file counts (https://github.com/WeTransfer/zip_tricks/pull/14)
 
 ## 4.4.1
data/CONTRIBUTING.md
CHANGED
@@ -106,11 +106,11 @@ project:
 
 ```bash
 # Clone your fork of the repo into the current directory
-git clone git@github.com:
+git clone git@github.com:julik/zip_kit.git
 # Navigate to the newly cloned directory
 cd zip_kit
 # Assign the original repo to a remote called "upstream"
-git remote add upstream git@github.com:
+git remote add upstream git@github.com:julik/zip_kit.git
 ```
 
 2. If you cloned a while ago, get the latest changes from upstream:
data/README.md
CHANGED
@@ -1,5 +1,8 @@
 # zip_kit
 
+[](https://github.com/julik/zip_kit/actions/workflows/ci.yml)
+[](https://badge.fury.io/rb/zip_kit)
+
 Allows streaming, non-rewinding ZIP file output from Ruby.
 
 `zip_kit` is a successor to and continuation of [zip_tricks](https://github.com/WeTransfer/zip_tricks), which
@@ -56,7 +59,7 @@ via HTTP.
 and the ZIP output will run in the same thread as your main request. Your testing flows (be it minitest or
 RSpec) should work normally with controller actions returning ZIPs.
 
-## Writing into
+## Writing into streaming destinations
 
 Any object that accepts bytes via either `<<` or `write` methods can be a write destination. For example, here
 is how to upload a sizeable ZIP to S3 - the SDK will happily chop your upload into multipart upload parts:
@@ -66,25 +69,23 @@ bucket = Aws::S3::Bucket.new("mybucket")
 obj = bucket.object("big.zip")
 obj.upload_stream do |write_stream|
   ZipKit::Streamer.open(write_stream) do |zip|
-    zip.write_file("
-
-
-      20_000.times do |n|
-        csv << [n, "Item number #{n}"]
-      end
+    zip.write_file("file.csv") do |sink|
+      File.open("large.csv", "rb") do |file_input|
+        IO.copy_stream(file_input, sink)
       end
     end
   end
 end
 ```
 
-
+## Writing through streaming wrappers
 
 Any object that writes using either `<<` or `write` can write into a `sink`. For example, you can do streaming
-output with [builder](https://github.com/jimweirich/builder#project-builder)
+output with [builder](https://github.com/jimweirich/builder#project-builder) which calls `<<` on its `target`
+every time a complete write call is done:
 
 ```ruby
-zip.write_file('
+zip.write_file('employees.xml') do |sink|
   builder = Builder::XmlMarkup.new(target: sink, indent: 2)
   builder.people do
     Person.all.find_each do |person|
@@ -94,18 +95,28 @@ zip.write_file('report1.csv') do |sink|
   end
 end
 ```
 
-
-
+The output will be compressed and output into the ZIP file on the fly. Same for CSV:
+
+```ruby
+zip.write_file('line_items.csv') do |sink|
+  CSV(sink) do |csv|
+    csv << ["Line", "Item"]
+    20_000.times do |n|
+      csv << [n, "Item number #{n}"]
+    end
+  end
+end
+```
 
 ## Create a ZIP file without size estimation, compress on-the-fly during writes
 
 Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
 memory inflation is going to be very constrained. Data will be written to destination at fairly regular
-intervals. Deflate compression will work best for things like text files.
+intervals. Deflate compression will work best for things like text files. For example, here is how to
+output direct to STDOUT (so that you can run `$ ruby archive.rb > file.zip` in your terminal):
 
 ```ruby
-
-ZipKit::Streamer.open(out) do |zip|
+ZipKit::Streamer.open($stdout) do |zip|
   zip.write_file('mov.mp4.txt') do |sink|
     File.open('mov.mp4', 'rb'){|source| IO.copy_stream(source, sink) }
   end
@@ -114,17 +125,17 @@ ZipKit::Streamer.open(out) do |zip|
   end
 end
 ```
+
 Unfortunately with this approach it is impossible to compute the size of the ZIP file being output,
 since you do not know how large the compressed data segments are going to be.
 
 ## Send a ZIP from a Rack response
 
 zip_kit provides an `OutputEnumerator` object which will yield the binary chunks piece
-by piece, and apply some amount of buffering as well.
-
-
-
-the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
+by piece, and apply some amount of buffering as well. Return the headers and the body to your webserver
+and you will have your ZIP streamed! The block that you give to the `OutputEnumerator` will receive
+the {ZipKit::Streamer} object and will only start executing once your response body starts getting iterated
+over - when actually sending the response to the client (unless you are using a buffering Rack webserver, such as Webrick).
 
 ```ruby
 body = ZipKit::OutputEnumerator.new do | zip |
@@ -136,8 +147,7 @@ body = ZipKit::OutputEnumerator.new do | zip |
   end
 end
 
-
-[200, headers, streaming_body]
+[200, body.streaming_http_headers, body]
 ```
 
 ## Send a ZIP file of known size, with correct headers
@@ -145,13 +155,11 @@ headers, streaming_body = body.to_rack_response_headers_and_body(env)
 Use the `SizeEstimator` to compute the correct size of the resulting archive.
 
 ```ruby
-# Precompute the Content-Length ahead of time
 bytesize = ZipKit::SizeEstimator.estimate do |z|
   z.add_stored_entry(filename: 'myfile1.bin', size: 9090821)
   z.add_stored_entry(filename: 'myfile2.bin', size: 458678)
 end
 
-# Prepare the response body. The block will only be called when the response starts to be written.
 zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip.add_stored_entry(filename: "myfile1.bin", size: 9090821, crc32: 12485)
   zip << read_file('myfile1.bin')
@@ -159,8 +167,10 @@ zip_body = ZipKit::OutputEnumerator.new do | zip |
   zip << read_file('myfile2.bin')
 end
 
-
-[
+hh = zip_body.streaming_http_headers
+hh["Content-Length"] = bytesize.to_s
+
+[200, hh, zip_body]
 ```
 
 ## Writing ZIP files using the Streamer bypass
@@ -171,7 +181,6 @@ the metadata of the file upfront (the CRC32 of the uncompressed file and the siz
 to that socket using some accelerated writing technique, and only use the Streamer to write out the ZIP metadata.
 
 ```ruby
-# io has to be an object that supports #<< or #write()
 ZipKit::Streamer.open(io) do | zip |
   # raw_file is written "as is" (STORED mode).
   # Write the local file header first..
data/Rakefile
CHANGED
@@ -16,6 +16,14 @@ YARD::Rake::YardocTask.new(:doc) do |t|
   # miscellaneous documentation files that contain no code
   t.files = ["lib/**/*.rb", "-", "LICENSE.txt", "IMPLEMENTATION_DETAILS.md"]
 end
-
 RSpec::Core::RakeTask.new(:spec)
-
+
+task :generate_typedefs do
+  `bundle exec sord rbi/zip_kit.rbi`
+end
+
+task default: [:spec, :standard, :generate_typedefs]
+
+# When building the gem, generate typedefs beforehand,
+# so that they get included
+Rake::Task["build"].enhance(["generate_typedefs"])
data/lib/zip_kit/block_write.rb
CHANGED
@@ -17,9 +17,12 @@
 # end
 # [200, {}, MyRackResponse.new]
 class ZipKit::BlockWrite
+  include ZipKit::WriteShovel
+
   # Creates a new BlockWrite.
   #
   # @param block The block that will be called when this object receives the `<<` message
+  # @yieldparam bytes[String] A string in binary encoding which has just been written into the object
   def initialize(&block)
     @block = block
   end
@@ -36,7 +39,7 @@ class ZipKit::BlockWrite
   # @param buf[String] the string to write. Note that a zero-length String
   # will not be forwarded to the block, as it has special meaning when used
   # with chunked encoding (it indicates the end of the stream).
-  # @return
+  # @return [ZipKit::BlockWrite]
   def <<(buf)
     # Zero-size output has a special meaning when using chunked encoding
     return if buf.nil? || buf.bytesize.zero?
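The `BlockWrite` change above (gaining `write` via the `WriteShovel` mixin) can be sketched without the gem. The names below are invented stand-ins for `ZipKit::BlockWrite` and `ZipKit::WriteShovel`; the point is that any object implementing `<<` gains a `write` returning the byte count, which is the destination contract `IO.copy_stream` expects:

```ruby
require "stringio"

# Stand-in for ZipKit::WriteShovel: derives `write` from `<<`
module ShovelSketch
  def write(bytes)
    self << bytes
    bytes.bytesize # IO.copy_stream expects the number of bytes written
  end
end

# Stand-in for ZipKit::BlockWrite: forwards every chunk to a block
class BlockWriteSketch
  include ShovelSketch

  def initialize(&block)
    @block = block
  end

  def <<(buf)
    # Zero-size output has special meaning with chunked encoding - skip it
    return self if buf.nil? || buf.bytesize.zero?
    @block.call(buf.b)
    self
  end
end

chunks = []
sink = BlockWriteSketch.new { |bytes| chunks << bytes }
IO.copy_stream(StringIO.new("hello zip"), sink)
```

After the copy, `chunks` holds the streamed bytes in order - which is exactly why responding to `write` (and not only `<<`) matters for interoperability.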
data/lib/zip_kit/file_reader.rb
CHANGED
@@ -137,7 +137,7 @@ class ZipKit::FileReader
   # reader = entry.extractor_from(source_file)
   # outfile << reader.extract(512 * 1024) until reader.eof?
   #
-  # @return [
+  # @return [StoredReader,InflatingReader] the reader for the data
   def extractor_from(from_io)
     from_io.seek(compressed_data_offset, IO::SEEK_SET)
     case storage_mode
data/lib/zip_kit/output_enumerator.rb
CHANGED
@@ -28,15 +28,11 @@ require "time" # for .httpdate
 # end
 # end
 #
-#
-# which will give you true streaming capability:
+# You can grab the headers one usually needs for streaming from `#streaming_http_headers`:
 #
-#
-# [200, headers, chunked_or_presized_rack_body]
+# [200, iterable_zip_body.streaming_http_headers, iterable_zip_body]
 #
-#
-# benefits if your webserver does not support anything beyound HTTP/1.0, and also engages automatically
-# in unit tests (since rack-test and Rails tests do not do streaming HTTP/1.1).
+# to bypass things like `Rack::ETag` and the nginx buffering.
 class ZipKit::OutputEnumerator
   DEFAULT_WRITE_BUFFER_SIZE = 64 * 1024
 
@@ -64,14 +60,11 @@ class ZipKit::OutputEnumerator
   # ...
   # end
   #
-  # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
-  # @return [ZipKit::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
-  #
   # @param streamer_options[Hash] options for Streamer, see {ZipKit::Streamer.new}
   # @param write_buffer_size[Integer] By default all ZipKit writes are unbuffered. For output to sockets
   # it is beneficial to bulkify those writes so that they are roughly sized to a socket buffer chunk. This
   # object will bulkify writes for you in this way (so `each` will yield not on every call to `<<` from the Streamer
-  # but at block size boundaries or greater). Set
+  # but at block size boundaries or greater). Set the parameter to 0 for unbuffered writes.
   # @param blk a block that will receive the Streamer object when executing. The block will not be executed
   # immediately but only once `each` is called on the OutputEnumerator
   def initialize(write_buffer_size: DEFAULT_WRITE_BUFFER_SIZE, **streamer_options, &blk)
@@ -103,17 +96,16 @@ class ZipKit::OutputEnumerator
     end
   end
 
-  # Returns a
-  #
-  #
-  #
-  #
+  # Returns a Hash of HTTP response headers you are likely to need to have your response stream correctly.
+  # This is on the {ZipKit::OutputEnumerator} class since those headers are common, independent of the
+  # particular response body getting served. You might want to override the headers with your particular
+  # ones - for example, specific content types are needed for files which are, technically, ZIP files
+  # but are of a file format built "on top" of ZIPs - such as ODTs, [pkpass files](https://developer.apple.com/documentation/walletpasses/building_a_pass)
+  # and ePubs.
   #
-  # @
-
-  def to_headers_and_rack_response_body(rack_env, content_length: nil)
-    headers = {
+  # @return [Hash]
+  def self.streaming_http_headers
+    _headers = {
       # We need to ensure Rack::ETag does not suddenly start buffering us, see
       # https://github.com/rack/rack/issues/1619#issuecomment-606315714
       # Set this even when not streaming for consistency. The fact that there would be
@@ -124,27 +116,28 @@ class ZipKit::OutputEnumerator
       "Content-Encoding" => "identity",
       # Disable buffering for both nginx and Google Load Balancer, see
       # https://cloud.google.com/appengine/docs/flexible/how-requests-are-handled?tab=python#x-accel-buffering
-      "X-Accel-Buffering" => "no"
+      "X-Accel-Buffering" => "no",
+      # Set the correct content type. This should be overridden if you need to
+      # serve things such as EPubs and other derived ZIP formats.
+      "Content-Type" => "application/zip"
     }
+  end
 
-    # since HTTP 1.0 does not support chunked responses we need to revert to buffering. The issue though is that
-    # this reversion happens silently and it is usually not clear at all why streaming does not work. So let's at
-    # the very least print it to the Rails log.
-      body = ZipKit::RackTempfileBody.new(rack_env, self)
-      headers["Content-Length"] = body.size.to_s
-    else
-      body = ZipKit::RackChunkedBody.new(self)
-      headers["Transfer-Encoding"] = "chunked"
-    end
+  # Returns a Hash of HTTP response headers for this particular response. This used to contain "Content-Length" for
+  # presized responses, but is now effectively a no-op.
+  #
+  # @see [ZipKit::OutputEnumerator.streaming_http_headers]
+  # @return [Hash]
+  def streaming_http_headers
+    self.class.streaming_http_headers
+  end
 
-
+  # Returns a tuple of `headers, body` - headers are a `Hash` and the body is
+  # an object that can be used as a Rack response body. This method used to accept arguments
+  # but will now just ignore them.
+  #
+  # @return [Array]
+  def to_headers_and_rack_response_body(*, **)
+    [streaming_http_headers, self]
   end
 end
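The two properties the `OutputEnumerator` docs above rely on - the block only runs once the body gets iterated, and writes get coalesced into buffer-sized chunks - can be sketched in plain Ruby. The class below is an invented stand-in, not zip_kit code:

```ruby
# Stand-in for the OutputEnumerator contract: a lazily-evaluated,
# buffering Rack-style body. The block runs only when `each` is iterated.
class LazyZipBodySketch
  include Enumerable

  def initialize(write_buffer_size: 8, &blk)
    @write_buffer_size = write_buffer_size
    @blk = blk
  end

  def each
    buffer = +""
    sink = proc do |bytes|
      buffer << bytes
      if buffer.bytesize >= @write_buffer_size
        yield buffer # flush a full chunk to the caller (the webserver)
        buffer = +""
      end
    end
    @blk.call(sink)
    yield buffer unless buffer.empty? # flush the trailing partial chunk
  end
end

side_effects = []
body = LazyZipBodySketch.new(write_buffer_size: 4) do |sink|
  side_effects << :ran
  sink.call("ab")
  sink.call("cd") # buffer reaches 4 bytes and is flushed
  sink.call("e")  # flushed as a trailing chunk when the block finishes
end

before_iteration = side_effects.dup # still empty - nothing ran yet
chunks = body.to_a                  # iteration triggers the block
```

This deferral is what lets a Rack webserver start the ZIP generation only at response-send time, and the chunk coalescing is what keeps writes roughly socket-buffer sized.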
data/lib/zip_kit/rack_tempfile_body.rb
CHANGED
@@ -1,7 +1,9 @@
 # frozen_string_literal: true
 
 # Contains a file handle which can be closed once the response finishes sending.
-# It supports `to_path` so that `Rack::Sendfile` can intercept it
+# It supports `to_path` so that `Rack::Sendfile` can intercept it.
+# This class is deprecated and is going to be removed in zip_kit 7.x
+# @api deprecated
 class ZipKit::RackTempfileBody
   TEMPFILE_NAME_PREFIX = "zip-tricks-tf-body-"
   attr_reader :tempfile
data/lib/zip_kit/rails_streaming.rb
CHANGED
@@ -7,15 +7,12 @@ module ZipKit::RailsStreaming
   # the Rails response stream is going to be closed automatically.
   # @param filename[String] name of the file for the Content-Disposition header
   # @param type[String] the content type (MIME type) of the archive being output
+  # @param use_chunked_transfer_encoding[Boolean] whether to forcibly encode output as chunked. Normally you should not need this.
   # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
   # See {ZipKit::Streamer#initialize} for the full list of options.
-  # @
+  # @yieldparam [ZipKit::Streamer] the streamer that can be written to
   # @return [ZipKit::OutputEnumerator] The output enumerator assigned to the response body
-  def zip_kit_stream(filename: "download.zip", type: "application/zip", **zip_streamer_options, &zip_streaming_blk)
-    # The output enumerator yields chunks of bytes generated from ZipKit. Instantiating it
-    # first will also validate the Streamer options.
-    chunk_yielder = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
-
+  def zip_kit_stream(filename: "download.zip", type: "application/zip", use_chunked_transfer_encoding: false, **zip_streamer_options, &zip_streaming_blk)
     # We want some common headers for file sending. Rails will also set
     # self.sending_file = true for us when we call send_file_headers!
     send_file_headers!(type: type, filename: filename)
@@ -28,10 +25,39 @@ module ZipKit::RailsStreaming
       logger&.warn { "The downstream HTTP proxy/LB insists on HTTP/1.0 protocol, ZIP response will be buffered." }
     end
 
-    headers
-
-    # Set the "particular" streaming headers
+    headers = ZipKit::OutputEnumerator.streaming_http_headers
     response.headers.merge!(headers)
-
+
+    # The output enumerator yields chunks of bytes generated from the Streamer,
+    # with some buffering
+    output_enum = ZipKit::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
+
+    # Time for some branching, which mostly has to do with the 999 flavours of
+    # "how to make both Rails and Rack stream"
+    if self.class.ancestors.include?(ActionController::Live)
+      # If this controller includes Live it will not work correctly with a Rack
+      # response body assignment - we need to write into the Live output stream instead
+      begin
+        output_enum.each { |bytes| response.stream.write(bytes) }
+      ensure
+        response.stream.close
+      end
+    elsif use_chunked_transfer_encoding
+      # Chunked encoding may be forced if, for example, you _need_ to bypass Rack::ContentLength.
+      # Rack::ContentLength is normally not in a Rails middleware stack, but it might get
+      # introduced unintentionally - for example, "rackup" adds the ContentLength middleware for you.
+      # There is a recommendation to leave the chunked encoding to the app server, so that servers
+      # that support HTTP/2 can use native framing and not have to deal with the chunked encoding,
+      # see https://github.com/julik/zip_kit/issues/7
+      # But it is not to be excluded that a user may need to force the chunked encoding to bypass
+      # some especially pesky Rack middleware that just would not cooperate. Those include
+      # Rack::MiniProfiler and the above-mentioned Rack::ContentLength.
+      response.headers["Transfer-Encoding"] = "chunked"
+      self.response_body = ZipKit::RackChunkedBody.new(output_enum)
+    else
+      # Stream using a Rack body assigned to the ActionController response body, without
+      # doing explicit chunked encoding. See above for the reasoning.
+      self.response_body = output_enum
+    end
  end
end
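The `ActionController::Live` branch in this diff drains the enumerator into the live stream and closes the stream in an `ensure`, so it is closed even when ZIP generation raises mid-response. Here is a dependency-free sketch of that behaviour; `FakeLiveStream` and `drain_zip_chunks` are invented stand-ins for `response.stream` and the loop above:

```ruby
# Stand-in for an ActionController::Live response stream
class FakeLiveStream
  attr_reader :data

  def initialize
    @data = +""
    @closed = false
  end

  def write(bytes)
    raise IOError, "stream closed" if @closed
    @data << bytes
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Mirrors the shape of the Live branch: write every chunk, then always close
def drain_zip_chunks(chunk_enum, stream)
  chunk_enum.each { |bytes| stream.write(bytes) }
ensure
  # Close even when the enumerator raises, just like the `ensure` in the diff
  stream.close
end

stream = FakeLiveStream.new
drain_zip_chunks(["PK\x03\x04", "rest-of-zip"], stream)
```

Closing the stream is what signals end-of-response to the client; leaving it open is exactly the "response would hang" symptom the 6.2.1 changelog entry describes.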
data/lib/zip_kit/remote_io.rb
CHANGED
data/lib/zip_kit/size_estimator.rb
CHANGED
@@ -6,6 +6,8 @@ class ZipKit::SizeEstimator
 
   # Creates a new estimator with a Streamer object. Normally you should use
   # `estimate` instead an not use this method directly.
+  #
+  # @param streamer[ZipKit::Streamer]
   def initialize(streamer)
     @streamer = streamer
   end
@@ -22,7 +24,7 @@ class ZipKit::SizeEstimator
   #
   # @param kwargs_for_streamer_new Any options to pass to Streamer, see {Streamer#initialize}
   # @return [Integer] the size of the resulting archive, in bytes
-  # @
+  # @yieldparam [SizeEstimator] the estimator
   def self.estimate(**kwargs_for_streamer_new)
     streamer = ZipKit::Streamer.new(ZipKit::NullWriter, **kwargs_for_streamer_new)
     estimator = new(streamer)
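Note that `estimate` runs the real `Streamer` against `ZipKit::NullWriter`, so the size comes out of the same header/footer arithmetic as real output, just without writing anything. A tiny stand-in for that idea - a destination that only counts bytes - looks like this (the class name is invented):

```ruby
# Stand-in for a null writer: swallows bytes, remembers only the total size
class CountingNullWriter
  attr_reader :tell

  def initialize
    @tell = 0
  end

  def <<(bytes)
    @tell += bytes.bytesize
    self # returning self allows << chaining, like IO does
  end
end

writer = CountingNullWriter.new
writer << "local file header" << "file body" << "central directory"
```

Driving the real serialization code against such a destination gives an exact byte count (usable as `Content-Length`) at near-zero cost, since no compression or I/O happens.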
data/lib/zip_kit/streamer/heuristic.rb
CHANGED
@@ -1,5 +1,7 @@
 # frozen_string_literal: true
 
+require "zlib"
+
 # Will be used to pick whether to store a file in the `stored` or
 # `deflated` mode, by compressing the first N bytes of the file and
 # comparing the stored and deflated data sizes. If deflate produces
@@ -10,9 +12,7 @@
 # Heuristic will call either `write_stored_file` or `write_deflated_file`
 # on the Streamer passed into it once it knows which compression
 # method should be applied
-class ZipKit::Streamer::Heuristic
-  include ZipKit::WriteShovel
-
+class ZipKit::Streamer::Heuristic < ZipKit::Streamer::Writable
   BYTES_WRITTEN_THRESHOLD = 128 * 1024
   MINIMUM_VIABLE_COMPRESSION = 0.75
 
@@ -39,11 +39,6 @@ class ZipKit::Streamer::Heuristic
     self
   end
 
-  def write(bytes)
-    self << bytes
-    bytes.bytesize
-  end
-
   def close
     decide unless @winner
     @winner.close
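The stored-vs-deflated decision this class makes - deflate a sample of the input and keep compression only if it pays off - can be sketched with stdlib Zlib alone. The function name is invented; the ratio constant mirrors `MINIMUM_VIABLE_COMPRESSION` above:

```ruby
require "zlib"

# If deflating a sample shrinks it to <= 75% of the original,
# compression is worth it; otherwise store the file as-is.
MINIMUM_VIABLE_COMPRESSION = 0.75

def pick_storage_mode(sample_bytes)
  deflated = Zlib.deflate(sample_bytes)
  ratio = deflated.bytesize / sample_bytes.bytesize.to_f
  (ratio <= MINIMUM_VIABLE_COMPRESSION) ? :deflated : :stored
end

text_mode = pick_storage_mode("repetitive text " * 1_000) # compresses well
random_mode = pick_storage_mode(Random.new.bytes(128 * 1024)) # barely compresses
```

Already-compressed payloads (JPEGs, MP4s, other ZIPs) fall in the second bucket, where deflating again only wastes CPU and can even grow the output slightly.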
data/lib/zip_kit/streamer.rb
CHANGED
@@ -2,19 +2,19 @@
 
 require "set"
 
-# Is used to write
-#
-# of this object can be coupled directly to, say, a Rack output. The
-# output can also be a String, Array or anything that responds to `<<`.
+# Is used to write ZIP archives without having to read them back or to overwrite
+# data. It outputs into any object that supports `<<` or `write`, namely:
 #
-#
-#
+# An `Array`, `File`, `IO`, `Socket` and even `String` all can be output destinations
+# for the `Streamer`.
 #
-#
-#
+# You can also combine output through the `Streamer` with direct output to the destination,
+# all while preserving the correct offsets in the ZIP file structures. This allows usage
+# of `sendfile()` or socket `splice()` calls for "through" proxying.
 #
-#
-#
+# If you want to avoid data descriptors - or write data bypassing the Streamer -
+# you need to know the CRC32 (as a uint) and the filesize upfront,
+# before the writing of the entry body starts.
 #
 # ## Using the Streamer with runtime compression
 #
@@ -34,7 +34,7 @@ require "set"
 # end
 # end
 #
-# The central directory will be written automatically at the end of the block.
+# The central directory will be written automatically at the end of the `open` block.
 #
 # ## Using the Streamer with entries of known size and having a known CRC32 checksum
 #
@@ -169,7 +169,7 @@ class ZipKit::Streamer
   # @param uncompressed_size [Integer] the size of the entry when uncompressed, in bytes
   # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
   # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_deflated_entry(filename:, modification_time: Time.now.utc, compressed_size: 0, uncompressed_size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
     add_file_and_write_local_header(filename: filename,
@@ -193,7 +193,7 @@ class ZipKit::Streamer
   # @param size [Integer] the size of the file when uncompressed, in bytes
   # @param crc32 [Integer] the CRC32 checksum of the entry when uncompressed
   # @param use_data_descriptor [Boolean] whether the entry body will be followed by a data descriptor. When in use
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_stored_entry(filename:, modification_time: Time.now.utc, size: 0, crc32: 0, unix_permissions: nil, use_data_descriptor: false)
     add_file_and_write_local_header(filename: filename,
@@ -211,7 +211,7 @@ class ZipKit::Streamer
   #
   # @param dirname [String] the name of the directory in the archive
   # @param modification_time [Time] the modification time of the directory in the archive
-  # @param unix_permissions[
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
   # @return [Integer] the offset the output IO is at after writing the entry header
   def add_empty_directory(dirname:, modification_time: Time.now.utc, unix_permissions: nil)
     add_file_and_write_local_header(filename: dirname.to_s + "/",
@@ -262,13 +262,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     writable = ZipKit::Streamer::Heuristic.new(self, filename, modification_time: modification_time, unix_permissions: unix_permissions)
     yield_or_return_writable(writable, &blk)
@@ -313,13 +312,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_stored_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     add_stored_entry(filename: filename,
                      modification_time: modification_time,
@@ -373,13 +371,12 @@ class ZipKit::Streamer
   #
   # @param filename[String] the name of the file in the archive
   # @param modification_time [Time] the modification time of the file in the archive
-  # @param unix_permissions[
-  # @
-  # sink[#<<, #write]
+  # @param unix_permissions[Integer] which UNIX permissions to set, normally the default should be used
+  # @yieldparam sink[ZipKit::Streamer::Writable]
   # an object that the file contents must be written to.
   # Do not call `#close` on it - Streamer will do it for you. Write in chunks to achieve proper streaming
   # output (using `IO.copy_stream` is a good approach).
-  # @return [
+  # @return [ZipKit::Streamer::Writable] without a block - the Writable sink which has to be closed manually
   def write_deflated_file(filename, modification_time: Time.now.utc, unix_permissions: nil, &blk)
     add_deflated_entry(filename: filename,
                        modification_time: modification_time,
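The `write_file`/`write_stored_file`/`write_deflated_file` docs above all share one contract, implemented by `yield_or_return_writable`: with a block the sink is yielded and closed for you; without a block the sink is returned and the caller must close it manually. A dependency-free sketch of that pattern (the class is an invented stand-in for a Streamer writable):

```ruby
# Stand-in for a Streamer writable sink that must be closed exactly once
class SinkSketch
  def initialize
    @closed = false
  end

  def closed?
    @closed
  end

  def close
    @closed = true
  end
end

# With a block: yield the sink, close it even if the block raises.
# Without a block: hand the open sink back to the caller.
def yield_or_return_writable(writable, &blk)
  if blk
    begin
      yield(writable)
      writable
    ensure
      writable.close
    end
  else
    writable
  end
end

auto = yield_or_return_writable(SinkSketch.new) { |sink| } # closed for us
manual = yield_or_return_writable(SinkSketch.new)          # still open
```

Closing the sink is what finalizes the ZIP entry (flushes the deflater, writes the data descriptor), which is why the docs warn against calling `#close` yourself in the block form.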
data/lib/zip_kit/version.rb
CHANGED
data/lib/zip_kit/write_shovel.rb
CHANGED
@@ -13,8 +13,8 @@ module ZipKit::WriteShovel
   # Writes the given data to the output stream. Allows the object to be used as
   # a target for `IO.copy_stream(from, to)`
   #
-  # @param
-  # @return [Fixnum] the number of bytes written
+  # @param bytes[String] the binary string to write (part of the uncompressed file)
+  # @return [Fixnum] the number of bytes written (will always be the bytesize of `bytes`)
   def write(bytes)
     self << bytes
     bytes.bytesize