zip_tricks 5.3.1 → 5.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a9e1ad9dd7452add6a2f9187edecdd366699d81fba9260692ce2a96003401eca
4
- data.tar.gz: 1f2665b7c3a5c99ce190b5dab76fa9334ac4c6a3307928bab94758acd98bcb56
3
+ metadata.gz: 05e8eea8ecf1ad0b9b9cb132c54dde1abd60dc003afd1e5a1ce989786ce89608
4
+ data.tar.gz: fbdc2172fc3becefa4dd8720713acb11d838900465ac3cb42de38fec76fd0d90
5
5
  SHA512:
6
- metadata.gz: 8864d6d56796e49e2024e46aee7cd1ac1c8ea575d757a6fbe6fd62a6e77db94eef6dc5b80bdbeae5bb932536e889fbdab224642862452f5b1acc0020e8a8c906
7
- data.tar.gz: 67bc05db334c9a54804ac460f14f5919b9049913ee75532da090405f9a5b1f7e792ff4678aec0683a8b2a3136a9799ea3dd88aa3c9995491b28e4e6d88c4ac3a
6
+ metadata.gz: 449c59e898d2b54a089b60d7aebe7633d9f65ad64a5ce014a7a44e1683f6d6cec4997f732fd06746edeec96594d5041b68efcf19571ef92c9798acc05ffda7bd
7
+ data.tar.gz: 5ed26109e12373acfb9866531ef03c4302141bbb8a312f759ff768295c8d7926f650745f3f2624ad652333e57cd5bc5f98f932ace41504aeca45a668ef7a4f4a
@@ -4,8 +4,5 @@ rvm:
4
4
  - jruby-9.0
5
5
  sudo: false
6
6
  cache: bundler
7
- matrix:
8
- allow_failures:
9
- - rvm: jruby-9.0
10
7
  script:
11
8
  - bundle exec rake
@@ -1,3 +1,16 @@
1
+ ## 5.4.0
2
+
3
+ * Use block form for zlib Deflater calls to conserve memory
4
+ * Do not change string encoding in writer wrappers (avoid extra work)
5
+ * Fix a zlib deflater object being leaked per archived file
6
+ * Speed up streaming CRC32 computation
7
+ * When running tests, assign the port for the Puma server dynamically
8
+ * Reduce string allocations in the block deflate spec
9
+ * Make sure RemoteUncap specs run under JRuby correctly
10
+ * Replace Rails::Live streaming with iterable body streaming to avoid issues with Rails::Live across the board
11
+ * Remove `qa/` directory and scripts, as the tests for the library proper should now be sufficient
12
+ * Fix some documentation and sample code omissions and inconsistencies.
13
+
1
14
  ## 5.3.1
2
15
 
3
16
  * Fix extended timestamp timestamp value encoding. Previously we would use an incorrect encoding for the timestamp value, which would output correct but nonsensical timestamps. The pack specifier is now changed to output the correct value.
data/README.md CHANGED
@@ -24,11 +24,11 @@ to [32 bit sizes.](https://github.com/jruby/jruby/issues/3817)
24
24
 
25
25
  ## Diving in: send some large CSV reports from Rails
26
26
 
27
- The easiest is to use the Rails' built-in streaming feature:
27
+ The easiest is to include the `ZipTricks::RailsStreaming` module into your
28
+ controller.
28
29
 
29
30
  ```ruby
30
31
  class ZipsController < ActionController::Base
31
- include ActionController::Live # required for streaming
32
32
  include ZipTricks::RailsStreaming
33
33
 
34
34
  def download
@@ -49,6 +49,10 @@ class ZipsController < ActionController::Base
49
49
  end
50
50
  ```
51
51
 
52
+ If you want some more conveniences you can also use [zipline](https://github.com/fringd/zipline) which
53
+ will automatically process and stream attachments (Carrierwave, Shrine, ActiveStorage) and remote objects
54
+ via HTTP.
55
+
52
56
  ## Create a ZIP file without size estimation, compress on-the-fly during writes
53
57
 
54
58
  Basic use case is compressing on the fly. Some data will be buffered by the Zlib deflater, but
@@ -84,7 +88,7 @@ body = ZipTricks::RackBody.new do | zip |
84
88
  File.open('novel.txt', 'rb'){|source| IO.copy_stream(source, sink) }
85
89
  end
86
90
  end
87
- [200, {'Transfer-Encoding' => 'chunked'}, body]
91
+ [200, {}, body]
88
92
  ```
89
93
 
90
94
  ## Send a ZIP file of known size, with correct headers
@@ -174,9 +178,12 @@ that have not been formally verified (ours hasn't been).
174
178
  * Commit and push until you are happy with your contribution.
175
179
  * Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
176
180
  * Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.
177
- * If you alter the `ZipWriter`, please take the time to run the test in the `qa/` directory. Ensure that the generated (large) files open manually - see README_QA for more.
178
181
 
179
- ## Copyright
182
+ ## Copyright and license
183
+
184
+ Copyright (c) 2020 WeTransfer.
180
185
 
181
- Copyright (c) 2019 WeTransfer. `zip_tricks` is distributed under the conditions of the [Hippocratic License](https://firstdonoharm.dev/version/1/2/license.html)
182
- - See LICENSE.txt for further details.
186
+ `zip_tricks` is distributed under the conditions of the [Hippocratic License](https://firstdonoharm.dev/version/1/2/license.html)
187
+ See LICENSE.txt for further details. If this license is not acceptable for your use case we still maintain the 4.x version tree
188
+ which remains under the MIT license, see https://rubygems.org/gems/zip_tricks/versions for more information.
189
+ Note that we only backport some performance optimizations and crucial bugfixes but not the new features to that tree.
@@ -1,11 +1,25 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- # Stashes a block given by the Rack webserver when calling each() on a body, and calls
4
- # that block every time it is written to using :<< (shovel). Poses as an IO for rubyzip.
5
-
3
+ # Acts as a converter between callers which send data to the `#<<` method (such as all the ZipTricks
4
+ # writer methods, which push onto anything), and a given block. Every time `#<<` gets called on the BlockWrite,
5
+ # the block given to the constructor will be called with the same argument. ZipTricks uses this object
6
+ # when integrating with Rack and in the OutputEnumerator. Normally you wouldn't need to use it manually but
7
+ # you always can. BlockWrite will also ensure the binary string encoding is forced onto any string
8
+ # that passes through it.
9
+ #
10
+ # For example, you can create a Rack response body like so:
11
+ #
12
+ # class MyRackResponse
13
+ # def each
14
+ # writer = ZipTricks::BlockWrite.new {|chunk| yield(chunk) }
15
+ # writer << "Hello" << "world" << "!"
16
+ # end
17
+ # end
18
+ # [200, {}, MyRackResponse.new]
6
19
  class ZipTricks::BlockWrite
7
- # The block is the block given to each() of the Rack body, or other block you want
8
- # to receive the string chunks written by the zip compressor.
20
+ # Creates a new BlockWrite.
21
+ #
22
+ # @param block The block that will be called when this object receives the `<<` message
9
23
  def initialize(&block)
10
24
  @block = block
11
25
  end
@@ -17,26 +31,17 @@ class ZipTricks::BlockWrite
17
31
  end
18
32
  end
19
33
 
20
- # Every time this object gets written to, call the Rack body each() block
21
- # with the bytes given instead.
34
+ # Sends a string through to the block stored in the BlockWrite.
35
+ #
36
+ # @param buf[String] the string to write. Note that a zero-length String
37
+ # will not be forwarded to the block, as it has special meaning when used
38
+ # with chunked encoding (it indicates the end of the stream).
39
+ # @return self
22
40
  def <<(buf)
23
41
  # Zero-size output has a special meaning when using chunked encoding
24
42
  return if buf.nil? || buf.bytesize.zero?
25
43
 
26
- # Ensure we ALWAYS write in binary encoding.
27
- encoded =
28
- if buf.encoding != Encoding::BINARY
29
- # If we got a frozen string we can't force_encoding on it
30
- begin
31
- buf.force_encoding(Encoding::BINARY)
32
- rescue
33
- buf.dup.force_encoding(Encoding::BINARY)
34
- end
35
- else
36
- buf
37
- end
38
-
39
- @block.call(encoded)
44
+ @block.call(buf.b)
40
45
  self
41
46
  end
42
47
  end
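The hunk above swaps a `force_encoding` call wrapped in `begin`/`rescue` for a single `buf.b` call. A minimal sketch of why that works, using only core `String` methods (the variable names are illustrative and do not come from the gem):

```ruby
# A frozen, non-ASCII UTF-8 string, similar to what a caller might push
# into a writer when frozen string literals are enabled.
source = "h\u00e9llo".freeze

# force_encoding mutates its receiver, so it raises on a frozen string.
# FrozenError is a subclass of RuntimeError on Rubies that define it.
mutation_failed =
  begin
    source.force_encoding(Encoding::BINARY)
    false
  rescue RuntimeError
    true
  end

# String#b returns a new, unfrozen copy tagged as binary (ASCII-8BIT)
# and leaves the receiver alone, so no rescue branch is needed.
binary = source.b
```

The bytes are identical in both strings; only the encoding tag on the copy differs.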
@@ -19,7 +19,7 @@ require 'stringio'
19
19
  # ## Usage
20
20
  #
21
21
  # File.open('zipfile.zip', 'rb') do |f|
22
- # entries = FileReader.read_zip_structure(f)
22
+ # entries = ZipTricks::FileReader.read_zip_structure(io: f)
23
23
  # entries.each do |e|
24
24
  # File.open(e.filename, 'wb') do |extracted_file|
25
25
  # ex = e.extractor_from(f)
@@ -281,7 +281,7 @@ class ZipTricks::FileReader
281
281
  seek(io, next_local_header_offset) # Seek to the next entry, and raise if seek is impossible
282
282
  end
283
283
  entries
284
- rescue ReadError
284
+ rescue ReadError, RangeError # RangeError is raised if offset exceeds int32/int64 range
285
285
  log do
286
286
  'Got a read/seek error after reaching %<cur_offset>d, no more entries can be recovered' %
287
287
  {cur_offset: cur_offset}
@@ -365,7 +365,7 @@ class ZipTricks::FileReader
365
365
  # (read starting at this offset to get the data).
366
366
  #
367
367
  # @param io[#seek, #read] an IO-ish object the ZIP file can be read from
368
- # @param local_header_offset[Fixnum] absolute offset (0-based) where the
368
+ # @param local_file_header_offset[Fixnum] absolute offset (0-based) where the
369
369
  # local file header is supposed to begin @return [Fixnum] absolute offset
370
370
  # (0-based) of where the compressed data begins for this file within the ZIP
371
371
  def get_compressed_data_offset(io:, local_file_header_offset:)
@@ -377,7 +377,7 @@ class ZipTricks::FileReader
377
377
  # Parse an IO handle to a ZIP archive into an array of Entry objects, reading from the end
378
378
  # of the IO object.
379
379
  #
380
- # @see {#read_zip_structure}
380
+ # @see #read_zip_structure
381
381
  # @param options[Hash] any options the instance method of the same name accepts
382
382
  # @return [Array<ZipEntry>] an array of entries within the ZIP being parsed
383
383
  def self.read_zip_structure(**options)
@@ -387,7 +387,7 @@ class ZipTricks::FileReader
387
387
  # Parse an IO handle to a ZIP archive into an array of Entry objects, reading from the start of
388
388
  # the file and parsing local file headers one-by-one
389
389
  #
390
- # @see {#read_zip_straight_ahead}
390
+ # @see #read_zip_straight_ahead
391
391
  # @param options[Hash] any options the instance method of the same name accepts
392
392
  # @return [Array<ZipEntry>] an array of entries within the ZIP being parsed
393
393
  def self.read_zip_straight_ahead(**options)
@@ -4,7 +4,7 @@
4
4
  # write operations, but want to discard the data (like when
5
5
  # estimating the size of a ZIP)
6
6
  module ZipTricks::NullWriter
7
- # @param data[String] the data to write
7
+ # @param _[String] the data to write
8
8
  # @return [self]
9
9
  def self.<<(_)
10
10
  self
@@ -32,6 +32,10 @@
32
32
  # conflict is avoided. This is not possible to apply to directories, because when one of the
33
33
  # path components is reused in multiple filenames it means those entities should end up in
34
34
  # the same directory (subdirectory) once the archive is opened.
35
+ #
36
+ # The `PathSet` keeps track of entries as they get added using 2 Sets (cheap presence checks),
37
+ # one for directories and one for files. It will raise a `Conflict` exception if there are
38
+ # files clobbering one another, or in case files collide with directories.
35
39
  class ZipTricks::PathSet
36
40
  class Conflict < StandardError
37
41
  end
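The comment added in this hunk describes tracking directories and files in two Sets and raising on collisions. A rough sketch of that idea follows; `TinyPathSet` and its method names are hypothetical and are not the gem's actual `PathSet` API:

```ruby
require 'set'

# Hypothetical, simplified stand-in for the behaviour described above.
class TinyPathSet
  class Conflict < StandardError; end

  def initialize
    @dirs = Set.new
    @files = Set.new
  end

  # Registers a directory and every ancestor ("a", "a/b", ...).
  # Raises if any of those components is already known as a file.
  def add_directory(path)
    parts = path.split('/').reject(&:empty?)
    (1..parts.length).each do |n|
      dir = parts.first(n).join('/')
      raise Conflict, "#{dir} is already a file" if @files.include?(dir)
      @dirs << dir
    end
    self
  end

  # Registers a file along with its parent directories.
  # Raises if the file clobbers another file or collides with a directory.
  def add_file(path)
    parts = path.split('/').reject(&:empty?)
    name = parts.join('/')
    add_directory(parts[0...-1].join('/')) if parts.length > 1
    raise Conflict, "#{name} is already a file" if @files.include?(name)
    raise Conflict, "#{name} is already a directory" if @dirs.include?(name)
    @files << name
    self
  end
end
```

Because both collections are Sets, each presence check is a hash lookup rather than a scan over all entries added so far.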
@@ -6,17 +6,16 @@ module ZipTricks::RailsStreaming
6
6
  # Opens a {ZipTricks::Streamer} and yields it to the caller. The output of the streamer
7
7
  # gets automatically forwarded to the Rails response stream. When the output completes,
8
8
  # the Rails response stream is going to be closed automatically.
9
+ # @param zip_streamer_options[Hash] options that will be passed to the Streamer.
10
+ # See {ZipTricks::Streamer#initialize} for the full list of options.
9
11
  # @yield [Streamer] the streamer that can be written to
10
- def zip_tricks_stream
12
+ # @return [ZipTricks::OutputEnumerator] The output enumerator assigned to the response body
13
+ def zip_tricks_stream(**zip_streamer_options, &zip_streaming_blk)
11
14
  # Set a reasonable content type
12
15
  response.headers['Content-Type'] = 'application/zip'
13
16
  # Make sure nginx buffering is suppressed - see https://github.com/WeTransfer/zip_tricks/issues/48
14
17
  response.headers['X-Accel-Buffering'] = 'no'
15
- # Create a wrapper for the write call that quacks like something you
16
- # can << to, used by ZipTricks
17
- w = ZipTricks::BlockWrite.new { |chunk| response.stream.write(chunk) }
18
- ZipTricks::Streamer.open(w) { |z| yield(z) }
19
- ensure
20
- response.stream.close
18
+ response.sending_file = true
19
+ self.response_body = ZipTricks::OutputEnumerator.new(**zip_streamer_options, &zip_streaming_blk)
21
20
  end
22
21
  end
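With this change the controller no longer writes to `response.stream`. It assigns an object that responds to `each` as the response body and lets the web server pull chunks from it. A gem-free sketch of that pattern is below; `BlockWriteSketch` and `LazyBody` are hypothetical names standing in for `ZipTricks::BlockWrite` and `ZipTricks::OutputEnumerator`, and their internals are assumptions rather than the gem's code:

```ruby
# Hypothetical stand-in for a `<<`-to-block adapter. Unlike the code in
# the diff, this sketch returns self for skipped writes so chaining works.
class BlockWriteSketch
  def initialize(&block)
    @block = block
  end

  def <<(buf)
    # Skip empty writes: a zero-length chunk ends a chunked response.
    return self if buf.nil? || buf.bytesize.zero?
    @block.call(buf.b)
    self
  end
end

# Hypothetical iterable body: nothing is produced until the server
# calls #each, and every chunk is handed over as soon as it is written.
class LazyBody
  def initialize(&writer_block)
    @writer_block = writer_block
  end

  def each
    return enum_for(:each) unless block_given?
    sink = BlockWriteSketch.new { |chunk| yield(chunk) }
    @writer_block.call(sink)
  end
end

body = LazyBody.new { |sink| sink << 'first' << '' << 'second' }
chunks = body.each.to_a
```

A Rack application could return such an object as the third element of its response triplet, which is the shape the README hunk above uses with `[200, {}, body]`.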
@@ -27,7 +27,7 @@ class ZipTricks::StreamCRC32
27
27
 
28
28
  # Creates a new streaming CRC32 calculator
29
29
  def initialize
30
- @crc = Zlib.crc32('')
30
+ @crc = Zlib.crc32
31
31
  end
32
32
 
33
33
  # Append data to the CRC32. Updates the contained CRC32 value in place.
@@ -35,7 +35,7 @@ class ZipTricks::StreamCRC32
35
35
  # @param blob[String] the string to compute the CRC32 from
36
36
  # @return [self]
37
37
  def <<(blob)
38
- @crc = Zlib.crc32_combine(@crc, Zlib.crc32(blob), blob.bytesize)
38
+ @crc = Zlib.crc32(blob, @crc)
39
39
  self
40
40
  end
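The replacement relies on `Zlib.crc32` accepting a running checksum as its second argument, so each chunk is folded in with one call instead of a `crc32` followed by `crc32_combine`. A small sketch comparing both approaches (the chunk contents are arbitrary sample data):

```ruby
require 'zlib'

chunks = ['Hello, ', 'streaming ', 'CRC32!']

# 5.3.1 style: checksum each chunk separately, then combine.
combined = Zlib.crc32('')
chunks.each do |chunk|
  combined = Zlib.crc32_combine(combined, Zlib.crc32(chunk), chunk.bytesize)
end

# 5.4.0 style: feed the running checksum back into Zlib.crc32.
running = Zlib.crc32
chunks.each { |chunk| running = Zlib.crc32(chunk, running) }

# Reference value computed over the concatenated data in one go.
one_shot = Zlib.crc32(chunks.join)
```

All three values agree, which is why the shorter form can replace the older one without changing the checksums written into the archive.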
41
41
 
@@ -131,7 +131,7 @@ class ZipTricks::Streamer
131
131
  # end
132
132
  #
133
133
  # @param kwargs_for_new [Hash] keyword arguments for {Streamer.new}
134
- # @return [Enumerator] the enumerator you can read bytestrings of the ZIP from using `each`
134
+ # @return [ZipTricks::OutputEnumerator] the enumerator you can read bytestrings of the ZIP from by calling `each`
135
135
  def self.output_enum(**kwargs_for_new, &zip_streamer_block)
136
136
  ZipTricks::OutputEnumerator.new(**kwargs_for_new, &zip_streamer_block)
137
137
  end
@@ -4,13 +4,6 @@
4
4
  # registers data passing through it in a CRC32 checksum calculator. Is made to be completely
5
5
  # interchangeable with the StoredWriter in terms of interface.
6
6
  class ZipTricks::Streamer::DeflatedWriter
7
- # After how many bytes of incoming data the deflater for the
8
- # contents must be flushed. This is done to prevent unreasonable
9
- # memory use when archiving large files, and to ensure we write to
10
- # the socket often enough while still maintaining acceptable
11
- # compression
12
- FLUSH_EVERY_N_BYTES = 1024 * 1024 * 5
13
-
14
7
  # The amount of bytes we will buffer before computing the intermediate
15
8
  # CRC32 checksums. Benchmarks show that the optimum is 64KB (see
16
9
  # `bench/buffered_crc32_bench.rb), if that is exceeded Zlib is going
@@ -18,11 +11,9 @@ class ZipTricks::Streamer::DeflatedWriter
18
11
  CRC32_BUFFER_SIZE = 64 * 1024
19
12
 
20
13
  def initialize(io)
21
- @compressed_io = ZipTricks::WriteAndTell.new(io)
22
- @uncompressed_size = 0
14
+ @compressed_io = io
23
15
  @deflater = ::Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -::Zlib::MAX_WBITS)
24
16
  @crc = ZipTricks::WriteBuffer.new(ZipTricks::StreamCRC32.new, CRC32_BUFFER_SIZE)
25
- @bytes_since_last_flush = 0
26
17
  end
27
18
 
28
19
  # Writes the given data into the deflater, and flushes the deflater
@@ -31,13 +22,8 @@ class ZipTricks::Streamer::DeflatedWriter
31
22
  # @param data[String] data to be written
32
23
  # @return self
33
24
  def <<(data)
34
- @uncompressed_size += data.bytesize
35
- @bytes_since_last_flush += data.bytesize
36
- @compressed_io << @deflater.deflate(data)
25
+ @deflater.deflate(data) { |chunk| @compressed_io << chunk }
37
26
  @crc << data
38
-
39
- interim_flush
40
-
41
27
  self
42
28
  end
43
29
 
@@ -45,18 +31,11 @@ class ZipTricks::Streamer::DeflatedWriter
45
31
  # compressed data written and the CRC32 checksum. The return value
46
32
  # can be directly used as the argument to {Streamer#update_last_entry_and_write_data_descriptor}
47
33
  #
48
- # @param data[String] data to be written
49
34
  # @return [Hash] a hash of `{crc32, compressed_size, uncompressed_size}`
50
35
  def finish
51
36
  @compressed_io << @deflater.finish until @deflater.finished?
52
- {crc32: @crc.to_i, compressed_size: @compressed_io.tell, uncompressed_size: @uncompressed_size}
53
- end
54
-
55
- private
56
-
57
- def interim_flush
58
- return if @bytes_since_last_flush < FLUSH_EVERY_N_BYTES
59
- @compressed_io << @deflater.flush
60
- @bytes_since_last_flush = 0
37
+ {crc32: @crc.to_i, compressed_size: @deflater.total_out, uncompressed_size: @deflater.total_in}
38
+ ensure
39
+ @deflater.close
61
40
  end
62
41
  end
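The rewritten writer drops its own byte counters and the periodic flush. Instead it uses the block form of `Zlib::Deflate#deflate`, reads the sizes from `total_in` and `total_out`, and closes the deflater in an `ensure`. A self-contained sketch of that flow, with a plain binary `String` standing in for the output IO and an arbitrary sample payload:

```ruby
require 'zlib'

payload = 'It was the best of times. ' * 20_000
compressed = String.new # empty binary (ASCII-8BIT) buffer
deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)

sizes =
  begin
    (0...payload.bytesize).step(64 * 1024) do |offset|
      slice = payload.byteslice(offset, 64 * 1024)
      # Block form: output is yielded chunk by chunk as it is produced,
      # rather than returned as one large String per call.
      deflater.deflate(slice) { |chunk| compressed << chunk }
    end
    compressed << deflater.finish until deflater.finished?
    # The zlib stream already tracks how many bytes went in and out.
    {compressed_size: deflater.total_out, uncompressed_size: deflater.total_in}
  ensure
    # Release the native zlib state right away instead of waiting for GC.
    deflater.close
  end

# Round-trip check using a raw (headerless) inflate.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
restored = inflater.inflate(compressed)
inflater.close
```

The sizes have to be read before `close`, which is why the hash is built inside the `begin` body and the `ensure` clause only releases the stream.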
@@ -28,7 +28,6 @@ class ZipTricks::Streamer::StoredWriter
28
28
  # Returns the amount of data written and the CRC32 checksum. The return value
29
29
  # can be directly used as the argument to {Streamer#update_last_entry_and_write_data_descriptor}
30
30
  #
31
- # @param data[String] data to be written
32
31
  # @return [Hash] a hash of `{crc32, compressed_size, uncompressed_size}`
33
32
  def finish
34
33
  {crc32: @crc.to_i, compressed_size: @io.tell, uncompressed_size: @io.tell}
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module ZipTricks
4
- VERSION = '5.3.1'
4
+ VERSION = '5.4.0'
5
5
  end
@@ -10,9 +10,8 @@ class ZipTricks::WriteAndTell
10
10
 
11
11
  def <<(bytes)
12
12
  return self if bytes.nil?
13
- binary_bytes = binary(bytes)
14
- @io << binary_bytes
15
- @pos += binary_bytes.bytesize
13
+ @io << bytes.b
14
+ @pos += bytes.bytesize
16
15
  self
17
16
  end
18
17
 
@@ -23,13 +22,4 @@ class ZipTricks::WriteAndTell
23
22
  def tell
24
23
  @pos
25
24
  end
26
-
27
- private
28
-
29
- def binary(str)
30
- return str if str.encoding == Encoding::BINARY
31
- str.force_encoding(Encoding::BINARY)
32
- rescue RuntimeError # the string is frozen
33
- str.dup.force_encoding(Encoding::BINARY)
34
- end
35
25
  end
@@ -195,7 +195,7 @@ class ZipTricks::ZipWriter
195
195
  [TWO_BYTE_MAX_UINT].pack(C_UINT2)
196
196
  else
197
197
  [0].pack(C_UINT2)
198
- end
198
+ end
199
199
  io << [0].pack(C_UINT2) # internal file attributes 2 bytes
200
200
 
201
201
  # Because the add_empty_directory method will create a directory with a trailing "/",
@@ -11,7 +11,7 @@ Gem::Specification.new do |spec|
11
11
  spec.licenses = ['MIT (Hippocratic)']
12
12
  spec.summary = 'Stream out ZIP files from Ruby'
13
13
  spec.description = 'Stream out ZIP files from Ruby'
14
- spec.homepage = 'http://github.com/wetransfer/zip_tricks'
14
+ spec.homepage = 'https://github.com/wetransfer/zip_tricks'
15
15
 
16
16
  # Prevent pushing this gem to RubyGems.org.
17
17
  # To allow pushes either set the 'allowed_push_host'
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: zip_tricks
3
3
  version: !ruby/object:Gem::Version
4
- version: 5.3.1
4
+ version: 5.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Julik Tarkhanov
@@ -11,7 +11,7 @@ authors:
11
11
  autorequire:
12
12
  bindir: exe
13
13
  cert_chain: []
14
- date: 2020-06-16 00:00:00.000000000 Z
14
+ date: 2020-11-19 00:00:00.000000000 Z
15
15
  dependencies:
16
16
  - !ruby/object:Gem::Dependency
17
17
  name: bundler
@@ -262,17 +262,8 @@ files:
262
262
  - lib/zip_tricks/write_and_tell.rb
263
263
  - lib/zip_tricks/write_buffer.rb
264
264
  - lib/zip_tricks/zip_writer.rb
265
- - qa/README_QA.md
266
- - qa/generate_test_files.rb
267
- - qa/in/VTYL8830.jpg
268
- - qa/in/war-and-peace.txt
269
- - qa/support.rb
270
- - qa/test-report-2016-07-28.txt
271
- - qa/test-report-2016-12-12.txt
272
- - qa/test-report-2017-04-2.txt
273
- - qa/test-report.txt
274
265
  - zip_tricks.gemspec
275
- homepage: http://github.com/wetransfer/zip_tricks
266
+ homepage: https://github.com/wetransfer/zip_tricks
276
267
  licenses:
277
268
  - MIT (Hippocratic)
278
269
  metadata:
@@ -1,16 +0,0 @@
1
- ## Manual testing harness for ZipTricks
2
-
3
- These tests will generate **very large** files that test various edge cases of ZIP generation. The idea is to generate
4
- these files and to then try to open them with the unarchiver applications we support. The workflow is as follows:
5
-
6
-
7
- 1. Configure your storage to have `zip_tricks` directory linked into your virtual machines and to be on a fast volume (SSD RAID0 is recommended)
8
- 2. Run `generate_test_files.rb`. This will take some time and produce a number of large ZIP files.
9
- 3. Open them with the following ZIP unarchivers:
10
- * A recent version of `zipinfo` with the `-tlhvz` flags - to see the information about the file
11
- * ArchiveUtility on OSX
12
- * The Unarchiver on OSX
13
- * Built-in Explorer on Windows 7
14
- * 7Zip 9.20 on Windows 7
15
- * Any other unarchivers you consider necessary
16
- * Write down your observations in `test-report.txt` and, when cutting a release, timestamp a copy of that file.