zip_tricks 2.8.0 → 2.8.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 84302e70bc873418b8a458456a75e0da7b5bc67d
4
- data.tar.gz: 06f81fc860d5cf77fdbd4b570602ae4984297d43
3
+ metadata.gz: 05fb68904433a8f06dd50d668c29d3c0cf8780f2
4
+ data.tar.gz: af0dfe83697cc9cd6bb4ed1a8714cffbb3ceedd7
5
5
  SHA512:
6
- metadata.gz: 932fe6e3a095f43996505c642fce11009efba4298302e958f5fdf7649cd3ca9fd28ab1de6a33d1e20be2955a7efd58b4d432486d3004ab70437a55413c27934d
7
- data.tar.gz: b9d43dbf16156cd3c877a8353dd0dd4a34e4c8b2e43664e34c4c3f2562c3a9937390bf398b0278449bf338f89337c348f30da637414e61b17eab58a939b05fc5
6
+ metadata.gz: 4b583b966eb87b502f428bd43d7bd9d9c800112bb3c256aa63a8cda6dcfb02052cf653de2df4022e23f49db37e157d81d336b0db8c8527c88f6e2de2e8974172
7
+ data.tar.gz: 22f263d209256d28c552c6fe04b9981d1181eb7f062233f7bc407146987faf2d1f14db9b8902ddf89d4038680839971291c5e47d8b57cbc8ed248a9360bf431b
@@ -0,0 +1,106 @@
1
+ # Implementation details
2
+
3
+ The ZipTricks streaming implementation is designed around the following requirements:
4
+
5
+ * Only ahead-writes (no IO seek or rewind)
6
+ * Automatic switching to Zip64 as the files get written (no IO seeks), but not requiring Zip64 support if the archive can do without
7
+ * Make use of the fact that CRC32 checksums and the sizes of the files (compressed _and_ uncompressed) are known upfront
8
+
9
+ It strives to be compatible with the following unzip programs _at the minimum:_
10
+
11
+ * OSX - builtin ArchiveUtility (except the Zip64 support when files larger than 4GB are in the archive)
12
+ * OSX - The Unarchiver, at least 3.10.1
13
+ * Windows 7 - built-in Explorer zip browser (except for Unicode filenames which it just doesn't support)
14
+ * Windows 7 - 7Zip 9.20
15
+
16
+ Below is the list of _specific_ decisions taken when writing the implementation, with an explanation for each.
17
+ We specifically _omit_ a number of things that we could do, but that are not necessary to satisfy our objectives.
18
+ The omissions are _intentional_ since we do not want to have things of which we _assume_ they work, or have things
19
+ that work only for one obscure unarchiver in one obscure case (like WinRAR with Chinese filenames).
20
+
21
+ ## Data descriptors (postfix CRC32/file sizes)
22
+
23
+ Data descriptors permit you to generate "postfix" ZIP files (where you write the local file header without having to
24
+ know the CRC32 and the file size upfront, then write the compressed file data, and only then - once you know what your CRC32,
25
+ compressed and uncompressed sizes are - write them into a data descriptor that follows the file data).
26
+
27
+ The streamer does _not_ use data descriptors, because their use [is problematic](https://github.com/thejoshwolfe/yazl/issues/13)
28
+ with the 7Zip version that we want to support. Or rather - not the use of data descriptors themselves, but the use of the GP flag
29
+ bit 3 that trips up that version of 7Zip. If we were to use data descriptors, we would have to up the minimum supported version
30
+ of 7Zip.
31
+
32
+ That means, in turn, that **to use the ZipTricks streamer you have to know the CRC32 and the sizes of the compressed/uncompressed
33
+ file upfront.** So you have to precompute them in some way. To do that, you can use `BlockDeflate` to precompress the file in
34
+ parallel, and `StreamCRC32` to compute the CRC checksum, before feeding them to the ZIP writer.
35
+
36
+ This approach might be reconsidered in the future.
37
+
38
+ For more info see https://github.com/thejoshwolfe/yazl#general-purpose-bit-flag
39
+
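The requirement above means the CRC32 and both sizes must be known before the local file header is written. The following is a minimal sketch of such a precomputation pass. It uses only Ruby's standard `zlib` library rather than `BlockDeflate` and `StreamCRC32`, whose exact interfaces are not shown here; the helper name `precompute_entry_stats` and its `chunk_size` parameter are hypothetical.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of ZipTricks): reads an IO in chunks and
# returns the CRC32 plus the uncompressed and raw-deflate compressed sizes.
def precompute_entry_stats(io, chunk_size: 64 * 1024)
  crc = 0
  uncompressed_size = 0
  compressed_size = 0
  # Negative window bits request a raw deflate stream (no zlib header),
  # which is the form ZIP uses for storage mode 8 (deflated).
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  while (chunk = io.read(chunk_size))
    crc = Zlib.crc32(chunk, crc)
    uncompressed_size += chunk.bytesize
    compressed_size += deflater.deflate(chunk).bytesize
  end
  compressed_size += deflater.finish.bytesize
  deflater.close
  { crc32: crc, compressed_size: compressed_size, uncompressed_size: uncompressed_size }
end

stats = precompute_entry_stats(StringIO.new('hello world' * 1000))
```

The resulting hash carries the three values that the local file header needs. A real pipeline would also have to keep the compressed bytes somewhere (or compress twice), which this sketch deliberately leaves out.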
40
+ ## Zip64 support
41
+
42
+ Zip64 support switches on _by itself_, automatically, when _any_ of the following conditions is met:
43
+
44
+ * The start of the central directory lies beyond the 4GB limit
45
+ * The ZIP archive has more than 65535 files added to it
46
+ * Any entry is present whose compressed _or_ uncompressed size is above 4GB
47
+
48
+ When writing out local file headers, the Zip64 extra field (and related changes to the standard fields) is
49
+ _only_ written if one of the file sizes is larger than 4GB. Otherwise the Zip64 extra will _only_ be
50
+ written in the central directory entry, but not in the local file header.
51
+
52
+ This has to do with the fact that otherwise we would write Zip64 extra fields for all local file headers,
53
+ regardless of whether the file actually requires Zip64 or not. That might impede some older tools from reading
54
+ the archive, which is a problem you don't want to have if your archive otherwise fits perfectly below all
55
+ the Zip64 thresholds.
56
+
57
+ To be compatible with Windows 7 built-in tools, the Zip64 extra field _must_ be written as _the first_ extra
58
+ field; any other extra fields should come after.
59
+
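The three archive-level conditions and the per-entry rule described in this section can be sketched as follows. This is an illustration under assumptions, not the library's code: `SketchEntry` and `archive_requires_zip64?` are hypothetical names, and the constant values (0xFFFFFFFF and 0xFFFF) are inferred from the limits stated above.

```ruby
FOUR_BYTE_MAX_UINT = 0xFFFFFFFF # assumed value of the 4-byte field limit
TWO_BYTE_MAX_UINT = 0xFFFF      # assumed value of the 2-byte field limit

# Hypothetical stand-in for an archive entry.
SketchEntry = Struct.new(:compressed_size, :uncompressed_size) do
  # Per-entry rule: only this decides whether the local file header
  # gets a Zip64 extra field.
  def requires_zip64?
    compressed_size > FOUR_BYTE_MAX_UINT || uncompressed_size > FOUR_BYTE_MAX_UINT
  end
end

# Archive-level rule: any one of the three conditions switches Zip64 on.
def archive_requires_zip64?(entries, start_of_central_directory)
  start_of_central_directory > FOUR_BYTE_MAX_UINT ||
    entries.length > TWO_BYTE_MAX_UINT ||
    entries.any?(&:requires_zip64?)
end

small = SketchEntry.new(5, 8)
huge = SketchEntry.new(90_981, 0xFFFFFFFF + 1)
```

Note that an archive can need Zip64 (for example because the central directory starts past the 4GB mark) while none of its individual entries do, which is why the two checks are kept separate.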
60
+ ## International filename support and the Info-ZIP extra field
61
+
62
+ If a diacritic-containing character (such as å) does fit into the DOS-437
63
+ codepage, it should be encodable as such. This would, in theory, let older Windows tools
64
+ decode the filename correctly. However, this kills the filename decoding for the OSX builtin
65
+ archive utility (it assumes the filename to be UTF-8, regardless). So if we allow filenames
66
+ to be encoded in DOS-437, we _potentially_ have support in Windows but we upset everyone on Mac.
67
+ If we just use UTF-8 and set the right EFS bit in general purpose flags, we upset Windows users
68
+ because most of the Windows unarchive tools (at least the builtin ones) do not give a flying eff
69
+ about the EFS support bit being set.
70
+
71
+ Additionally, if we use Unarchiver on OSX (which is our recommended unpacker for large files),
72
+ it will (very rightfully) ask us how we should decode each filename that does not have the EFS bit,
73
+ but does contain something non-ASCII-decodable. This is horrible UX for users.
74
+
75
+ So, basically, we have 2 choices, for filenames containing diacritics (for bona-fide UTF-8 you do not
76
+ even get those choices, you _have_ to use UTF-8):
77
+
78
+ * Make life easier for Windows users by setting stuff to DOS, not care about the standard _and_ make
79
+ most Mac users upset
80
+ * Make life easy for Mac users and conform to the standard, and tell Windows users to get a _decent_
81
+ ZIP unarchiving tool.
82
+
83
+ We are going with option 2, and this is well-thought-out. Trust me. If you want the crazytown
84
+ filename encoding scheme that is described here http://stackoverflow.com/questions/13261347
85
+ you can try this:
86
+
87
+ [Encoding::CP437, Encoding::ISO_8859_1, Encoding::UTF_8]
88
+
89
+ We don't want no such thing, and sorry Windows users, you are going to need a decent unarchiver
90
+ that honors the standard. Alas, alas.
91
+
92
+ Additionally, the tests with the unarchivers we _do_ support have shown that including the InfoZIP
93
+ extra field does not actually help any of them recognize the file name correctly. And the use of
94
+ those fields for the UTF-8 filename, per spec, tells us we should not set the EFS bit - which ruins
95
+ the unarchiving for all other solutions. As any other, this decision may be changed in the future.
96
+
97
+ There are some interesting notes about the Info-ZIP/EFS combination here
98
+ https://commons.apache.org/proper/commons-compress/zip.html
99
+
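The chosen option (always UTF-8, with the EFS bit set whenever the name is not plain ASCII) can be sketched like this. The helper name `gp_flags_for` and the constant `EFS_FLAG` are hypothetical, and the sketch uses `String#ascii_only?` as a stand-in for whatever check the library itself performs.

```ruby
EFS_FLAG = 0b100000000000 # bit 11 of the general purpose flags, i.e. 2048

# Hypothetical helper: treats the bytes of the filename as UTF-8 regardless
# of how the string was tagged, and sets the EFS bit only when the name
# contains something outside of ASCII.
def gp_flags_for(filename)
  name = filename.dup.force_encoding(Encoding::UTF_8)
  name.ascii_only? ? 0 : EFS_FLAG
end

ascii_flags = gp_flags_for('first-file.bin')
diacritic_flags = gp_flags_for('Kungälv')
# A UTF-8 name that was mislabeled as US-ASCII still gets the bit set
mislabeled_flags = gp_flags_for('файл.bin'.dup.force_encoding(Encoding::US_ASCII))
```

Retagging the string as UTF-8 before the check is what makes the mislabeled case work: the decision is based on the bytes, not on the encoding label the caller happened to pass in.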
100
+ ## Directory support
101
+
102
+ ZIP makes it possible to store empty directories (folders). For our purposes, however, we are going
103
+ to store only the files. If you store a file, called, say, `docs/item.doc` then the unarchiver will
104
+ automatically create the `docs` directory if it doesn't exist already. You can also store an entry
105
+ with a length of 0 and set its external attributes to be an empty directory, but we do not need
106
+ that functionality - so it is also omitted.
@@ -13,6 +13,7 @@ class ZipTricks::Microzip
13
13
  DEFLATED = 8
14
14
 
15
15
  TooMuch = Class.new(StandardError)
16
+ PathError = Class.new(StandardError)
16
17
  DuplicateFilenames = Class.new(StandardError)
17
18
  UnknownMode = Class.new(StandardError)
18
19
 
@@ -42,21 +43,14 @@ class ZipTricks::Microzip
42
43
  C_v = 'v'.freeze
43
44
  C_Qe = 'Q<'.freeze
44
45
 
45
- module Bytesize
46
- def bytesize_of
47
- ''.force_encoding(Encoding::BINARY).tap {|b| yield(b) }.bytesize
48
- end
49
- end
50
- include Bytesize
51
-
52
46
  class Entry < Struct.new(:filename, :crc32, :compressed_size, :uncompressed_size, :storage_mode, :mtime)
53
- include Bytesize
54
47
  def initialize(*)
55
48
  super
49
+ filename.force_encoding(Encoding::UTF_8)
50
+ @requires_efs_flag = !(filename.encode(Encoding::ASCII) rescue false)
56
51
  @requires_zip64 = (compressed_size > FOUR_BYTE_MAX_UINT || uncompressed_size > FOUR_BYTE_MAX_UINT)
57
- if filename.bytesize > TWO_BYTE_MAX_UINT
58
- raise TooMuch, "The given filename is too long to fit (%d bytes)" % filename.bytesize
59
- end
52
+ raise TooMuch, "Filename is too long" if filename.bytesize > TWO_BYTE_MAX_UINT
53
+ raise PathError, "Paths in ZIP may only contain forward slashes (UNIX separators)" if filename.include?('\\')
60
54
  end
61
55
 
62
56
  def requires_zip64?
@@ -67,41 +61,8 @@ class ZipTricks::Microzip
67
61
  # bit (bit 11) which should be set if the filename is UTF8. If it is, we need to set the
68
62
  # bit so that the unarchiving application knows that the filename in the archive is UTF-8
69
63
  # encoded, and not some DOS default. For ASCII entries it does not matter.
70
- #
71
- # Now, strictly speaking, if a diacritic-containing character (such as å) does fit into the DOS-437
72
- # codepage, it should be encodable as such. This would, in theory, let older Windows tools
73
- # decode the filename correctly. However, this kills the filename decoding for the OSX builtin
74
- # archive utility (it assumes the filename to be UTF-8, regardless). So if we allow filenames
75
- # to be encoded in DOS-437, we _potentially_ have support in Windows but we upset everyone on Mac.
76
- # If we just use UTF-8 and set the right EFS bit in general purpose flags, we upset Windows users
77
- # because most of the Windows unarchive tools (at least the builtin ones) do not give a flying eff
78
- # about the EFS support bit being set.
79
- #
80
- # Additionally, if we use Unarchiver on OSX (which is our recommended unpacker for large files),
81
- # it will (very rightfully) ask us how we should decode each filename that does not have the EFS bit,
82
- # but does contain something non-ASCII-decodable. This is horrible UX for users.
83
- #
84
- # So, basically, we have 2 choices, for filenames containing diacritics (for bona-fide UTF-8 you do not
85
- # even get those choices, you _have_ to use UTF-8):
86
- #
87
- # * Make life easier for Windows users by setting stuff to DOS, not care about the standard _and_ make
88
- # most of Mac users upset
89
- # * Make life easy for Mac users and conform to the standard, and tell Windows users to get a _decent_
90
- # ZIP unarchiving tool.
91
- #
92
- # We are going with option 2, and this is well-thought-out. Trust me. If you want the crazytown
93
- # filename encoding scheme that is described here http://stackoverflow.com/questions/13261347
94
- # you can try this:
95
- #
96
- # [Encoding::CP437, Encoding::ISO_8859_1, Encoding::UTF_8]
97
- #
98
- # We don't want no such thing, and sorry Windows users, you are going to need a decent unarchiver
99
- # that honors the standard. Alas, alas.
100
64
  def gp_flags_based_on_filename
101
- filename.encode(Encoding::ASCII)
102
- 0b00000000000
103
- rescue EncodingError
104
- 0b00000000000 | 0b100000000000
65
+ @requires_efs_flag ? (0b00000000000 | 0b100000000000) : 0b00000000000
105
66
  end
106
67
 
107
68
  def write_local_file_header(io)
@@ -212,9 +173,16 @@ class ZipTricks::Microzip
212
173
  io << [extra_size].pack(C_v) # extra field length 2 bytes
213
174
 
214
175
  io << [0].pack(C_v) # file comment length 2 bytes
215
- io << [0].pack(C_v) # disk number start 2 bytes
216
- io << [0].pack(C_v) # internal file attributes 2 bytes
217
176
 
177
+ # For The Unarchiver < 3.11.1 this field has to be set to the overflow value if zip64 is used
178
+ # because otherwise it does not properly advance the pointer when reading the Zip64 extra field
179
+ # https://bitbucket.org/WAHa_06x36/theunarchiver/pull-requests/2/bug-fix-for-zip64-extra-field-parser/diff
180
+ if @requires_zip64
181
+ io << [TWO_BYTE_MAX_UINT].pack(C_v) # disk number start 2 bytes
182
+ else
183
+ io << [0].pack(C_v) # disk number start 2 bytes
184
+ end
185
+ io << [0].pack(C_v) # internal file attributes 2 bytes
218
186
  io << [DEFAULT_EXTERNAL_ATTRS].pack(C_V) # external file attributes 4 bytes
219
187
 
220
188
  if @requires_zip64
@@ -232,6 +200,10 @@ class ZipTricks::Microzip
232
200
 
233
201
  private
234
202
 
203
+ def bytesize_of
204
+ ''.force_encoding(Encoding::BINARY).tap {|b| yield(b) }.bytesize
205
+ end
206
+
235
207
  def to_binary_dos_time(t)
236
208
  (t.sec/2) + (t.min << 5) + (t.hour << 11)
237
209
  end
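As a worked example of the time packing above: the specs in this diff expect 28160 for the "last mod file time" and 18673 for the "last mod file date" of `Time.utc(2016, 7, 17, 13, 48)`. The sketch below reproduces both numbers. The date helper is not shown in this diff, so `to_binary_dos_date` here is an assumption that follows the standard MS-DOS date layout (day in bits 0-4, month in bits 5-8, years since 1980 in bits 9-15).

```ruby
# Same arithmetic as the method shown above: seconds are stored with a
# 2-second resolution in bits 0-4, minutes in bits 5-10, hours in bits 11-15.
def to_binary_dos_time(t)
  (t.sec / 2) + (t.min << 5) + (t.hour << 11)
end

# Assumed counterpart for the date field (standard MS-DOS date layout).
def to_binary_dos_date(t)
  t.day + (t.month << 5) + ((t.year - 1980) << 9)
end

mtime = Time.utc(2016, 7, 17, 13, 48)
dos_time = to_binary_dos_time(mtime) # (13 << 11) + (48 << 5) + 0
dos_date = to_binary_dos_date(mtime) # (36 << 9) + (7 << 5) + 17
```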
@@ -313,10 +285,10 @@ class ZipTricks::Microzip
313
285
  # offset of start of central
314
286
  # directory with respect to
315
287
  io << [start_of_central_directory].pack(C_Qe) # the starting disk number 8 bytes
316
- # zip64 extensible data sector (variable size)
288
+ # zip64 extensible data sector (variable size), blank for us
317
289
 
318
290
  # [zip64 end of central directory locator]
319
- io << [0x07064b50].pack("V") # zip64 end of central dir locator
291
+ io << [0x07064b50].pack(C_V) # zip64 end of central dir locator
320
292
  # signature 4 bytes (0x07064b50)
321
293
  io << [0].pack(C_V) # number of the disk with the
322
294
  # start of the zip64 end of
data/lib/zip_tricks.rb CHANGED
@@ -2,7 +2,7 @@ require 'zip'
2
2
  require 'very_tiny_state_machine'
3
3
 
4
4
  module ZipTricks
5
- VERSION = '2.8.0'
5
+ VERSION = '2.8.1'
6
6
 
7
7
  # Require all the sub-components except myself
8
8
  Dir.glob(__dir__ + '/**/*.rb').sort.each {|p| require p unless p == __FILE__ }
@@ -1,28 +1,35 @@
1
1
  require_relative '../spec_helper'
2
+ require_relative '../../testing/support'
2
3
 
3
4
  describe ZipTricks::Microzip do
4
5
  class ByteReader < Struct.new(:io)
5
6
  def read_2b
6
7
  io.read(2).unpack('v').first
7
8
  end
8
-
9
+
9
10
  def read_2c
10
11
  io.read(2).unpack('CC').first
11
12
  end
12
-
13
+
13
14
  def read_4b
14
15
  io.read(4).unpack('V').first
15
16
  end
16
-
17
+
17
18
  def read_8b
18
19
  io.read(8).unpack('Q<').first
19
20
  end
20
-
21
+
21
22
  def read_n(n)
22
23
  io.read(n)
23
24
  end
24
25
  end
25
-
26
+
27
+ class IOWrapper < ZipTricks::WriteAndTell
28
+ def read(n)
29
+ @io.read(n)
30
+ end
31
+ end
32
+
26
33
  it 'raises an exception if the filename is non-unique in the already existing set' do
27
34
  z = described_class.new
28
35
  z.add_local_file_header(io: StringIO.new, filename: 'foo.txt', crc32: 0, compressed_size: 0, uncompressed_size: 0, storage_mode: 0)
@@ -31,15 +38,23 @@ describe ZipTricks::Microzip do
31
38
  }.to raise_error(/already/)
32
39
  end
33
40
 
41
+ it 'raises an exception if the filename contains backward slashes' do
42
+ z = described_class.new
43
+ expect {
44
+ z.add_local_file_header(io: StringIO.new, filename: 'windows\not\welcome.txt',
45
+ crc32: 0, compressed_size: 0, uncompressed_size: 0, storage_mode: 0)
46
+ }.to raise_error(/UNIX/)
47
+ end
48
+
34
49
  it 'raises an exception if the filename does not fit in 0xFFFF bytes' do
35
50
  longest_filename_in_the_universe = "x" * (0xFFFF + 1)
36
51
  z = described_class.new
37
52
  expect {
38
- z.add_local_file_header(io: StringIO.new, filename: longest_filename_in_the_universe,
53
+ z.add_local_file_header(io: StringIO.new, filename: longest_filename_in_the_universe,
39
54
  crc32: 0, compressed_size: 0, uncompressed_size: 0, storage_mode: 0)
40
- }.to raise_error(/filename/)
55
+ }.to raise_error(/is too long/)
41
56
  end
42
-
57
+
43
58
  describe '#add_local_file_header' do
44
59
  it 'writes out the local file header for an entry that fits into a standard ZIP' do
45
60
  buf = StringIO.new
@@ -47,7 +62,7 @@ describe ZipTricks::Microzip do
47
62
  mtime = Time.utc(2016, 7, 17, 13, 48)
48
63
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 8981,
49
64
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
50
-
65
+
51
66
  buf.rewind
52
67
  br = ByteReader.new(buf)
53
68
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -64,28 +79,45 @@ describe ZipTricks::Microzip do
64
79
  expect(br.read_n('first-file.bin'.bytesize)).to eq('first-file.bin') # the filename
65
80
  expect(buf).to be_eof
66
81
  end
67
-
82
+
68
83
  it 'writes out the local file header for an entry with a UTF-8 filename, setting the proper GP flag bit' do
69
84
  buf = StringIO.new
70
85
  zip = described_class.new
71
86
  mtime = Time.utc(2016, 7, 17, 13, 48)
72
87
  zip.add_local_file_header(io: buf, filename: 'файл.bin', crc32: 123, compressed_size: 8981,
73
88
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
74
-
89
+
90
+ buf.rewind
91
+ br = ByteReader.new(buf)
92
+ br.read_4b # Signature
93
+ br.read_2b # Version needed to extract
94
+ expect(br.read_2b).to eq(2048) # gp flags
95
+ end
96
+
97
+ it "correctly recognizes UTF-8 filenames even if they are tagged as ASCII" do
98
+ name = 'файл.bin'
99
+ name.force_encoding(Encoding::US_ASCII)
100
+
101
+ buf = StringIO.new
102
+ zip = described_class.new
103
+ mtime = Time.utc(2016, 7, 17, 13, 48)
104
+ zip.add_local_file_header(io: buf, filename: name, crc32: 123, compressed_size: 8981,
105
+ uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
106
+
75
107
  buf.rewind
76
108
  br = ByteReader.new(buf)
77
109
  br.read_4b # Signature
78
110
  br.read_2b # Version needed to extract
79
111
  expect(br.read_2b).to eq(2048) # gp flags
80
112
  end
81
-
113
+
82
114
  it 'writes out the local file header for an entry with a filename with diacritics, setting the proper GP flag bit' do
83
115
  buf = StringIO.new
84
116
  zip = described_class.new
85
117
  mtime = Time.utc(2016, 7, 17, 13, 48)
86
118
  zip.add_local_file_header(io: buf, filename: 'Kungälv', crc32: 123, compressed_size: 8981,
87
119
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
88
-
120
+
89
121
  buf.rewind
90
122
  br = ByteReader.new(buf)
91
123
  br.read_4b # Signature
@@ -102,14 +134,14 @@ describe ZipTricks::Microzip do
102
134
  filename_readback = br.read_n('Kungälv'.bytesize)
103
135
  expect(filename_readback.force_encoding(Encoding::UTF_8)).to eq('Kungälv')
104
136
  end
105
-
137
+
106
138
  it 'writes out the local file header for an entry that requires Zip64 based on its compressed size _only_' do
107
139
  buf = StringIO.new
108
140
  zip = described_class.new
109
141
  mtime = Time.utc(2016, 7, 17, 13, 48)
110
142
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: (0xFFFFFFFF + 1),
111
143
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
112
-
144
+
113
145
  buf.rewind
114
146
  br = ByteReader.new(buf)
115
147
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -130,14 +162,14 @@ describe ZipTricks::Microzip do
130
162
  expect(br.read_8b).to eq(0xFFFFFFFF + 1) # True uncompressed size
131
163
  expect(buf).to be_eof
132
164
  end
133
-
165
+
134
166
  it 'writes out the local file header for an entry that requires Zip64 based on its uncompressed size _only_' do
135
167
  buf = StringIO.new
136
168
  zip = described_class.new
137
169
  mtime = Time.utc(2016, 7, 17, 13, 48)
138
170
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 90981,
139
171
  uncompressed_size: (0xFFFFFFFF + 1), storage_mode: 8, mtime: mtime)
140
-
172
+
141
173
  buf.rewind
142
174
  br = ByteReader.new(buf)
143
175
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -158,7 +190,7 @@ describe ZipTricks::Microzip do
158
190
  expect(br.read_8b).to eq(90981) # True compressed size
159
191
  expect(buf).to be_eof
160
192
  end
161
-
193
+
162
194
  it 'does not write out the Zip64 extra if the position in the destination IO is beyond the Zip64 size limit' do
163
195
  buf = StringIO.new
164
196
  zip = described_class.new
@@ -166,7 +198,7 @@ describe ZipTricks::Microzip do
166
198
  expect(buf).to receive(:tell).and_return(0xFFFFFFFF + 1)
167
199
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 123,
168
200
  uncompressed_size: 456, storage_mode: 8, mtime: mtime)
169
-
201
+
170
202
  buf.rewind
171
203
  br = ByteReader.new(buf)
172
204
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -182,14 +214,14 @@ describe ZipTricks::Microzip do
182
214
  expect(br.read_2b).to be_zero
183
215
  end
184
216
  end
185
-
217
+
186
218
  describe '#write_central_directory' do
187
- it 'can write the central directory and makes it a valid one even if there were no files' do
219
+ it 'writes the central directory and makes it a valid one even if there were no files' do
188
220
  buf = StringIO.new
189
-
221
+
190
222
  zip = described_class.new
191
223
  zip.write_central_directory(buf)
192
-
224
+
193
225
  buf.rewind
194
226
  br = ByteReader.new(buf)
195
227
  expect(br.read_4b).to eq(0x06054b50) # EOCD signature
@@ -202,35 +234,313 @@ describe ZipTricks::Microzip do
202
234
  expect(br.read_2b).to eq(0) # ZIP file comment length
203
235
  expect(buf).to be_eof
204
236
  end
205
-
237
+
206
238
  it 'writes the central directory for 2 files' do
207
239
  zip = described_class.new
208
-
240
+
209
241
  mtime = Time.utc(2016, 7, 17, 13, 48)
210
-
242
+
211
243
  buf = StringIO.new
212
244
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 5,
213
245
  uncompressed_size: 8, storage_mode: 8, mtime: mtime)
214
246
  buf << Random.new.bytes(5)
215
- zip.add_local_file_header(io: buf, filename: 'first-file.txt', crc32: 123, compressed_size: 9,
247
+ zip.add_local_file_header(io: buf, filename: 'second-file.txt', crc32: 546, compressed_size: 9,
216
248
  uncompressed_size: 9, storage_mode: 0, mtime: mtime)
217
249
  buf << Random.new.bytes(5)
218
-
250
+
219
251
  central_dir_offset = buf.tell
220
-
221
252
  zip.write_central_directory(buf)
222
-
253
+
223
254
  # Seek to where the central directory begins
224
255
  buf.rewind
225
256
  buf.seek(central_dir_offset)
226
-
257
+
227
258
  br = ByteReader.new(buf)
259
+
260
+ # Central directory entry for the first file
261
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
262
+ expect(br.read_2b).to eq(820) # version made by
263
+ expect(br.read_2b).to eq(20) # version need to extract
264
+ expect(br.read_2b).to eq(0) # general purpose bit flag
265
+ expect(br.read_2b).to eq(8) # compression method (deflated here)
266
+ expect(br.read_2b).to eq(28160) # last mod file time
267
+ expect(br.read_2b).to eq(18673) # last mod file date
268
+ expect(br.read_4b).to eq(123) # crc32
269
+ expect(br.read_4b).to eq(5) # compressed size
270
+ expect(br.read_4b).to eq(8) # uncompressed size
271
+ expect(br.read_2b).to eq(14) # filename length
272
+ expect(br.read_2b).to eq(0) # extra field length
273
+ expect(br.read_2b).to eq(0) # file comment
274
+ expect(br.read_2b).to eq(0) # disk number start, stays 0 because this entry does not require Zip64
275
+ expect(br.read_2b).to eq(0) # internal file attributes
276
+ expect(br.read_4b).to eq(2175008768) # external file attributes
277
+ expect(br.read_4b).to eq(0) # relative offset of local header
278
+ expect(br.read_n(14)).to eq('first-file.bin') # the filename
279
+
280
+ # Central directory entry for the second file
228
281
  expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
282
+ expect(br.read_2b).to eq(820) # version made by
283
+ expect(br.read_2b).to eq(20) # version need to extract
284
+ expect(br.read_2b).to eq(0) # general purpose bit flag
285
+ expect(br.read_2b).to eq(0) # compression method (stored here)
286
+ expect(br.read_2b).to eq(28160) # last mod file time
287
+ expect(br.read_2b).to eq(18673) # last mod file date
288
+ expect(br.read_4b).to eq(546) # crc32
289
+ expect(br.read_4b).to eq(9) # compressed size
290
+ expect(br.read_4b).to eq(9) # uncompressed size
291
+ expect(br.read_2b).to eq('second-file.txt'.bytesize) # filename length
292
+ expect(br.read_2b).to eq(0) # extra field length
293
+ expect(br.read_2b).to eq(0) # file comment
294
+ expect(br.read_2b).to eq(0) # disk number start, stays 0 because this entry does not require Zip64
295
+ expect(br.read_2b).to eq(0) # internal file attributes
296
+ expect(br.read_4b).to eq(2175008768) # external file attributes
297
+ expect(br.read_4b).to eq(49) # relative offset of local header
298
+ expect(br.read_n('second-file.txt'.bytesize)).to eq('second-file.txt') # the filename
299
+
300
+ expect(br.read_4b).to eq(0x06054b50) # end of central dir signature
301
+ br.read_2b
302
+ br.read_2b
303
+ br.read_2b
304
+ br.read_2b
305
+ br.read_4b
306
+ br.read_4b
307
+ br.read_2b
308
+
309
+ expect(buf).to be_eof
310
+ end
311
+
312
+ it 'writes the central directory for 1 file that is larger than 4GB' do
313
+ zip = described_class.new
314
+ buf = StringIO.new
315
+ big = 0xFFFFFFFF + 2048
316
+ mtime = Time.utc(2016, 7, 17, 13, 48)
317
+
318
+ zip.add_local_file_header(io: buf, filename: 'big-file.bin', crc32: 12345, compressed_size: big,
319
+ uncompressed_size: big, storage_mode: 0, mtime: mtime)
320
+
321
+ central_dir_offset = buf.tell
322
+
323
+ zip.write_central_directory(buf)
324
+
325
+ # Seek to where the central directory begins
326
+ buf.rewind
327
+ buf.seek(central_dir_offset)
328
+
329
+ br = ByteReader.new(buf)
330
+
331
+ # Standard central directory entry (similar to the local file header)
332
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
333
+ expect(br.read_2b).to eq(820) # version made by
334
+ expect(br.read_2b).to eq(45) # version need to extract (45 for Zip64)
335
+ expect(br.read_2b).to eq(0) # general purpose bit flag
336
+ expect(br.read_2b).to eq(0) # compression method (stored here)
337
+ expect(br.read_2b).to eq(28160) # last mod file time
338
+ expect(br.read_2b).to eq(18673) # last mod file date
339
+ expect(br.read_4b).to eq(12345) # crc32
340
+ expect(br.read_4b).to eq(0xFFFFFFFF) # compressed size
341
+ expect(br.read_4b).to eq(0xFFFFFFFF) # uncompressed size
342
+ expect(br.read_2b).to eq(12) # filename length
343
+ expect(br.read_2b).to eq(32) # extra field length (we store the Zip64 extra field for this file)
344
+ expect(br.read_2b).to eq(0) # file comment
345
+ expect(br.read_2b).to eq(0xFFFF) # disk number, must be blanked to the maximum value because of The Unarchiver bug
346
+ expect(br.read_2b).to eq(0) # internal file attributes
347
+ expect(br.read_4b).to eq(2175008768) # external file attributes
348
+ expect(br.read_4b).to eq(0xFFFFFFFF) # relative offset of local header
349
+ expect(br.read_n(12)).to eq('big-file.bin') # the filename
350
+
351
+ # Zip64 extra field
352
+ expect(br.read_2b).to eq(0x0001) # Tag for the "extra" block
353
+ expect(br.read_2b).to eq(28) # Size of this "extra" block. For us it will always be 28
354
+ expect(br.read_8b).to eq(big) # Original uncompressed file size
355
+ expect(br.read_8b).to eq(big) # Original compressed file size
356
+ expect(br.read_8b).to eq(0) # Offset of local header record
357
+ expect(br.read_4b).to eq(0) # Number of the disk on which this file starts
358
+ end
359
+
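The Zip64 extra field layout that the spec above reads back (2-byte tag, 2-byte payload size, three 8-byte values, one 4-byte disk number) can be sketched as a single `Array#pack` call. The helper name `zip64_extra_for_central_directory` and its keyword arguments are hypothetical; only the field order and sizes are taken from the expectations in the spec.

```ruby
# Hypothetical helper: builds the Zip64 extra field as it appears in a
# central directory entry. 2 + 2 + 8 + 8 + 8 + 4 = 32 bytes in total, of
# which 28 bytes are payload after the tag and size fields.
def zip64_extra_for_central_directory(uncompressed_size:, compressed_size:, local_header_offset:)
  [
    0x0001,              # tag identifying the Zip64 extra block
    28,                  # size of the payload that follows
    uncompressed_size,   # original uncompressed file size, 8 bytes
    compressed_size,     # compressed file size, 8 bytes
    local_header_offset, # offset of the local header record, 8 bytes
    0                    # number of the disk on which the file starts, 4 bytes
  ].pack('vvQ<Q<Q<V')
end

big = 0xFFFFFFFF + 2048
extra = zip64_extra_for_central_directory(uncompressed_size: big, compressed_size: big,
                                          local_header_offset: 0)
```

The total of 32 bytes matches the "extra field length" that the spec expects for the entry larger than 4GB.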
360
+ it 'writes the central directory for 2 files which, together, make the central directory start beyond the 4GB threshold' do
361
+ zip = described_class.new
362
+ raw_buf = StringIO.new
363
+
364
+ zip_write_buf = IOWrapper.new(raw_buf)
365
+ big1 = 0xFFFFFFFF/2 + 512
366
+ big2 = 0xFFFFFFFF/2 + 1024
367
+ mtime = Time.utc(2016, 7, 17, 13, 48)
368
+
369
+ zip.add_local_file_header(io: zip_write_buf, filename: 'first-big-file.bin', crc32: 12345, compressed_size: big1,
370
+ uncompressed_size: big1, storage_mode: 0, mtime: mtime)
371
+ zip_write_buf.advance_position_by(big1)
372
+
373
+ zip.add_local_file_header(io: zip_write_buf, filename: 'second-big-file.bin', crc32: 54321, compressed_size: big2,
374
+ uncompressed_size: big2, storage_mode: 0, mtime: mtime)
375
+ zip_write_buf.advance_position_by(big2)
376
+
377
+ fake_central_dir_offset = zip_write_buf.tell # Grab the position as tracked by the wrapper
378
+ actual_central_dir_offset = raw_buf.tell # Grab the position in the underlying buffer
379
+
380
+ zip.write_central_directory(zip_write_buf)
381
+
382
+ # Seek to where the central directory begins
383
+ raw_buf.seek(actual_central_dir_offset, IO::SEEK_SET)
384
+
385
+ br = ByteReader.new(raw_buf)
386
+
387
+ # Standard central directory entry (similar to the local file header)
388
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
389
+ expect(br.read_2b).to eq(820) # version made by
390
+ expect(br.read_2b).to eq(20) # version need to extract (20, since this entry itself does not require Zip64)
391
+ expect(br.read_2b).to eq(0) # general purpose bit flag
392
+ expect(br.read_2b).to eq(0) # compression method (stored here)
393
+ expect(br.read_2b).to eq(28160) # last mod file time
394
+ expect(br.read_2b).to eq(18673) # last mod file date
395
+ expect(br.read_4b).to eq(12345) # crc32
396
+ expect(br.read_4b).to eq(2147484159) # compressed size
397
+ expect(br.read_4b).to eq(2147484159) # uncompressed size
398
+ expect(br.read_2b).to eq(18) # filename length
399
+ expect(br.read_2b).to eq(0) # extra field length
400
+ expect(br.read_2b).to eq(0) # file comment length
401
+ expect(br.read_2b).to eq(0) # disk number start, stays 0 because this entry does not require Zip64
402
+ expect(br.read_2b).to eq(0) # internal file attributes
403
+ expect(br.read_4b).to eq(2175008768) # external file attributes
404
+ expect(br.read_4b).to eq(0) # relative offset of local header
405
+ expect(br.read_n(18)).to eq("first-big-file.bin") # the filename
406
+
407
+ # Standard central directory entry (similar to the local file header)
408
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
409
+ expect(br.read_2b).to eq(820) # version made by
410
+ expect(br.read_2b).to eq(20) # version need to extract (20, since this entry itself does not require Zip64)
411
+ expect(br.read_2b).to eq(0) # general purpose bit flag
412
+ expect(br.read_2b).to eq(0) # compression method (stored here)
413
+ expect(br.read_2b).to eq(28160) # last mod file time
414
+ expect(br.read_2b).to eq(18673) # last mod file date
415
+ expect(br.read_4b).to eq(54321) # crc32
416
+ expect(br.read_4b).to eq(2147484671) # compressed size
417
+ expect(br.read_4b).to eq(2147484671) # uncompressed size
418
+ expect(br.read_2b).to eq(19) # filename length
419
+ expect(br.read_2b).to eq(0) # extra field length
420
+ expect(br.read_2b).to eq(0) # file comment length
421
+ expect(br.read_2b).to eq(0) # disk number start, stays 0 because this entry does not require Zip64
422
+ expect(br.read_2b).to eq(0) # internal file attributes
423
+ expect(br.read_4b).to eq(2175008768) # external file attributes
424
+ expect(br.read_4b).to eq(2147484207) # relative offset of local header
425
+ expect(br.read_n(19)).to eq('second-big-file.bin') # the filename
426
+
427
+ # zip64 specific values for a whole central directory
428
+ expect(br.read_4b).to eq(0x06064b50) # zip64 end of central dir signature
429
+ expect(br.read_8b).to eq(44) # size of zip64 end of central directory record
430
+ expect(br.read_2b).to eq(820) # version made by
431
+ expect(br.read_2b).to eq(45) # version need to extract
432
+ expect(br.read_4b).to eq(0) # number of this disk
433
+ expect(br.read_4b).to eq(0) # number of the disk with the start of the central directory
434
+ expect(br.read_8b).to eq(2) # total number of entries in the central directory on this disk
435
+ expect(br.read_8b).to eq(2) # total number of entries in the central directory
436
+ expect(br.read_8b).to eq(129) # size of central directory
437
+ expect(br.read_8b).to eq(4294968927) # offset of the start of the central directory
438
+ expect(br.read_4b).to eq(0x07064b50) # zip64 end of central dir locator signature
439
+ expect(br.read_4b).to eq(0) # number of disk ...
440
+ expect(br.read_8b).to eq(4294969056) # relative offset of the zip64 end of central directory record
441
+ expect(br.read_4b).to eq(1) # total number of disks
442
+ end
443
+
444
+ it 'writes the central directory for 3 files, the third of which will require the Zip64 extra since it is past the 4GB offset' do
445
+ zip = described_class.new
446
+ raw_buf = StringIO.new
447
+
448
+ zip_write_buf = IOWrapper.new(raw_buf)
449
+ big1 = 0xFFFFFFFF/2 + 512
450
+ big2 = 0xFFFFFFFF/2 + 1024
451
+ big3 = 0xFFFFFFFF/2 + 1024
452
+ mtime = Time.utc(2016, 7, 17, 13, 48)
453
+
454
+ zip.add_local_file_header(io: zip_write_buf, filename: 'one', crc32: 12345, compressed_size: big1,
455
+ uncompressed_size: big1, storage_mode: 0, mtime: mtime)
456
+ zip_write_buf.advance_position_by(big1)
457
+
458
+ zip.add_local_file_header(io: zip_write_buf, filename: 'two', crc32: 54321, compressed_size: big2,
459
+ uncompressed_size: big2, storage_mode: 0, mtime: mtime)
460
+ zip_write_buf.advance_position_by(big2)
461
+
462
+ big3_offset = zip_write_buf.tell
463
+
464
+ zip.add_local_file_header(io: zip_write_buf, filename: 'three', crc32: 54321, compressed_size: big3,
465
+ uncompressed_size: big3, storage_mode: 0, mtime: mtime)
466
+ zip_write_buf.advance_position_by(big3)
467
+
468
+ fake_central_dir_offset = zip_write_buf.tell # Grab the position reported by the IOWrapper (counts the advanced-over bytes)
469
+ actual_central_dir_offset = raw_buf.tell # Grab the position in the underlying buffer
470
+
471
+ zip.write_central_directory(zip_write_buf)
472
+
473
+ # Seek to where the central directory begins
474
+ raw_buf.seek(actual_central_dir_offset, IO::SEEK_SET)
475
+
476
+ br = ByteReader.new(raw_buf)
477
+
478
+ # Standard central directory entry (similar to the local file header)
479
+ # Skip over two entries, because the other example has a 1-to-1 repeat of this
480
+ 2.times {
481
+ br.read_4b
482
+ br.read_2b
483
+ br.read_2b
484
+ br.read_2b
485
+ br.read_2b
486
+ br.read_2b
487
+ br.read_2b
488
+ br.read_4b
489
+ br.read_4b
490
+ br.read_4b
491
+ br.read_2b
492
+ br.read_2b
493
+ br.read_2b
494
+ br.read_2b
495
+ br.read_2b
496
+ br.read_4b
497
+ br.read_4b
498
+ br.read_n(3)
499
+ }
500
+
501
+ # Entry for the third file DOES bear the Zip64 extra field
502
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
503
+ expect(br.read_2b).to eq(820) # version made by
504
+ expect(br.read_2b).to eq(45) # version needed to extract (45 for Zip64) - this entry requires it
505
+ expect(br.read_2b).to eq(0) # general purpose bit flag
506
+ expect(br.read_2b).to eq(0) # compression method (stored here)
507
+ expect(br.read_2b).to eq(28160) # last mod file time
508
+ expect(br.read_2b).to eq(18673) # last mod file date
509
+ expect(br.read_4b).to eq(54321) # crc32
510
+ expect(br.read_4b).to eq(0xFFFFFFFF) # compressed size - blanked for Zip64
511
+ expect(br.read_4b).to eq(0xFFFFFFFF) # uncompressed size - blanked for Zip64
512
+ expect(br.read_2b).to eq(5) # filename length
513
+ expect(br.read_2b).to eq(32) # extra field length (length of the Zip64 extra)
514
+ expect(br.read_2b).to eq(0) # file comment length
515
+ expect(br.read_2b).to eq(0xFFFF) # disk number, with Zip64 must be blanked to the maximum value because of The Unarchiver bug
516
+ expect(br.read_2b).to eq(0) # internal file attributes
517
+ expect(br.read_4b).to eq(2175008768) # external file attributes
518
+ expect(br.read_4b).to eq(4294967295) # relative offset of local header
519
+ expect(br.read_n(5)).to eq('three') # the filename
520
+ # then the Zip64 extra for that last file _only_
521
+ expect(br.read_2b).to eq(0x0001) # Tag for the "extra" block
522
+ expect(br.read_2b).to eq(28) # Size of this "extra" block. For us it will always be 28
523
+ expect(br.read_8b).to eq(big3) # Original uncompressed file size
524
+ expect(br.read_8b).to eq(big3) # Original compressed file size
525
+ expect(br.read_8b).to eq(big3_offset) # Offset of local header record
526
+ expect(br.read_4b).to eq(0) # Number of the disk on which this file starts
527
+
528
+ # zip64 specific values for a whole central directory
529
+ expect(br.read_4b).to eq(0x06064b50) # zip64 end of central dir signature
530
+ expect(br.read_8b).to eq(44) # size of zip64 end of central directory record
531
+ expect(br.read_2b).to eq(820) # version made by
532
+ expect(br.read_2b).to eq(45) # version needed to extract
533
+ expect(br.read_4b).to eq(0) # number of this disk
534
+ expect(br.read_4b).to eq(0) # number of the disk with the start of the central directory
535
+ expect(br.read_8b).to eq(3) # total number of entries in the central directory on this disk
536
+ expect(br.read_8b).to eq(3) # total number of entries in the central directory
537
+ expect(br.read_8b).to eq(181) # size of central directory
538
+ expect(br.read_8b).to eq(6442453602) # central directory offset from start of disk
229
539
 
230
- skip "Not finished"
540
+ expect(br.read_4b).to eq(0x07064b50) # Zip64 EOCD locator signature
541
+ expect(br.read_4b).to eq(0) # Number of the disk with the start of the Zip64 end of central directory record
542
+ expect(br.read_8b).to eq(6442453783) # relative offset of the zip64 end of central directory record
543
+ expect(br.read_4b).to eq(1) # total number of disks
231
544
  end
232
-
233
- it 'writes the central directory 1 file that is larger than 4GB'
234
- it 'writes the central directory for 2 files which, together, make the central directory start beyound the 4GB threshold'
235
545
  end
236
546
  end
data/zip_tricks.gemspec CHANGED
@@ -2,16 +2,16 @@
2
2
  # DO NOT EDIT THIS FILE DIRECTLY
3
3
  # Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
4
4
  # -*- encoding: utf-8 -*-
5
- # stub: zip_tricks 2.8.0 ruby lib
5
+ # stub: zip_tricks 2.8.1 ruby lib
6
6
 
7
7
  Gem::Specification.new do |s|
8
8
  s.name = "zip_tricks"
9
- s.version = "2.8.0"
9
+ s.version = "2.8.1"
10
10
 
11
11
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
12
12
  s.require_paths = ["lib"]
13
13
  s.authors = ["Julik Tarkhanov"]
14
- s.date = "2016-07-18"
14
+ s.date = "2016-07-22"
15
15
  s.description = "Makes rubyzip stream, for real"
16
16
  s.email = "me@julik.nl"
17
17
  s.extra_rdoc_files = [
@@ -24,6 +24,7 @@ Gem::Specification.new do |s|
24
24
  ".travis.yml",
25
25
  ".yardopts",
26
26
  "Gemfile",
27
+ "IMPLEMENTATION_DETAILS.md",
27
28
  "LICENSE.txt",
28
29
  "README.md",
29
30
  "Rakefile",
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: zip_tricks
3
3
  version: !ruby/object:Gem::Version
4
- version: 2.8.0
4
+ version: 2.8.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Julik Tarkhanov
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2016-07-18 00:00:00.000000000 Z
11
+ date: 2016-07-22 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rubyzip
@@ -161,6 +161,7 @@ files:
161
161
  - ".travis.yml"
162
162
  - ".yardopts"
163
163
  - Gemfile
164
+ - IMPLEMENTATION_DETAILS.md
164
165
  - LICENSE.txt
165
166
  - README.md
166
167
  - Rakefile