zip_tricks 2.8.0 → 2.8.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 84302e70bc873418b8a458456a75e0da7b5bc67d
- data.tar.gz: 06f81fc860d5cf77fdbd4b570602ae4984297d43
+ metadata.gz: 05fb68904433a8f06dd50d668c29d3c0cf8780f2
+ data.tar.gz: af0dfe83697cc9cd6bb4ed1a8714cffbb3ceedd7
  SHA512:
- metadata.gz: 932fe6e3a095f43996505c642fce11009efba4298302e958f5fdf7649cd3ca9fd28ab1de6a33d1e20be2955a7efd58b4d432486d3004ab70437a55413c27934d
- data.tar.gz: b9d43dbf16156cd3c877a8353dd0dd4a34e4c8b2e43664e34c4c3f2562c3a9937390bf398b0278449bf338f89337c348f30da637414e61b17eab58a939b05fc5
+ metadata.gz: 4b583b966eb87b502f428bd43d7bd9d9c800112bb3c256aa63a8cda6dcfb02052cf653de2df4022e23f49db37e157d81d336b0db8c8527c88f6e2de2e8974172
+ data.tar.gz: 22f263d209256d28c552c6fe04b9981d1181eb7f062233f7bc407146987faf2d1f14db9b8902ddf89d4038680839971291c5e47d8b57cbc8ed248a9360bf431b
@@ -0,0 +1,106 @@
1
+ # Implementation details
2
+
3
+ The ZipTricks streaming implementation is designed around the following requirements:
4
+
5
+ * Only ahead-writes (no IO seek or rewind)
6
+ * Automatic switching to Zip64 as the files get written (no IO seeks), but not requiring Zip64 support if the archive can do without
7
+ * Make use of the fact that CRC32 checksums and the sizes of the files (compressed _and_ uncompressed) are known upfront
8
+
9
+ It strives to be compatible with the following unzip programs _at the minimum:_
10
+
11
+ * OSX - builtin ArchiveUtility (except the Zip64 support when files larger than 4GB are in the archive)
12
+ * OSX - The Unarchiver, at least 3.10.1
13
+ * Windows 7 - built-in Explorer zip browser (except for Unicode filenames which it just doesn't support)
14
+ * Windows 7 - 7Zip 9.20
15
+
16
+ Below is the list of _specific_ decisions taken when writing the implementation, with an explanation for each.
17
+ We specifically _omit_ a number of things that we could do, but that are not necessary to satisfy our objectives.
18
+ The omissions are _intentional_ since we do not want to have things that we merely _assume_ work, or things
19
+ that work only for one obscure unarchiver in one obscure case (like WinRAR with Chinese filenames).
20
+
21
+ ## Data descriptors (postfix CRC32/file sizes)
22
+
23
+ Data descriptors permit you to generate "postfix" ZIP files (where you write the local file header without having to
24
+ know the CRC32 and the file size upfront, then write the compressed file data, and only then - once you know what your CRC32,
25
+ compressed and uncompressed sizes are - write them into a data descriptor that follows the file data).
26
+
27
+ The streamer does _not_ use data descriptors, because their use [is problematic](https://github.com/thejoshwolfe/yazl/issues/13)
28
+ with the 7Zip version that we want to support. Or rather - not the use of data descriptors themselves, but the use of the GP flag
29
+ bit 3 that trips up that version of 7Zip. If we were to use data descriptors, we would have to up the minimum supported version
30
+ of 7Zip.
31
+
32
+ That means, in turn, that **to use the ZipTricks streamer you have to know the CRC32 and the sizes of the compressed/uncompressed
33
+ file upfront.** So you have to precompute them in some way. To do that, you can use `BlockDeflate` to precompress the file in
34
+ parallel, and `StreamCRC32` to compute the CRC checksum, before feeding them to the ZIP writer.
35
+
36
+ This approach might be reconsidered in the future.
37
+
38
+ For more info see https://github.com/thejoshwolfe/yazl#general-purpose-bit-flag
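The precomputation step described in this section can be sketched with Ruby's standard `Zlib` module. This is only an illustration under stated assumptions: `precompute_entry_stats` is a hypothetical helper, not part of ZipTricks, and it uses `Zlib` directly instead of `BlockDeflate` and `StreamCRC32`, whose exact APIs are not shown here.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not a ZipTricks API): reads the IO once, in chunks, and
# returns the three values that have to be known before the local file header
# can be written.
def precompute_entry_stats(io, chunk_size: 64 * 1024)
  crc = Zlib.crc32 # initial CRC32 value
  uncompressed_size = 0
  compressed_size = 0
  # Negative window bits select a raw deflate stream without the zlib header,
  # which is the form that gets stored inside a ZIP entry.
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  while (chunk = io.read(chunk_size))
    crc = Zlib.crc32(chunk, crc)
    uncompressed_size += chunk.bytesize
    compressed_size += deflater.deflate(chunk).bytesize
  end
  compressed_size += deflater.finish.bytesize
  deflater.close
  { crc32: crc, compressed_size: compressed_size, uncompressed_size: uncompressed_size }
end
```

The compressed size is only valid if the same compressed bytes are later written to the archive, so in practice the deflated output would be kept (for example in a tempfile) rather than discarded as it is in this sketch.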
39
+
40
+ ## Zip64 support
41
+
42
+ Zip64 support switches on _by itself_, automatically, when _any_ of the following conditions is met:
43
+
44
+ * The start of the central directory lies beyond the 4GB limit
45
+ * The ZIP archive has more than 65535 files added to it
46
+ * Any entry is present whose compressed _or_ uncompressed size is above 4GB
47
+
48
+ When writing out local file headers, the Zip64 extra field is written (and the related changes to the standard fields are
49
+ made) _only_ if one of the file sizes is larger than 4GB. Otherwise the Zip64 extra will _only_ be
50
+ written in the central directory entry, but not in the local file header.
51
+
52
+ This has to do with the fact that otherwise we would write Zip64 extra fields for all local file headers,
53
+ regardless of whether the file actually requires Zip64 or not. That might impede some older tools from reading
54
+ the archive, which is a problem you don't want to have if your archive otherwise fits perfectly below all
55
+ the Zip64 thresholds.
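The switching rules above can be sketched as two small predicates. The constant names mirror the ones used in `Microzip`, but their values and the `EntrySketch` struct are assumptions made for this illustration.

```ruby
# Assumed values for the limits referenced in the text
FOUR_BYTE_MAX_UINT = 0xFFFFFFFF
TWO_BYTE_MAX_UINT = 0xFFFF

# Hypothetical stand-in for an archive entry
EntrySketch = Struct.new(:compressed_size, :uncompressed_size)

# A local file header only gets the Zip64 extra if one of the entry's own sizes overflows
def entry_requires_zip64?(entry)
  entry.compressed_size > FOUR_BYTE_MAX_UINT || entry.uncompressed_size > FOUR_BYTE_MAX_UINT
end

# The archive as a whole switches to Zip64 if any of the three conditions holds
def archive_requires_zip64?(entries, central_directory_offset)
  central_directory_offset > FOUR_BYTE_MAX_UINT ||
    entries.length > TWO_BYTE_MAX_UINT ||
    entries.any? { |entry| entry_requires_zip64?(entry) }
end
```

Keeping the per-entry check separate from the archive-wide check is what lets an archive that only crosses the 4GB offset still carry plain local file headers for its small entries.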
56
+
57
+ To be compatible with Windows 7 built-in tools, the Zip64 extra field _must_ be written as _the first_ extra
58
+ field; any other extra fields should come after.
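As an illustration of the ordering rule, here is a minimal sketch of packing the Zip64 extra field for a central directory entry and placing it ahead of any other extras. The method name and the `other_extras` variable are hypothetical; the field layout (tag `0x0001`, a 28-byte payload with two sizes, the local header offset and a disk number) follows what the specs in this diff read back.

```ruby
# Hypothetical helper: builds the 32-byte Zip64 extra field for a central directory entry
def zip64_extra_for_central_directory(uncompressed_size:, compressed_size:, local_header_offset:)
  [
    0x0001,              # tag of the Zip64 extra block, 2 bytes
    28,                  # size of the payload that follows (8 + 8 + 8 + 4), 2 bytes
    uncompressed_size,   # 8 bytes
    compressed_size,     # 8 bytes
    local_header_offset, # 8 bytes
    0                    # number of the disk on which the file starts, 4 bytes
  ].pack('vvQ<Q<Q<V')
end

big = 0xFFFFFFFF + 2048
zip64_extra = zip64_extra_for_central_directory(
  uncompressed_size: big, compressed_size: big, local_header_offset: 0
)
other_extras = ''.b # any further extra fields would be appended here
extra_fields = zip64_extra + other_extras # Zip64 extra always comes first
```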
59
+
60
+ ## International filename support and the Info-ZIP extra field
61
+
62
+ If a diacritic-containing character (such as å) does fit into the DOS-437
63
+ codepage, it should be encodable as such. This would, in theory, let older Windows tools
64
+ decode the filename correctly. However, this kills the filename decoding for the OSX builtin
65
+ archive utility (it assumes the filename to be UTF-8, regardless). So if we allow filenames
66
+ to be encoded in DOS-437, we _potentially_ have support in Windows but we upset everyone on Mac.
67
+ If we just use UTF-8 and set the right EFS bit in general purpose flags, we upset Windows users
68
+ because most of the Windows unarchive tools (at least the builtin ones) do not give a flying eff
69
+ about the EFS support bit being set.
70
+
71
+ Additionally, if we use Unarchiver on OSX (which is our recommended unpacker for large files),
72
+ it will (very rightfully) ask us how we should decode each filename that does not have the EFS bit,
73
+ but does contain something non-ASCII-decodable. This is horrible UX for users.
74
+
75
+ So, basically, we have 2 choices, for filenames containing diacritics (for bona-fide UTF-8 you do not
76
+ even get those choices, you _have_ to use UTF-8):
77
+
78
+ * Make life easier for Windows users by setting stuff to DOS, not care about the standard _and_ make
79
+ most Mac users upset
80
+ * Make life easy for Mac users and conform to the standard, and tell Windows users to get a _decent_
81
+ ZIP unarchiving tool.
82
+
83
+ We are going with option 2, and this is well-thought-out. Trust me. If you want the crazytown
84
+ filename encoding scheme that is described here http://stackoverflow.com/questions/13261347
85
+ you can try this:
86
+
87
+ [Encoding::CP437, Encoding::ISO_8859_1, Encoding::UTF_8]
88
+
89
+ We don't want no such thing, and sorry Windows users, you are going to need a decent unarchiver
90
+ that honors the standard. Alas, alas.
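In practice the decision comes down to one check: if the filename cannot be represented in plain ASCII, it is written as UTF-8 and the EFS bit (bit 11) is set in the general purpose flags. A minimal sketch, where `gp_flags_for` and `EFS_FLAG` are names made up for this example:

```ruby
EFS_FLAG = 0b100000000000 # bit 11 of the general purpose flags (2048)

# Hypothetical helper: returns the general purpose flags for a filename
def gp_flags_for(filename)
  # Treat the bytes as UTF-8 regardless of how the string was tagged by the caller
  name = filename.dup.force_encoding(Encoding::UTF_8)
  name.encode(Encoding::US_ASCII) # raises if anything is outside ASCII
  0
rescue EncodingError
  EFS_FLAG
end

gp_flags_for('first-file.bin') # => 0
gp_flags_for('Kungälv')        # => 2048
```

The string is duplicated before `force_encoding` so that the caller's string is not retagged as a side effect.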
91
+
92
+ Additionally, the tests with the unarchivers we _do_ support have shown that including the InfoZIP
93
+ extra field does not actually help any of them recognize the file name correctly. And the use of
94
+ those fields for the UTF-8 filename, per spec, tells us we should not set the EFS bit - which ruins
95
+ the unarchiving for all other solutions. Like any other decision, this one may be changed in the future.
96
+
97
+ There are some interesting notes about the Info-ZIP/EFS combination here
98
+ https://commons.apache.org/proper/commons-compress/zip.html
99
+
100
+ ## Directory support
101
+
102
+ ZIP makes it possible to store empty directories (folders). For our purposes, however, we are going
103
+ to store only the files. If you store a file, called, say, `docs/item.doc` then the unarchiver will
104
+ automatically create the `docs` directory if it doesn't exist already. You can also store an entry
105
+ with a length of 0 and set its external attributes to be an empty directory, but we do not need
106
+ that functionality - so it is also omitted.
@@ -13,6 +13,7 @@ class ZipTricks::Microzip
13
13
  DEFLATED = 8
14
14
 
15
15
  TooMuch = Class.new(StandardError)
16
+ PathError = Class.new(StandardError)
16
17
  DuplicateFilenames = Class.new(StandardError)
17
18
  UnknownMode = Class.new(StandardError)
18
19
 
@@ -42,21 +43,14 @@ class ZipTricks::Microzip
42
43
  C_v = 'v'.freeze
43
44
  C_Qe = 'Q<'.freeze
44
45
 
45
- module Bytesize
46
- def bytesize_of
47
- ''.force_encoding(Encoding::BINARY).tap {|b| yield(b) }.bytesize
48
- end
49
- end
50
- include Bytesize
51
-
52
46
  class Entry < Struct.new(:filename, :crc32, :compressed_size, :uncompressed_size, :storage_mode, :mtime)
53
- include Bytesize
54
47
  def initialize(*)
55
48
  super
49
+ filename.force_encoding(Encoding::UTF_8)
50
+ @requires_efs_flag = !(filename.encode(Encoding::ASCII) rescue false)
56
51
  @requires_zip64 = (compressed_size > FOUR_BYTE_MAX_UINT || uncompressed_size > FOUR_BYTE_MAX_UINT)
57
- if filename.bytesize > TWO_BYTE_MAX_UINT
58
- raise TooMuch, "The given filename is too long to fit (%d bytes)" % filename.bytesize
59
- end
52
+ raise TooMuch, "Filename is too long" if filename.bytesize > TWO_BYTE_MAX_UINT
53
+ raise PathError, "Paths in ZIP may only contain forward slashes (UNIX separators)" if filename.include?('\\')
60
54
  end
61
55
 
62
56
  def requires_zip64?
@@ -67,41 +61,8 @@ class ZipTricks::Microzip
67
61
  # bit (bit 11) which should be set if the filename is UTF8. If it is, we need to set the
68
62
  # bit so that the unarchiving application knows that the filename in the archive is UTF-8
69
63
  # encoded, and not some DOS default. For ASCII entries it does not matter.
70
- #
71
- # Now, strictly speaking, if a diacritic-containing character (such as å) does fit into the DOS-437
72
- # codepage, it should be encodable as such. This would, in theory, let older Windows tools
73
- # decode the filename correctly. However, this kills the filename decoding for the OSX builtin
74
- # archive utility (it assumes the filename to be UTF-8, regardless). So if we allow filenames
75
- # to be encoded in DOS-437, we _potentially_ have support in Windows but we upset everyone on Mac.
76
- # If we just use UTF-8 and set the right EFS bit in general purpose flags, we upset Windows users
77
- # because most of the Windows unarchive tools (at least the builtin ones) do not give a flying eff
78
- # about the EFS support bit being set.
79
- #
80
- # Additionally, if we use Unarchiver on OSX (which is our recommended unpacker for large files),
81
- # it will (very rightfully) ask us how we should decode each filename that does not have the EFS bit,
82
- # but does contain something non-ASCII-decodable. This is horrible UX for users.
83
- #
84
- # So, basically, we have 2 choices, for filenames containing diacritics (for bona-fide UTF-8 you do not
85
- # even get those choices, you _have_ to use UTF-8):
86
- #
87
- # * Make life easier for Windows users by setting stuff to DOS, not care about the standard _and_ make
88
- # most of Mac users upset
89
- # * Make life easy for Mac users and conform to the standard, and tell Windows users to get a _decent_
90
- # ZIP unarchiving tool.
91
- #
92
- # We are going with option 2, and this is well-thought-out. Trust me. If you want the crazytown
93
- # filename encoding scheme that is described here http://stackoverflow.com/questions/13261347
94
- # you can try this:
95
- #
96
- # [Encoding::CP437, Encoding::ISO_8859_1, Encoding::UTF_8]
97
- #
98
- # We don't want no such thing, and sorry Windows users, you are going to need a decent unarchiver
99
- # that honors the standard. Alas, alas.
100
64
  def gp_flags_based_on_filename
101
- filename.encode(Encoding::ASCII)
102
- 0b00000000000
103
- rescue EncodingError
104
- 0b00000000000 | 0b100000000000
65
+ @requires_efs_flag ? (0b00000000000 | 0b100000000000) : 0b00000000000
105
66
  end
106
67
 
107
68
  def write_local_file_header(io)
@@ -212,9 +173,16 @@ class ZipTricks::Microzip
212
173
  io << [extra_size].pack(C_v) # extra field length 2 bytes
213
174
 
214
175
  io << [0].pack(C_v) # file comment length 2 bytes
215
- io << [0].pack(C_v) # disk number start 2 bytes
216
- io << [0].pack(C_v) # internal file attributes 2 bytes
217
176
 
177
+ # For The Unarchiver < 3.11.1 this field has to be set to the overflow value if zip64 is used
178
+ # because otherwise it does not properly advance the pointer when reading the Zip64 extra field
179
+ # https://bitbucket.org/WAHa_06x36/theunarchiver/pull-requests/2/bug-fix-for-zip64-extra-field-parser/diff
180
+ if @requires_zip64
181
+ io << [TWO_BYTE_MAX_UINT].pack(C_v) # disk number start 2 bytes
182
+ else
183
+ io << [0].pack(C_v) # disk number start 2 bytes
184
+ end
185
+ io << [0].pack(C_v) # internal file attributes 2 bytes
218
186
  io << [DEFAULT_EXTERNAL_ATTRS].pack(C_V) # external file attributes 4 bytes
219
187
 
220
188
  if @requires_zip64
@@ -232,6 +200,10 @@ class ZipTricks::Microzip
232
200
 
233
201
  private
234
202
 
203
+ def bytesize_of
204
+ ''.force_encoding(Encoding::BINARY).tap {|b| yield(b) }.bytesize
205
+ end
206
+
235
207
  def to_binary_dos_time(t)
236
208
  (t.sec/2) + (t.min << 5) + (t.hour << 11)
237
209
  end
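A worked example of the DOS timestamp packing may help when reading the specs further down, which expect `28160` and `18673` for the modification time and date. `to_binary_dos_time` is copied from the hunk above; `to_binary_dos_date` is not shown in this diff, so the version here is an assumption based on the standard MS-DOS date layout (day, month shifted by 5 bits, years since 1980 shifted by 9 bits).

```ruby
# Seconds are stored with 2-second resolution in the low 5 bits,
# minutes in the next 6 bits, hours in the top 5 bits.
def to_binary_dos_time(t)
  (t.sec / 2) + (t.min << 5) + (t.hour << 11)
end

# Assumed counterpart for the date part (standard MS-DOS layout)
def to_binary_dos_date(t)
  t.day + (t.month << 5) + ((t.year - 1980) << 9)
end

mtime = Time.utc(2016, 7, 17, 13, 48)
to_binary_dos_time(mtime) # => 28160 (0 + (48 << 5) + (13 << 11) = 0 + 1536 + 26624)
to_binary_dos_date(mtime) # => 18673 (17 + (7 << 5) + (36 << 9) = 17 + 224 + 18432)
```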
@@ -313,10 +285,10 @@ class ZipTricks::Microzip
313
285
  # offset of start of central
314
286
  # directory with respect to
315
287
  io << [start_of_central_directory].pack(C_Qe) # the starting disk number 8 bytes
316
- # zip64 extensible data sector (variable size)
288
+ # zip64 extensible data sector (variable size), blank for us
317
289
 
318
290
  # [zip64 end of central directory locator]
319
- io << [0x07064b50].pack("V") # zip64 end of central dir locator
291
+ io << [0x07064b50].pack(C_V) # zip64 end of central dir locator
320
292
  # signature 4 bytes (0x07064b50)
321
293
  io << [0].pack(C_V) # number of the disk with the
322
294
  # start of the zip64 end of
data/lib/zip_tricks.rb CHANGED
@@ -2,7 +2,7 @@ require 'zip'
2
2
  require 'very_tiny_state_machine'
3
3
 
4
4
  module ZipTricks
5
- VERSION = '2.8.0'
5
+ VERSION = '2.8.1'
6
6
 
7
7
  # Require all the sub-components except myself
8
8
  Dir.glob(__dir__ + '/**/*.rb').sort.each {|p| require p unless p == __FILE__ }
@@ -1,28 +1,35 @@
1
1
  require_relative '../spec_helper'
2
+ require_relative '../../testing/support'
2
3
 
3
4
  describe ZipTricks::Microzip do
4
5
  class ByteReader < Struct.new(:io)
5
6
  def read_2b
6
7
  io.read(2).unpack('v').first
7
8
  end
8
-
9
+
9
10
  def read_2c
10
11
  io.read(2).unpack('CC').first
11
12
  end
12
-
13
+
13
14
  def read_4b
14
15
  io.read(4).unpack('V').first
15
16
  end
16
-
17
+
17
18
  def read_8b
18
19
  io.read(8).unpack('Q<').first
19
20
  end
20
-
21
+
21
22
  def read_n(n)
22
23
  io.read(n)
23
24
  end
24
25
  end
25
-
26
+
27
+ class IOWrapper < ZipTricks::WriteAndTell
28
+ def read(n)
29
+ @io.read(n)
30
+ end
31
+ end
32
+
26
33
  it 'raises an exception if the filename is non-unique in the already existing set' do
27
34
  z = described_class.new
28
35
  z.add_local_file_header(io: StringIO.new, filename: 'foo.txt', crc32: 0, compressed_size: 0, uncompressed_size: 0, storage_mode: 0)
@@ -31,15 +38,23 @@ describe ZipTricks::Microzip do
31
38
  }.to raise_error(/already/)
32
39
  end
33
40
 
41
+ it 'raises an exception if the filename contains backward slashes' do
42
+ z = described_class.new
43
+ expect {
44
+ z.add_local_file_header(io: StringIO.new, filename: 'windows\not\welcome.txt',
45
+ crc32: 0, compressed_size: 0, uncompressed_size: 0, storage_mode: 0)
46
+ }.to raise_error(/UNIX/)
47
+ end
48
+
34
49
  it 'raises an exception if the filename does not fit in 0xFFFF bytes' do
35
50
  longest_filename_in_the_universe = "x" * (0xFFFF + 1)
36
51
  z = described_class.new
37
52
  expect {
38
- z.add_local_file_header(io: StringIO.new, filename: longest_filename_in_the_universe,
53
+ z.add_local_file_header(io: StringIO.new, filename: longest_filename_in_the_universe,
39
54
  crc32: 0, compressed_size: 0, uncompressed_size: 0, storage_mode: 0)
40
- }.to raise_error(/filename/)
55
+ }.to raise_error(/is too long/)
41
56
  end
42
-
57
+
43
58
  describe '#add_local_file_header' do
44
59
  it 'writes out the local file header for an entry that fits into a standard ZIP' do
45
60
  buf = StringIO.new
@@ -47,7 +62,7 @@ describe ZipTricks::Microzip do
47
62
  mtime = Time.utc(2016, 7, 17, 13, 48)
48
63
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 8981,
49
64
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
50
-
65
+
51
66
  buf.rewind
52
67
  br = ByteReader.new(buf)
53
68
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -64,28 +79,45 @@ describe ZipTricks::Microzip do
64
79
  expect(br.read_n('first-file.bin'.bytesize)).to eq('first-file.bin') # the filename
65
80
  expect(buf).to be_eof
66
81
  end
67
-
82
+
68
83
  it 'writes out the local file header for an entry with a UTF-8 filename, setting the proper GP flag bit' do
69
84
  buf = StringIO.new
70
85
  zip = described_class.new
71
86
  mtime = Time.utc(2016, 7, 17, 13, 48)
72
87
  zip.add_local_file_header(io: buf, filename: 'файл.bin', crc32: 123, compressed_size: 8981,
73
88
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
74
-
89
+
90
+ buf.rewind
91
+ br = ByteReader.new(buf)
92
+ br.read_4b # Signature
93
+ br.read_2b # Version needed to extract
94
+ expect(br.read_2b).to eq(2048) # gp flags
95
+ end
96
+
97
+ it "correctly recognizes UTF-8 filenames even if they are tagged as ASCII" do
98
+ name = 'файл.bin'
99
+ name.force_encoding(Encoding::US_ASCII)
100
+
101
+ buf = StringIO.new
102
+ zip = described_class.new
103
+ mtime = Time.utc(2016, 7, 17, 13, 48)
104
+ zip.add_local_file_header(io: buf, filename: name, crc32: 123, compressed_size: 8981,
105
+ uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
106
+
75
107
  buf.rewind
76
108
  br = ByteReader.new(buf)
77
109
  br.read_4b # Signature
78
110
  br.read_2b # Version needed to extract
79
111
  expect(br.read_2b).to eq(2048) # gp flags
80
112
  end
81
-
113
+
82
114
  it 'writes out the local file header for an entry with a filename with diacritics, setting the proper GP flag bit' do
83
115
  buf = StringIO.new
84
116
  zip = described_class.new
85
117
  mtime = Time.utc(2016, 7, 17, 13, 48)
86
118
  zip.add_local_file_header(io: buf, filename: 'Kungälv', crc32: 123, compressed_size: 8981,
87
119
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
88
-
120
+
89
121
  buf.rewind
90
122
  br = ByteReader.new(buf)
91
123
  br.read_4b # Signature
@@ -102,14 +134,14 @@ describe ZipTricks::Microzip do
102
134
  filename_readback = br.read_n('Kungälv'.bytesize)
103
135
  expect(filename_readback.force_encoding(Encoding::UTF_8)).to eq('Kungälv')
104
136
  end
105
-
137
+
106
138
  it 'writes out the local file header for an entry that requires Zip64 based on its compressed size _only_' do
107
139
  buf = StringIO.new
108
140
  zip = described_class.new
109
141
  mtime = Time.utc(2016, 7, 17, 13, 48)
110
142
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: (0xFFFFFFFF + 1),
111
143
  uncompressed_size: 90981, storage_mode: 8, mtime: mtime)
112
-
144
+
113
145
  buf.rewind
114
146
  br = ByteReader.new(buf)
115
147
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -130,14 +162,14 @@ describe ZipTricks::Microzip do
130
162
  expect(br.read_8b).to eq(0xFFFFFFFF + 1) # True uncompressed size
131
163
  expect(buf).to be_eof
132
164
  end
133
-
165
+
134
166
  it 'writes out the local file header for an entry that requires Zip64 based on its uncompressed size _only_' do
135
167
  buf = StringIO.new
136
168
  zip = described_class.new
137
169
  mtime = Time.utc(2016, 7, 17, 13, 48)
138
170
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 90981,
139
171
  uncompressed_size: (0xFFFFFFFF + 1), storage_mode: 8, mtime: mtime)
140
-
172
+
141
173
  buf.rewind
142
174
  br = ByteReader.new(buf)
143
175
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -158,7 +190,7 @@ describe ZipTricks::Microzip do
158
190
  expect(br.read_8b).to eq(90981) # True compressed size
159
191
  expect(buf).to be_eof
160
192
  end
161
-
193
+
162
194
  it 'does not write out the Zip64 extra if the position in the destination IO is beyond the Zip64 size limit' do
163
195
  buf = StringIO.new
164
196
  zip = described_class.new
@@ -166,7 +198,7 @@ describe ZipTricks::Microzip do
166
198
  expect(buf).to receive(:tell).and_return(0xFFFFFFFF + 1)
167
199
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 123,
168
200
  uncompressed_size: 456, storage_mode: 8, mtime: mtime)
169
-
201
+
170
202
  buf.rewind
171
203
  br = ByteReader.new(buf)
172
204
  expect(br.read_4b).to eq(0x04034b50) # Signature
@@ -182,14 +214,14 @@ describe ZipTricks::Microzip do
182
214
  expect(br.read_2b).to be_zero
183
215
  end
184
216
  end
185
-
217
+
186
218
  describe '#write_central_directory' do
187
- it 'can write the central directory and makes it a valid one even if there were no files' do
219
+ it 'writes the central directory and makes it a valid one even if there were no files' do
188
220
  buf = StringIO.new
189
-
221
+
190
222
  zip = described_class.new
191
223
  zip.write_central_directory(buf)
192
-
224
+
193
225
  buf.rewind
194
226
  br = ByteReader.new(buf)
195
227
  expect(br.read_4b).to eq(0x06054b50) # EOCD signature
@@ -202,35 +234,313 @@ describe ZipTricks::Microzip do
202
234
  expect(br.read_2b).to eq(0) # ZIP file comment length
203
235
  expect(buf).to be_eof
204
236
  end
205
-
237
+
206
238
  it 'writes the central directory for 2 files' do
207
239
  zip = described_class.new
208
-
240
+
209
241
  mtime = Time.utc(2016, 7, 17, 13, 48)
210
-
242
+
211
243
  buf = StringIO.new
212
244
  zip.add_local_file_header(io: buf, filename: 'first-file.bin', crc32: 123, compressed_size: 5,
213
245
  uncompressed_size: 8, storage_mode: 8, mtime: mtime)
214
246
  buf << Random.new.bytes(5)
215
- zip.add_local_file_header(io: buf, filename: 'first-file.txt', crc32: 123, compressed_size: 9,
247
+ zip.add_local_file_header(io: buf, filename: 'second-file.txt', crc32: 546, compressed_size: 9,
216
248
  uncompressed_size: 9, storage_mode: 0, mtime: mtime)
217
249
  buf << Random.new.bytes(5)
218
-
250
+
219
251
  central_dir_offset = buf.tell
220
-
221
252
  zip.write_central_directory(buf)
222
-
253
+
223
254
  # Seek to where the central directory begins
224
255
  buf.rewind
225
256
  buf.seek(central_dir_offset)
226
-
257
+
227
258
  br = ByteReader.new(buf)
259
+
260
+ # Central directory entry for the first file
261
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
262
+ expect(br.read_2b).to eq(820) # version made by
263
+ expect(br.read_2b).to eq(20) # version need to extract
264
+ expect(br.read_2b).to eq(0) # general purpose bit flag
265
+ expect(br.read_2b).to eq(8) # compression method (deflated here)
266
+ expect(br.read_2b).to eq(28160) # last mod file time
267
+ expect(br.read_2b).to eq(18673) # last mod file date
268
+ expect(br.read_4b).to eq(123) # crc32
269
+ expect(br.read_4b).to eq(5) # compressed size
270
+ expect(br.read_4b).to eq(8) # uncompressed size
271
+ expect(br.read_2b).to eq(14) # filename length
272
+ expect(br.read_2b).to eq(0) # extra field length
273
+ expect(br.read_2b).to eq(0) # file comment
274
+ expect(br.read_2b).to eq(0) # disk number start (stays 0 because this entry does not need Zip64)
275
+ expect(br.read_2b).to eq(0) # internal file attributes
276
+ expect(br.read_4b).to eq(2175008768) # external file attributes
277
+ expect(br.read_4b).to eq(0) # relative offset of local header
278
+ expect(br.read_n(14)).to eq('first-file.bin') # the filename
279
+
280
+ # Central directory entry for the second file
228
281
  expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
282
+ expect(br.read_2b).to eq(820) # version made by
283
+ expect(br.read_2b).to eq(20) # version need to extract
284
+ expect(br.read_2b).to eq(0) # general purpose bit flag
285
+ expect(br.read_2b).to eq(0) # compression method (stored here)
286
+ expect(br.read_2b).to eq(28160) # last mod file time
287
+ expect(br.read_2b).to eq(18673) # last mod file date
288
+ expect(br.read_4b).to eq(546) # crc32
289
+ expect(br.read_4b).to eq(9) # compressed size
290
+ expect(br.read_4b).to eq(9) # uncompressed size
291
+ expect(br.read_2b).to eq('second-file.txt'.bytesize) # filename length
292
+ expect(br.read_2b).to eq(0) # extra field length
293
+ expect(br.read_2b).to eq(0) # file comment
294
+ expect(br.read_2b).to eq(0) # disk number start (stays 0 because this entry does not need Zip64)
295
+ expect(br.read_2b).to eq(0) # internal file attributes
296
+ expect(br.read_4b).to eq(2175008768) # external file attributes
297
+ expect(br.read_4b).to eq(49) # relative offset of local header
298
+ expect(br.read_n('second-file.txt'.bytesize)).to eq('second-file.txt') # the filename
299
+
300
+ expect(br.read_4b).to eq(0x06054b50) # end of central dir signature
301
+ br.read_2b
302
+ br.read_2b
303
+ br.read_2b
304
+ br.read_2b
305
+ br.read_4b
306
+ br.read_4b
307
+ br.read_2b
308
+
309
+ expect(buf).to be_eof
310
+ end
311
+
312
+ it 'writes the central directory for 1 file that is larger than 4GB' do
313
+ zip = described_class.new
314
+ buf = StringIO.new
315
+ big = 0xFFFFFFFF + 2048
316
+ mtime = Time.utc(2016, 7, 17, 13, 48)
317
+
318
+ zip.add_local_file_header(io: buf, filename: 'big-file.bin', crc32: 12345, compressed_size: big,
319
+ uncompressed_size: big, storage_mode: 0, mtime: mtime)
320
+
321
+ central_dir_offset = buf.tell
322
+
323
+ zip.write_central_directory(buf)
324
+
325
+ # Seek to where the central directory begins
326
+ buf.rewind
327
+ buf.seek(central_dir_offset)
328
+
329
+ br = ByteReader.new(buf)
330
+
331
+ # Standard central directory entry (similar to the local file header)
332
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
333
+ expect(br.read_2b).to eq(820) # version made by
334
+ expect(br.read_2b).to eq(45) # version need to extract (45 for Zip64)
335
+ expect(br.read_2b).to eq(0) # general purpose bit flag
336
+ expect(br.read_2b).to eq(0) # compression method (stored here)
337
+ expect(br.read_2b).to eq(28160) # last mod file time
338
+ expect(br.read_2b).to eq(18673) # last mod file date
339
+ expect(br.read_4b).to eq(12345) # crc32
340
+ expect(br.read_4b).to eq(0xFFFFFFFF) # compressed size
341
+ expect(br.read_4b).to eq(0xFFFFFFFF) # uncompressed size
342
+ expect(br.read_2b).to eq(12) # filename length
343
+ expect(br.read_2b).to eq(32) # extra field length (we store the Zip64 extra field for this file)
344
+ expect(br.read_2b).to eq(0) # file comment
345
+ expect(br.read_2b).to eq(0xFFFF) # disk number, must be blanked to the maximum value because of The Unarchiver bug
346
+ expect(br.read_2b).to eq(0) # internal file attributes
347
+ expect(br.read_4b).to eq(2175008768) # external file attributes
348
+ expect(br.read_4b).to eq(0xFFFFFFFF) # relative offset of local header
349
+ expect(br.read_n(12)).to eq('big-file.bin') # the filename
350
+
351
+ # Zip64 extra field
352
+ expect(br.read_2b).to eq(0x0001) # Tag for the "extra" block
353
+ expect(br.read_2b).to eq(28) # Size of this "extra" block. For us it will always be 28
354
+ expect(br.read_8b).to eq(big) # Original uncompressed file size
355
+ expect(br.read_8b).to eq(big) # Original compressed file size
356
+ expect(br.read_8b).to eq(0) # Offset of local header record
357
+ expect(br.read_4b).to eq(0) # Number of the disk on which this file starts
358
+ end
359
+
360
+ it 'writes the central directory for 2 files which, together, make the central directory start beyond the 4GB threshold' do
361
+ zip = described_class.new
362
+ raw_buf = StringIO.new
363
+
364
+ zip_write_buf = IOWrapper.new(raw_buf)
365
+ big1 = 0xFFFFFFFF/2 + 512
366
+ big2 = 0xFFFFFFFF/2 + 1024
367
+ mtime = Time.utc(2016, 7, 17, 13, 48)
368
+
369
+ zip.add_local_file_header(io: zip_write_buf, filename: 'first-big-file.bin', crc32: 12345, compressed_size: big1,
370
+ uncompressed_size: big1, storage_mode: 0, mtime: mtime)
371
+ zip_write_buf.advance_position_by(big1)
372
+
373
+ zip.add_local_file_header(io: zip_write_buf, filename: 'second-big-file.bin', crc32: 54321, compressed_size: big2,
374
+ uncompressed_size: big2, storage_mode: 0, mtime: mtime)
375
+ zip_write_buf.advance_position_by(big2)
376
+
377
+ fake_central_dir_offset = zip_write_buf.tell # Position reported by the wrapper, counting the skipped file bodies
378
+ actual_central_dir_offset = raw_buf.tell # Grab the position in the underlying buffer
379
+
380
+ zip.write_central_directory(zip_write_buf)
381
+
382
+ # Seek to where the central directory begins
383
+ raw_buf.seek(actual_central_dir_offset, IO::SEEK_SET)
384
+
385
+ br = ByteReader.new(raw_buf)
386
+
387
+ # Standard central directory entry (similar to the local file header)
388
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
389
+ expect(br.read_2b).to eq(820) # version made by
390
+ expect(br.read_2b).to eq(20) # version needed to extract (20, since this entry itself does not need Zip64)
391
+ expect(br.read_2b).to eq(0) # general purpose bit flag
392
+ expect(br.read_2b).to eq(0) # compression method (stored here)
393
+ expect(br.read_2b).to eq(28160) # last mod file time
394
+ expect(br.read_2b).to eq(18673) # last mod file date
395
+ expect(br.read_4b).to eq(12345) # crc32
396
+ expect(br.read_4b).to eq(2147484159) # compressed size
397
+ expect(br.read_4b).to eq(2147484159) # uncompressed size
398
+ expect(br.read_2b).to eq(18) # filename length
399
+ expect(br.read_2b).to eq(0) # extra field length
400
+ expect(br.read_2b).to eq(0) # file comment length
401
+ expect(br.read_2b).to eq(0) # disk number start (stays 0 because this entry does not need Zip64)
402
+ expect(br.read_2b).to eq(0) # internal file attributes
403
+ expect(br.read_4b).to eq(2175008768) # external file attributes
404
+ expect(br.read_4b).to eq(0) # relative offset of local header
405
+ expect(br.read_n(18)).to eq("first-big-file.bin") # the filename
406
+
407
+ # Standard central directory entry (similar to the local file header)
408
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
409
+ expect(br.read_2b).to eq(820) # version made by
410
+ expect(br.read_2b).to eq(20) # version needed to extract (20, since this entry itself does not need Zip64)
411
+ expect(br.read_2b).to eq(0) # general purpose bit flag
412
+ expect(br.read_2b).to eq(0) # compression method (stored here)
413
+ expect(br.read_2b).to eq(28160) # last mod file time
414
+ expect(br.read_2b).to eq(18673) # last mod file date
415
+ expect(br.read_4b).to eq(54321) # crc32
416
+ expect(br.read_4b).to eq(2147484671) # compressed size
417
+ expect(br.read_4b).to eq(2147484671) # uncompressed size
418
+ expect(br.read_2b).to eq(19) # filename length
419
+ expect(br.read_2b).to eq(0) # extra field length
420
+ expect(br.read_2b).to eq(0) # file comment length
421
+ expect(br.read_2b).to eq(0) # disk number start (stays 0 because this entry does not need Zip64)
422
+ expect(br.read_2b).to eq(0) # internal file attributes
423
+ expect(br.read_4b).to eq(2175008768) # external file attributes
424
+ expect(br.read_4b).to eq(2147484207) # relative offset of local header
425
+ expect(br.read_n(19)).to eq('second-big-file.bin') # the filename
426
+
427
+ # zip64 specific values for a whole central directory
428
+ expect(br.read_4b).to eq(0x06064b50) # zip64 end of central dir signature
429
+ expect(br.read_8b).to eq(44) # size of zip64 end of central directory record
430
+ expect(br.read_2b).to eq(820) # version made by
431
+ expect(br.read_2b).to eq(45) # version need to extract
432
+ expect(br.read_4b).to eq(0) # number of this disk
433
+ expect(br.read_4b).to eq(0) # number of the disk with the start of the central directory
434
+ expect(br.read_8b).to eq(2) # total number of entries in the central directory on this disk
435
+ expect(br.read_8b).to eq(2) # total number of entries in the central directory
436
+ expect(br.read_8b).to eq(129) # size of central directory
437
+ expect(br.read_8b).to eq(4294968927) # central directory offset from start of disk
438
+ expect(br.read_4b).to eq(0x07064b50) # zip64 end of central dir locator signature
439
+ expect(br.read_4b).to eq(0) # number of disk ...
440
+ expect(br.read_8b).to eq(4294969056) # relative offset of the zip64 end of central directory record
441
+ expect(br.read_4b).to eq(1) # total number of disks
442
+ end
443
+
444
+ it 'writes the central directory for 3 files, the third of which will require the Zip64 extra since it is past the 4GB offset' do
445
+ zip = described_class.new
446
+ raw_buf = StringIO.new
447
+
448
+ zip_write_buf = IOWrapper.new(raw_buf)
449
+ big1 = 0xFFFFFFFF/2 + 512
450
+ big2 = 0xFFFFFFFF/2 + 1024
451
+ big3 = 0xFFFFFFFF/2 + 1024
452
+ mtime = Time.utc(2016, 7, 17, 13, 48)
453
+
454
+ zip.add_local_file_header(io: zip_write_buf, filename: 'one', crc32: 12345, compressed_size: big1,
455
+ uncompressed_size: big1, storage_mode: 0, mtime: mtime)
456
+ zip_write_buf.advance_position_by(big1)
457
+
458
+ zip.add_local_file_header(io: zip_write_buf, filename: 'two', crc32: 54321, compressed_size: big2,
459
+ uncompressed_size: big2, storage_mode: 0, mtime: mtime)
460
+ zip_write_buf.advance_position_by(big2)
461
+
462
+ big3_offset = zip_write_buf.tell
463
+
464
+ zip.add_local_file_header(io: zip_write_buf, filename: 'three', crc32: 54321, compressed_size: big3,
465
+ uncompressed_size: big3, storage_mode: 0, mtime: mtime)
466
+ zip_write_buf.advance_position_by(big3)
467
+
468
+ fake_central_dir_offset = zip_write_buf.tell # Grab the position reported by the wrapper (includes the advanced-over file bodies)
469
+ actual_central_dir_offset = raw_buf.tell # Grab the position in the underlying buffer
470
+
471
+ zip.write_central_directory(zip_write_buf)
472
+
473
+ # Seek to where the central directory begins
474
+ raw_buf.seek(actual_central_dir_offset, IO::SEEK_SET)
475
+
476
+ br = ByteReader.new(raw_buf)
477
+
478
+ # Standard central directory entry (similar to the local file header)
479
+ # Skip over two entries, because the other example has a 1-to-1 repeat of this
480
+ 2.times {
481
+ br.read_4b
482
+ br.read_2b
483
+ br.read_2b
484
+ br.read_2b
485
+ br.read_2b
486
+ br.read_2b
487
+ br.read_2b
488
+ br.read_4b
489
+ br.read_4b
490
+ br.read_4b
491
+ br.read_2b
492
+ br.read_2b
493
+ br.read_2b
494
+ br.read_2b
495
+ br.read_2b
496
+ br.read_4b
497
+ br.read_4b
498
+ br.read_n(3)
499
+ }
500
+
501
+ # Entry for the third file DOES bear the Zip64 extra field
502
+ expect(br.read_4b).to eq(0x02014b50) # Central directory entry sig
503
+ expect(br.read_2b).to eq(820) # version made by
504
+ expect(br.read_2b).to eq(45) # version needed to extract (45 for Zip64) - this entry requires it
505
+ expect(br.read_2b).to eq(0) # general purpose bit flag
506
+ expect(br.read_2b).to eq(0) # compression method (stored here)
507
+ expect(br.read_2b).to eq(28160) # last mod file time
508
+ expect(br.read_2b).to eq(18673) # last mod file date
509
+ expect(br.read_4b).to eq(54321) # crc32
510
+ expect(br.read_4b).to eq(0xFFFFFFFF) # compressed size - blanked for Zip64
511
+ expect(br.read_4b).to eq(0xFFFFFFFF) # uncompressed size - blanked for Zip64
512
+ expect(br.read_2b).to eq(5) # filename length
513
+ expect(br.read_2b).to eq(32) # extra field length (length of the Zip64 extra)
514
+ expect(br.read_2b).to eq(0) # file comment length
515
+ expect(br.read_2b).to eq(0xFFFF) # disk number, with Zip64 must be blanked to the maximum value because of The Unarchiver bug
516
+ expect(br.read_2b).to eq(0) # internal file attributes
517
+ expect(br.read_4b).to eq(2175008768) # external file attributes
518
+ expect(br.read_4b).to eq(4294967295) # relative offset of local header - blanked for Zip64 (0xFFFFFFFF)
519
+ expect(br.read_n(5)).to eq('three') # the filename
520
+ # then the Zip64 extra for that last file _only_
521
+ expect(br.read_2b).to eq(0x0001) # Tag for the "extra" block
522
+ expect(br.read_2b).to eq(28) # Size of this "extra" block. For us it will always be 28
523
+ expect(br.read_8b).to eq(big3) # Original uncompressed file size
524
+ expect(br.read_8b).to eq(big3) # Original compressed file size
525
+ expect(br.read_8b).to eq(big3_offset) # Offset of local header record
526
+ expect(br.read_4b).to eq(0) # Number of the disk on which this file starts
527
+
528
+ # zip64 specific values for a whole central directory
529
+ expect(br.read_4b).to eq(0x06064b50) # zip64 end of central dir signature
530
+ expect(br.read_8b).to eq(44) # size of zip64 end of central directory record
531
+ expect(br.read_2b).to eq(820) # version made by
532
+ expect(br.read_2b).to eq(45) # version needed to extract
533
+ expect(br.read_4b).to eq(0) # number of this disk
534
+ expect(br.read_4b).to eq(0) # number of the disk with the start of the central directory
535
+ expect(br.read_8b).to eq(3) # total number of entries in the central directory on this disk
536
+ expect(br.read_8b).to eq(3) # total number of entries in the central directory
537
+ expect(br.read_8b).to eq(181) # size of central directory
538
+ expect(br.read_8b).to eq(6442453602) # central directory offset from start of disk
229
539
 
230
- skip "Not finished"
540
+ expect(br.read_4b).to eq(0x07064b50) # Zip64 EOCD locator signature
541
+ expect(br.read_4b).to eq(0) # Disk number with the start of central directory
542
+ expect(br.read_8b).to eq(6442453783) # relative offset of the zip64 end of central directory record
543
+ expect(br.read_4b).to eq(1) # total number of disks
231
544
  end
232
-
233
- it 'writes the central directory 1 file that is larger than 4GB'
234
- it 'writes the central directory for 2 files which, together, make the central directory start beyound the 4GB threshold'
235
545
  end
236
546
  end
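The literal values asserted in the three-file example above (28160, 18673, 181, 6442453602, 6442453783) can be derived by hand. The following is a minimal sketch of that arithmetic, not code from zip_tricks: the helper names `dos_time` and `dos_date` and all constants are hypothetical, and it assumes that local file headers carry no extra field when sizes fit into 32 bits, which is consistent with the offsets the spec expects.

```ruby
# Hypothetical helpers for illustration only; none of these names exist in zip_tricks.

# Pack a Time into the 16-bit MS-DOS time field (hours, minutes, seconds / 2).
def dos_time(t)
  (t.hour << 11) | (t.min << 5) | (t.sec / 2)
end

# Pack a Time into the 16-bit MS-DOS date field (years since 1980, month, day).
def dos_date(t)
  ((t.year - 1980) << 9) | (t.month << 5) | t.day
end

LOCAL_HEADER_FIXED  = 30 # bytes of a local file header before the filename
CENTRAL_ENTRY_FIXED = 46 # bytes of a central directory entry before the filename
ZIP64_EXTRA_SIZE    = 32 # 2-byte tag + 2-byte size + 28 bytes of payload

MTIME = Time.utc(2016, 7, 17, 13, 48)
BIG1  = 0xFFFFFFFF / 2 + 512
BIG2  = 0xFFFFFFFF / 2 + 1024
BIG3  = 0xFFFFFFFF / 2 + 1024

# The third local header starts after two headers and two file bodies.
# Assumption: no extra field in the local headers of 'one' and 'two'.
BIG3_OFFSET = (LOCAL_HEADER_FIXED + 'one'.bytesize + BIG1) +
              (LOCAL_HEADER_FIXED + 'two'.bytesize + BIG2)

# The central directory starts right after the third header and body.
CENTRAL_DIR_OFFSET = BIG3_OFFSET + LOCAL_HEADER_FIXED + 'three'.bytesize + BIG3

# Two plain entries, plus one entry that carries the Zip64 extra.
CENTRAL_DIR_SIZE = (CENTRAL_ENTRY_FIXED + 'one'.bytesize) +
                   (CENTRAL_ENTRY_FIXED + 'two'.bytesize) +
                   (CENTRAL_ENTRY_FIXED + 'three'.bytesize + ZIP64_EXTRA_SIZE)

# The Zip64 end of central directory record follows the central directory.
ZIP64_EOCD_OFFSET = CENTRAL_DIR_OFFSET + CENTRAL_DIR_SIZE
```

Because `BIG3_OFFSET` exceeds `0xFFFFFFFF`, the third entry is the only one that cannot fit its local header offset into the 32-bit field, which is why only that entry is expected to carry the Zip64 extra.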
data/zip_tricks.gemspec CHANGED
@@ -2,16 +2,16 @@
2
2
  # DO NOT EDIT THIS FILE DIRECTLY
3
3
  # Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
4
4
  # -*- encoding: utf-8 -*-
5
- # stub: zip_tricks 2.8.0 ruby lib
5
+ # stub: zip_tricks 2.8.1 ruby lib
6
6
 
7
7
  Gem::Specification.new do |s|
8
8
  s.name = "zip_tricks"
9
- s.version = "2.8.0"
9
+ s.version = "2.8.1"
10
10
 
11
11
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
12
12
  s.require_paths = ["lib"]
13
13
  s.authors = ["Julik Tarkhanov"]
14
- s.date = "2016-07-18"
14
+ s.date = "2016-07-22"
15
15
  s.description = "Makes rubyzip stream, for real"
16
16
  s.email = "me@julik.nl"
17
17
  s.extra_rdoc_files = [
@@ -24,6 +24,7 @@ Gem::Specification.new do |s|
24
24
  ".travis.yml",
25
25
  ".yardopts",
26
26
  "Gemfile",
27
+ "IMPLEMENTATION_DETAILS.md",
27
28
  "LICENSE.txt",
28
29
  "README.md",
29
30
  "Rakefile",
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: zip_tricks
3
3
  version: !ruby/object:Gem::Version
4
- version: 2.8.0
4
+ version: 2.8.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Julik Tarkhanov
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2016-07-18 00:00:00.000000000 Z
11
+ date: 2016-07-22 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rubyzip
@@ -161,6 +161,7 @@ files:
161
161
  - ".travis.yml"
162
162
  - ".yardopts"
163
163
  - Gemfile
164
+ - IMPLEMENTATION_DETAILS.md
164
165
  - LICENSE.txt
165
166
  - README.md
166
167
  - Rakefile