zipremove 0.3.0__tar.gz → 0.4.0__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: zipremove
- Version: 0.3.0
+ Version: 0.4.0
  Summary: Extend `zipfile` with `remove`-related functionalities
  Home-page: https://github.com/danny0838/zipremove
  Author: Danny Lin
@@ -37,6 +37,7 @@ Dynamic: license-file
  ![Status](https://img.shields.io/pypi/status/zipremove)
  ![License](https://img.shields.io/github/license/danny0838/zipremove)
  [![Downloads](https://static.pepy.tech/personalized-badge/zipremove?period=month&left_text=Downloads)](https://pepy.tech/project/zipremove)
+ [![Pull request](https://img.shields.io/github/pulls/detail/state/python/cpython/134627)](https://github.com/python/cpython/pull/134627)
 
  This package extends `zipfile` with `remove`-related functionalities.
 
@@ -61,30 +62,32 @@ This package extends `zipfile` with `remove`-related functionalities.
 
  * `ZipFile.repack(removed=None, *, strict_descriptor=False[, chunk_size])`
 
- Rewrites the archive to remove stale local file entries, shrinking the ZIP
- file size.
+ Rewrites the archive to remove stale local file entries, shrinking its file
+ size.
 
  If *removed* is provided, it must be a sequence of `ZipInfo` objects
  representing removed entries; only their corresponding local file entries
  will be removed.
 
- If *removed* is not provided, local file entries no longer referenced in the
- central directory will be removed. The algorithm assumes that local file
- entries are stored consecutively:
+ If *removed* is not provided, the archive is scanned to identify and remove
+ local file entries that are no longer referenced in the central directory.
+ The algorithm assumes that local file entries (and the central directory,
+ which is mostly treated as the "last entry") are stored consecutively:
 
  1. Data before the first referenced entry is removed only when it appears to
  be a sequence of consecutive entries with no extra following bytes; extra
- preceeding bytes are preserved.
+ preceding bytes are preserved.
  2. Data between referenced entries is removed only when it appears to
  be a sequence of consecutive entries with no extra preceding bytes; extra
  following bytes are preserved.
-
- ``strict_descriptor=True`` can be provided to skip the slower scan for an
- unsigned data descriptor (deprecated in the latest ZIP specification and is
- only used by legacy tools) when checking for bytes resembling a valid local
- file entry. This improves performance, but may cause some stale local file
- entries to be preserved, as any entry using an unsigned descriptor cannot
- be detected.
+ 3. Entries must not overlap. If any entry's data overlaps with another, a
+ `BadZipFile` error is raised and no changes are made.
+
+ When scanning, setting `strict_descriptor=True` disables detection of any
+ entry using an unsigned data descriptor (deprecated in the ZIP specification
+ since version 6.3.0, released on 2006-09-29, and used only by some legacy
+ tools). This improves performance, but may cause some stale entries to be
+ preserved.
 
  *chunk_size* may be specified to control the buffer size when moving
  entry data (default is 1 MiB).
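
The data descriptor that `strict_descriptor` refers to can be inspected with the standard library alone. The following is a minimal sketch and not part of zipremove. The `Unseekable` wrapper is an illustrative helper: the package's tests use a helper of the same name, but its definition does not appear in this diff. It hides `tell()` and `seek()` so that the stdlib `zipfile` writes a signed data descriptor after the entry data. An unsigned descriptor is the same record without the leading 4-byte signature, which is presumably why detecting one needs the slower scan.

```python
import io
import struct
import zipfile
import zlib


class Unseekable:
    """Illustrative write-only wrapper: no tell()/seek(), so zipfile
    falls back to streaming mode and emits data descriptors."""

    def __init__(self, fp):
        self.fp = fp

    def write(self, data):
        return self.fp.write(data)

    def flush(self):
        self.fp.flush()


buf = io.BytesIO()
with zipfile.ZipFile(Unseekable(buf), 'w', zipfile.ZIP_STORED) as zh:
    zh.writestr('file.txt', b'dummy')
raw = buf.getvalue()

# Local file header: 30 fixed bytes plus the file name (no extra field here).
header_size = 30 + len('file.txt')
data_end = header_size + len(b'dummy')

# Signed data descriptor: signature, CRC-32, compressed size, uncompressed size.
sig, crc, csize, usize = struct.unpack('<LLLL', raw[data_end:data_end + 16])

# The central directory records that a data descriptor is in use (bit 3).
with zipfile.ZipFile(io.BytesIO(raw)) as zh:
    flag_bits = zh.getinfo('file.txt').flag_bits
```

With the signature present, a scanner can look for the bytes `PK\x07\x08`. Without it, only the three numeric fields remain, and those have to be validated against the entry data itself.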
@@ -3,6 +3,7 @@
  ![Status](https://img.shields.io/pypi/status/zipremove)
  ![License](https://img.shields.io/github/license/danny0838/zipremove)
  [![Downloads](https://static.pepy.tech/personalized-badge/zipremove?period=month&left_text=Downloads)](https://pepy.tech/project/zipremove)
+ [![Pull request](https://img.shields.io/github/pulls/detail/state/python/cpython/134627)](https://github.com/python/cpython/pull/134627)
 
  This package extends `zipfile` with `remove`-related functionalities.
 
@@ -27,30 +28,32 @@ This package extends `zipfile` with `remove`-related functionalities.
 
  * `ZipFile.repack(removed=None, *, strict_descriptor=False[, chunk_size])`
 
- Rewrites the archive to remove stale local file entries, shrinking the ZIP
- file size.
+ Rewrites the archive to remove stale local file entries, shrinking its file
+ size.
 
  If *removed* is provided, it must be a sequence of `ZipInfo` objects
  representing removed entries; only their corresponding local file entries
  will be removed.
 
- If *removed* is not provided, local file entries no longer referenced in the
- central directory will be removed. The algorithm assumes that local file
- entries are stored consecutively:
+ If *removed* is not provided, the archive is scanned to identify and remove
+ local file entries that are no longer referenced in the central directory.
+ The algorithm assumes that local file entries (and the central directory,
+ which is mostly treated as the "last entry") are stored consecutively:
 
  1. Data before the first referenced entry is removed only when it appears to
  be a sequence of consecutive entries with no extra following bytes; extra
- preceeding bytes are preserved.
+ preceding bytes are preserved.
  2. Data between referenced entries is removed only when it appears to
  be a sequence of consecutive entries with no extra preceding bytes; extra
  following bytes are preserved.
-
- ``strict_descriptor=True`` can be provided to skip the slower scan for an
- unsigned data descriptor (deprecated in the latest ZIP specification and is
- only used by legacy tools) when checking for bytes resembling a valid local
- file entry. This improves performance, but may cause some stale local file
- entries to be preserved, as any entry using an unsigned descriptor cannot
- be detected.
+ 3. Entries must not overlap. If any entry's data overlaps with another, a
+ `BadZipFile` error is raised and no changes are made.
+
+ When scanning, setting `strict_descriptor=True` disables detection of any
+ entry using an unsigned data descriptor (deprecated in the ZIP specification
+ since version 6.3.0, released on 2006-09-29, and used only by some legacy
+ tools). This improves performance, but may cause some stale entries to be
+ preserved.
 
  *chunk_size* may be specified to control the buffer size when moving
  entry data (default is 1 MiB).
@@ -1,6 +1,6 @@
  [metadata]
  name = zipremove
- version = 0.3.0
+ version = 0.4.0
  author = Danny Lin
  author_email = danny0838@gmail.com
  url = https://github.com/danny0838/zipremove
@@ -15,6 +15,7 @@ from zipfile import (
  _FH_GENERAL_PURPOSE_FLAG_BITS,
  _FH_SIGNATURE,
  _FH_UNCOMPRESSED_SIZE,
+ LZMADecompressor,
  _get_decompressor,
  crc32,
  sizeFileHeader,
@@ -29,6 +30,12 @@ except NameError:
  # polyfill for Python < 3.14
  ZIP_ZSTANDARD = 93
 
+ try:
+ from zipfile import _MASK_ENCRYPTED
+ except ImportError:
+ # polyfill for Python < 3.11
+ _MASK_ENCRYPTED = 1 << 0
+
  try:
  from zipfile import _MASK_USE_DATA_DESCRIPTOR
  except ImportError:
@@ -50,18 +57,16 @@ except ImportError:
  return filename
 
  try:
- import zipfile
- zipfile.LZMADecompressor().unused_data
+ LZMADecompressor().unused_data
  except AttributeError:
  # polyfill to support LZMADecompressor().unused_data
- class LZMADecompressor(zipfile.LZMADecompressor):
- @property
- def unused_data(self):
- try:
- return self._decomp.unused_data
- except AttributeError:
- return b''
- zipfile.LZMADecompressor = LZMADecompressor
+ @property
+ def unused_data(self):
+ try:
+ return self._decomp.unused_data
+ except AttributeError:
+ return b''
+ LZMADecompressor.unused_data = unused_data
 
 
  class _ZipRepacker:
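
The polyfill change above swaps a subclass-and-replace approach for attaching a property directly to the imported class. The sketch below illustrates that technique on a stand-in class. `Wrapper` is hypothetical and only mimics the relevant shape: a wrapper whose inner `_decomp` object may not exist yet. It does not use the real `LZMADecompressor`.

```python
import types


class Wrapper:
    """Hypothetical stand-in for a decompressor wrapper whose inner
    decompressor object is created lazily."""

    def __init__(self):
        self._decomp = None  # no inner decompressor yet


try:
    Wrapper().unused_data
except AttributeError:
    # Attach the missing attribute to the existing class in place,
    # so every existing reference to the class sees it.
    @property
    def unused_data(self):
        try:
            return self._decomp.unused_data
        except AttributeError:
            return b''
    Wrapper.unused_data = unused_data

w = Wrapper()
before = w.unused_data                # inner object missing
w._decomp = types.SimpleNamespace(unused_data=b'tail')
after = w.unused_data                 # delegated to the inner object
```

One plausible motivation for patching in place is that names already bound with `from zipfile import LZMADecompressor` keep pointing at the same class, so they see the new property as well. Replacing a module attribute with a subclass would not affect them.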
@@ -399,8 +404,11 @@ class _ZipRepacker:
 
  dd = self._scan_data_descriptor(fp, pos, end_offset, zip64)
  if dd is None and not self.strict_descriptor:
- dd = self._scan_data_descriptor_no_sig_by_decompression(
- fp, pos, end_offset, zip64, fheader[_FH_COMPRESSION_METHOD])
+ if zinfo.flag_bits & _MASK_ENCRYPTED:
+ dd = False
+ else:
+ dd = self._scan_data_descriptor_no_sig_by_decompression(
+ fp, pos, end_offset, zip64, fheader[_FH_COMPRESSION_METHOD])
  if dd is False:
  dd = self._scan_data_descriptor_no_sig(fp, pos, end_offset, zip64)
  if dd is None:
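
The fallback order in the hunk above can be restated as a small standalone function. This is a sketch of one reading of the diff, not the package's code: indentation is lost in this rendering, so nesting the `dd is False` check inside the non-strict branch is an assumption, and `find_descriptor` and its callable parameters are hypothetical names. `False` appears to mean "the decompression-based scan is not applicable", which would hold for encrypted entries because their payload cannot be decompressed as-is.

```python
MASK_ENCRYPTED = 1 << 0  # general purpose flag bit 0, same value as the polyfill


def find_descriptor(flag_bits, strict_descriptor,
                    scan_signed, scan_by_decompression, scan_raw):
    """Try the descriptor scans in fallback order and return the first hit."""
    dd = scan_signed()
    if dd is None and not strict_descriptor:
        if flag_bits & MASK_ENCRYPTED:
            dd = False  # skip the decompression-based scan for encrypted data
        else:
            dd = scan_by_decompression()
        if dd is False:
            dd = scan_raw()
    return dd


calls = []


def signed():
    calls.append('signed')
    return None  # no signed descriptor found


def by_decompression():
    calls.append('decompression')
    return False  # scan not applicable


def raw():
    calls.append('raw')
    return (0, 5, 5)  # placeholder (crc, compressed size, file size)


encrypted_result = find_descriptor(MASK_ENCRYPTED, False,
                                   signed, by_decompression, raw)
encrypted_calls = list(calls)
```

For an encrypted entry the decompression scan is never invoked. With `strict_descriptor=True`, only the signed scan runs.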
@@ -488,6 +496,7 @@ class _ZipRepacker:
  dd_fmt = '<LQQ' if zip64 else '<LLL'
  dd_size = struct.calcsize(dd_fmt)
 
+ # early return and prevent potential `fp.read(-1)`
  if end_offset - dd_size < offset:
  return None
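
The new comment refers to a general hazard: a negative size passed to `read()` means "read to end of file", so a range that is too small for a descriptor must be rejected before any length is computed. The sketch below shows the same guard in isolation. `read_trailing_fields` is a hypothetical helper, and what the real method does after the guard is not shown in this hunk.

```python
import io
import struct


def read_trailing_fields(fp, offset, end_offset, zip64=False):
    """Unpack an unsigned descriptor ending at end_offset, or return None
    if the range [offset, end_offset) is too small to hold one."""
    dd_fmt = '<LQQ' if zip64 else '<LLL'
    dd_size = struct.calcsize(dd_fmt)  # 20 bytes for zip64, else 12

    # Guard first: without it, a later length such as
    # (end_offset - dd_size - offset) could go negative.
    if end_offset - dd_size < offset:
        return None

    fp.seek(end_offset - dd_size)
    return struct.unpack(dd_fmt, fp.read(dd_size))


data = b'xxxxx' + struct.pack('<LLL', 1, 2, 3)
found = read_trailing_fields(io.BytesIO(data), 5, len(data))
too_small = read_trailing_fields(io.BytesIO(data), 5, 10)

# A negative size reads everything that is left.
everything = io.BytesIO(b'abcdef').read(-1)
```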
 
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: zipremove
- Version: 0.3.0
+ Version: 0.4.0
  Summary: Extend `zipfile` with `remove`-related functionalities
  Home-page: https://github.com/danny0838/zipremove
  Author: Danny Lin
@@ -37,6 +37,7 @@ Dynamic: license-file
  ![Status](https://img.shields.io/pypi/status/zipremove)
  ![License](https://img.shields.io/github/license/danny0838/zipremove)
  [![Downloads](https://static.pepy.tech/personalized-badge/zipremove?period=month&left_text=Downloads)](https://pepy.tech/project/zipremove)
+ [![Pull request](https://img.shields.io/github/pulls/detail/state/python/cpython/134627)](https://github.com/python/cpython/pull/134627)
 
  This package extends `zipfile` with `remove`-related functionalities.
 
@@ -61,30 +62,32 @@ This package extends `zipfile` with `remove`-related functionalities.
 
  * `ZipFile.repack(removed=None, *, strict_descriptor=False[, chunk_size])`
 
- Rewrites the archive to remove stale local file entries, shrinking the ZIP
- file size.
+ Rewrites the archive to remove stale local file entries, shrinking its file
+ size.
 
  If *removed* is provided, it must be a sequence of `ZipInfo` objects
  representing removed entries; only their corresponding local file entries
  will be removed.
 
- If *removed* is not provided, local file entries no longer referenced in the
- central directory will be removed. The algorithm assumes that local file
- entries are stored consecutively:
+ If *removed* is not provided, the archive is scanned to identify and remove
+ local file entries that are no longer referenced in the central directory.
+ The algorithm assumes that local file entries (and the central directory,
+ which is mostly treated as the "last entry") are stored consecutively:
 
  1. Data before the first referenced entry is removed only when it appears to
  be a sequence of consecutive entries with no extra following bytes; extra
- preceeding bytes are preserved.
+ preceding bytes are preserved.
  2. Data between referenced entries is removed only when it appears to
  be a sequence of consecutive entries with no extra preceding bytes; extra
  following bytes are preserved.
-
- ``strict_descriptor=True`` can be provided to skip the slower scan for an
- unsigned data descriptor (deprecated in the latest ZIP specification and is
- only used by legacy tools) when checking for bytes resembling a valid local
- file entry. This improves performance, but may cause some stale local file
- entries to be preserved, as any entry using an unsigned descriptor cannot
- be detected.
+ 3. Entries must not overlap. If any entry's data overlaps with another, a
+ `BadZipFile` error is raised and no changes are made.
+
+ When scanning, setting `strict_descriptor=True` disables detection of any
+ entry using an unsigned data descriptor (deprecated in the ZIP specification
+ since version 6.3.0, released on 2006-09-29, and used only by some legacy
+ tools). This improves performance, but may cause some stale entries to be
+ preserved.
 
  *chunk_size* may be specified to control the buffer size when moving
  entry data (default is 1 MiB).
@@ -6,6 +6,7 @@ import sys
  import unittest
  import unittest.mock as mock
  import warnings
+ from contextlib import nullcontext
 
  import zipremove as zipfile
 
@@ -610,6 +611,32 @@ class AbstractRepackTests(RepackHelperMixin):
  with zipfile.ZipFile(TESTFN) as zh:
  self.assertIsNone(zh.testzip())
 
+ def test_repack_propagation(self):
+ """Should call internal API with adequate parameters."""
+ self._prepare_zip_from_test_files(TESTFN, self.test_files)
+
+ with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
+ zi = zh.remove(zh.infolist()[0])
+ with mock.patch.object(zipfile._ZipRepacker, 'repack') as m_rp:
+ zh.repack()
+ m_rp.assert_called_once_with(zh, None)
+
+ with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
+ zi = zh.remove(zh.infolist()[0])
+ with mock.patch.object(zipfile._ZipRepacker, 'repack') as m_rp:
+ zh.repack([zi])
+ m_rp.assert_called_once_with(zh, [zi])
+
+ with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
+ with mock.patch.object(zipfile, '_ZipRepacker') as m_rp:
+ zh.repack()
+ m_rp.assert_called_once_with()
+
+ with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
+ with mock.patch.object(zipfile, '_ZipRepacker') as m_rp:
+ zh.repack(strict_descriptor=True, chunk_size=1024)
+ m_rp.assert_called_once_with(strict_descriptor=True, chunk_size=1024)
+
  def test_repack_bytes_before_first_file(self):
  """Should preserve random bytes before the first recorded local file entry."""
  for ii in ([], [0], [0, 1], [0, 1, 2]):
@@ -847,222 +874,6 @@ class AbstractRepackTests(RepackHelperMixin):
847
874
  with zipfile.ZipFile(TESTFN) as zh:
848
875
  self.assertIsNone(zh.testzip())
849
876
 
850
- @requires_zip64fix()
851
- def test_repack_zip64(self):
852
- """Should correctly handle file entries with zip64."""
853
- for ii in ([0], [0, 1], [1], [2]):
854
- with self.subTest(remove=ii):
855
- # calculate the expected results
856
- test_files = [data for j, data in enumerate(self.test_files) if j not in ii]
857
- expected_zinfos = self._prepare_zip_from_test_files(TESTFN, test_files, force_zip64=True)
858
- expected_size = os.path.getsize(TESTFN)
859
-
860
- # do the removal and check the result
861
- self._prepare_zip_from_test_files(TESTFN, self.test_files, force_zip64=True)
862
- with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
863
- for i in ii:
864
- zh.remove(self.test_files[i][0])
865
- zh.repack()
866
-
867
- # check infolist
868
- self.assertEqual(
869
- [ComparableZipInfo(zi) for zi in zh.infolist()],
870
- expected_zinfos,
871
- )
872
-
873
- # check file size
874
- self.assertEqual(os.path.getsize(TESTFN), expected_size)
875
-
876
- # make sure the zip file is still valid
877
- with zipfile.ZipFile(TESTFN) as zh:
878
- self.assertIsNone(zh.testzip())
879
-
880
- def test_repack_data_descriptor(self):
881
- """Should correctly handle file entries using data descriptor."""
882
- for ii in ([0], [0, 1], [1], [2]):
883
- with self.subTest(remove=ii):
884
- # calculate the expected results
885
- test_files = [data for j, data in enumerate(self.test_files) if j not in ii]
886
- with open(TESTFN, 'wb') as fh:
887
- expected_zinfos = self._prepare_zip_from_test_files(Unseekable(fh), test_files)
888
- expected_size = os.path.getsize(TESTFN)
889
-
890
- # do the removal and check the result
891
- with open(TESTFN, 'wb') as fh:
892
- self._prepare_zip_from_test_files(Unseekable(fh), self.test_files)
893
- with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
894
- # make sure data descriptor bit is really set (by making zipfile unseekable)
895
- for zi in zh.infolist():
896
- self.assertTrue(zi.flag_bits & 8, f'data descriptor not used: {zi.filename}')
897
-
898
- for i in ii:
899
- zh.remove(self.test_files[i][0])
900
- zh.repack()
901
-
902
- # check infolist
903
- self.assertEqual(
904
- [ComparableZipInfo(zi) for zi in zh.infolist()],
905
- expected_zinfos,
906
- )
907
-
908
- # check file size
909
- self.assertEqual(os.path.getsize(TESTFN), expected_size)
910
-
911
- # make sure the zip file is still valid
912
- with zipfile.ZipFile(TESTFN) as zh:
913
- self.assertIsNone(zh.testzip())
914
-
915
- @requires_zip64fix()
916
- def test_repack_data_descriptor_and_zip64(self):
917
- """Should correctly handle file entries using data descriptor and zip64."""
918
- for ii in ([0], [0, 1], [1], [2]):
919
- with self.subTest(remove=ii):
920
- # calculate the expected results
921
- test_files = [data for j, data in enumerate(self.test_files) if j not in ii]
922
- with open(TESTFN, 'wb') as fh:
923
- expected_zinfos = self._prepare_zip_from_test_files(Unseekable(fh), test_files, force_zip64=True)
924
- expected_size = os.path.getsize(TESTFN)
925
-
926
- # do the removal and check the result
927
- with open(TESTFN, 'wb') as fh:
928
- self._prepare_zip_from_test_files(Unseekable(fh), self.test_files, force_zip64=True)
929
- with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
930
- # make sure data descriptor bit is really set (by making zipfile unseekable)
931
- for zi in zh.infolist():
932
- self.assertTrue(zi.flag_bits & 8, f'data descriptor not used: {zi.filename}')
933
-
934
- for i in ii:
935
- zh.remove(self.test_files[i][0])
936
- zh.repack()
937
-
938
- # check infolist
939
- self.assertEqual(
940
- [ComparableZipInfo(zi) for zi in zh.infolist()],
941
- expected_zinfos,
942
- )
943
-
944
- # check file size
945
- self.assertEqual(os.path.getsize(TESTFN), expected_size)
946
-
947
- # make sure the zip file is still valid
948
- with zipfile.ZipFile(TESTFN) as zh:
949
- self.assertIsNone(zh.testzip())
950
-
951
- def test_repack_data_descriptor_no_sig(self):
952
- """Should correctly handle file entries using data descriptor without signature."""
953
- for ii in ([0], [0, 1], [1], [2]):
954
- with self.subTest(remove=ii):
955
- # calculate the expected results
956
- test_files = [data for j, data in enumerate(self.test_files) if j not in ii]
957
- with open(TESTFN, 'wb') as fh:
958
- with mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig):
959
- expected_zinfos = self._prepare_zip_from_test_files(Unseekable(fh), test_files)
960
- expected_size = os.path.getsize(TESTFN)
961
-
962
- # do the removal and check the result
963
- with open(TESTFN, 'wb') as fh:
964
- with mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig):
965
- self._prepare_zip_from_test_files(Unseekable(fh), self.test_files)
966
- with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
967
- # make sure data descriptor bit is really set (by making zipfile unseekable)
968
- for zi in zh.infolist():
969
- self.assertTrue(zi.flag_bits & 8, f'data descriptor flag not set: {zi.filename}')
970
-
971
- for i in ii:
972
- zh.remove(self.test_files[i][0])
973
- zh.repack()
974
-
975
- # check infolist
976
- self.assertEqual(
977
- [ComparableZipInfo(zi) for zi in zh.infolist()],
978
- expected_zinfos,
979
- )
980
-
981
- # check file size
982
- self.assertEqual(os.path.getsize(TESTFN), expected_size)
983
-
984
- # make sure the zip file is still valid
985
- with zipfile.ZipFile(TESTFN) as zh:
986
- self.assertIsNone(zh.testzip())
987
-
988
- def test_repack_data_descriptor_no_sig_strict(self):
989
- """Should skip data descriptor without signature when `strict_descriptor` is set."""
990
- for ii in ([0], [0, 1], [1], [2]):
991
- with self.subTest(remove=ii):
992
- # calculate the expected results
993
- with open(TESTFN, 'wb') as fh:
994
- with mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig):
995
- self._prepare_zip_from_test_files(Unseekable(fh), self.test_files)
996
- with zipfile.ZipFile(TESTFN, 'a') as zh:
997
- for i in ii:
998
- zh.remove(self.test_files[i][0])
999
- expected_zinfos = [ComparableZipInfo(zi) for zi in zh.infolist()]
1000
- expected_size = os.path.getsize(TESTFN)
1001
-
1002
- # do the removal and check the result
1003
- with open(TESTFN, 'wb') as fh:
1004
- with mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig):
1005
- self._prepare_zip_from_test_files(Unseekable(fh), self.test_files)
1006
- with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
1007
- # make sure data descriptor bit is really set (by making zipfile unseekable)
1008
- for zi in zh.infolist():
1009
- self.assertTrue(zi.flag_bits & 8, f'data descriptor flag not set: {zi.filename}')
1010
-
1011
- for i in ii:
1012
- zh.remove(self.test_files[i][0])
1013
- zh.repack(strict_descriptor=True)
1014
-
1015
- # check infolist
1016
- self.assertEqual(
1017
- [ComparableZipInfo(zi) for zi in zh.infolist()],
1018
- expected_zinfos,
1019
- )
1020
-
1021
- # check file size
1022
- self.assertEqual(os.path.getsize(TESTFN), expected_size)
1023
-
1024
- # make sure the zip file is still valid
1025
- with zipfile.ZipFile(TESTFN) as zh:
1026
- self.assertIsNone(zh.testzip())
1027
-
1028
- @requires_zip64fix()
1029
- def test_repack_data_descriptor_no_sig_and_zip64(self):
1030
- """Should correctly handle file entries using data descriptor without signature and zip64."""
1031
- for ii in ([0], [0, 1], [1], [2]):
1032
- with self.subTest(remove=ii):
1033
- # calculate the expected results
1034
- test_files = [data for j, data in enumerate(self.test_files) if j not in ii]
1035
- with open(TESTFN, 'wb') as fh:
1036
- with mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig):
1037
- expected_zinfos = self._prepare_zip_from_test_files(Unseekable(fh), test_files, force_zip64=True)
1038
- expected_size = os.path.getsize(TESTFN)
1039
-
1040
- # do the removal and check the result
1041
- with open(TESTFN, 'wb') as fh:
1042
- with mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig):
1043
- self._prepare_zip_from_test_files(Unseekable(fh), self.test_files, force_zip64=True)
1044
- with zipfile.ZipFile(TESTFN, 'a', self.compression) as zh:
1045
- # make sure data descriptor bit is really set (by making zipfile unseekable)
1046
- for zi in zh.infolist():
1047
- self.assertTrue(zi.flag_bits & 8, f'data descriptor flag not set: {zi.filename}')
1048
-
1049
- for i in ii:
1050
- zh.remove(self.test_files[i][0])
1051
- zh.repack()
1052
-
1053
- # check infolist
1054
- self.assertEqual(
1055
- [ComparableZipInfo(zi) for zi in zh.infolist()],
1056
- expected_zinfos,
1057
- )
1058
-
1059
- # check file size
1060
- self.assertEqual(os.path.getsize(TESTFN), expected_size)
1061
-
1062
- # make sure the zip file is still valid
1063
- with zipfile.ZipFile(TESTFN) as zh:
1064
- self.assertIsNone(zh.testzip())
1065
-
1066
877
  def test_repack_prepended_bytes(self):
1067
878
  for ii in ([], [0], [0, 1], [1], [2]):
1068
879
  with self.subTest(remove=ii):
@@ -1575,6 +1386,320 @@ class OtherRepackTests(unittest.TestCase):
1575
1386
  self.assertEqual(fz.read(), expected)
1576
1387
 
1577
1388
  class ZipRepackerTests(unittest.TestCase):
1389
+ def _generate_local_file_entry(self, arcname, raw_bytes,
1390
+ compression=zipfile.ZIP_STORED,
1391
+ force_zip64=False, dd=False, dd_sig=True):
1392
+ fz = io.BytesIO()
1393
+ f = Unseekable(fz) if dd else fz
1394
+ cm = (mock.patch.object(struct, 'pack', side_effect=struct_pack_no_dd_sig)
1395
+ if not dd_sig else nullcontext())
1396
+ with zipfile.ZipFile(f, 'w', compression=compression) as zh:
1397
+ with cm:
1398
+ with zh.open(arcname, 'w', force_zip64=force_zip64) as fh:
1399
+ fh.write(raw_bytes)
1400
+ fz.seek(0)
1401
+ return fz.read()
1402
+
1403
+ def test_validate_local_file_entry_stored(self):
1404
+ self._test_validate_local_file_entry(method=zipfile.ZIP_STORED)
1405
+
1406
+ @requires_zlib()
1407
+ def test_validate_local_file_entry_zlib(self):
1408
+ self._test_validate_local_file_entry(method=zipfile.ZIP_DEFLATED)
1409
+
1410
+ @requires_bz2()
1411
+ def test_validate_local_file_entry_bz2(self):
1412
+ self._test_validate_local_file_entry(method=zipfile.ZIP_BZIP2)
1413
+
1414
+ @requires_lzma()
1415
+ def test_validate_local_file_entry_lzma(self):
1416
+ self._test_validate_local_file_entry(method=zipfile.ZIP_LZMA)
1417
+
1418
+ @requires_zstd()
1419
+ def test_validate_local_file_entry_zstd(self):
1420
+ self._test_validate_local_file_entry(method=zipfile.ZIP_ZSTANDARD)
1421
+
1422
+ def _test_validate_local_file_entry(self, method):
1423
+ repacker = zipfile._ZipRepacker()
1424
+
1425
+ # basic
1426
+ bytes_ = self._generate_local_file_entry(
1427
+ 'file.txt', b'dummy', compression=method)
1428
+ fz = io.BytesIO(bytes_)
1429
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1430
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1431
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1432
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1433
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1434
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1435
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1436
+ self.assertEqual(result, len(bytes_))
1437
+ m_sdd.assert_not_called()
1438
+ m_sddnsbd.assert_not_called()
1439
+ m_sddns.assert_not_called()
1440
+
1441
+ # offset
1442
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1443
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1444
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1445
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1446
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1447
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1448
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_) + 1)
1449
+ self.assertEqual(result, len(bytes_))
1450
+ m_sdd.assert_not_called()
1451
+ m_sddnsbd.assert_not_called()
1452
+ m_sddns.assert_not_called()
1453
+
1454
+ bytes_ = b'pre' + bytes_ + b'post'
1455
+ fz = io.BytesIO(bytes_)
1456
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1457
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1458
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1459
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1460
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1461
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1462
+ result = repacker._validate_local_file_entry(fz, 3, len(bytes_) - 4)
1463
+ self.assertEqual(result, len(bytes_) - 7)
1464
+ m_sdd.assert_not_called()
1465
+ m_sddnsbd.assert_not_called()
1466
+ m_sddns.assert_not_called()
1467
+
1468
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1469
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1470
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1471
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1472
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1473
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1474
+ result = repacker._validate_local_file_entry(fz, 3, len(bytes_))
1475
+ self.assertEqual(result, len(bytes_) - 7)
1476
+ m_sdd.assert_not_called()
1477
+ m_sddnsbd.assert_not_called()
1478
+ m_sddns.assert_not_called()
1479
+
1480
+ # return None if no match at given offset
1481
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1482
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1483
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1484
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1485
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1486
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1487
+ result = repacker._validate_local_file_entry(fz, 2, len(bytes_) - 4)
1488
+ self.assertEqual(result, None)
1489
+ m_sdd.assert_not_called()
1490
+ m_sddnsbd.assert_not_called()
1491
+ m_sddns.assert_not_called()
1492
+
1493
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1494
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1495
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1496
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1497
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1498
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1499
+ result = repacker._validate_local_file_entry(fz, 4, len(bytes_) - 4)
1500
+ self.assertEqual(result, None)
1501
+ m_sdd.assert_not_called()
1502
+ m_sddnsbd.assert_not_called()
1503
+ m_sddns.assert_not_called()
1504
+
1505
+ # return None if no sufficient header length
1506
+ bytes_ = self._generate_local_file_entry(
1507
+ 'file.txt', b'dummy', compression=method)
1508
+ bytes_ = bytes_[:29]
1509
+ fz = io.BytesIO(bytes_)
1510
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1511
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1512
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1513
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1514
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1515
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1516
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1517
+ self.assertEqual(result, None)
1518
+ m_sdd.assert_not_called()
1519
+ m_sddnsbd.assert_not_called()
1520
+ m_sddns.assert_not_called()
1521
+
1522
+ # data descriptor
1523
+ bytes_ = self._generate_local_file_entry(
1524
+ 'file.txt', b'dummy', compression=method, dd=True)
1525
+ fz = io.BytesIO(bytes_)
1526
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1527
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1528
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1529
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1530
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1531
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1532
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1533
+ self.assertEqual(result, len(bytes_))
1534
+ m_sdd.assert_called_once_with(fz, 38, len(bytes_), False)
1535
+ m_sddnsbd.assert_not_called()
1536
+ m_sddns.assert_not_called()
1537
+
1538
+ # data descriptor (unsigned)
1539
+ bytes_ = self._generate_local_file_entry(
1540
+ 'file.txt', b'dummy', compression=method, dd=True, dd_sig=False)
1541
+ fz = io.BytesIO(bytes_)
1542
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1543
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1544
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1545
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1546
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1547
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1548
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1549
+ self.assertEqual(result, len(bytes_))
1550
+ m_sdd.assert_called_once_with(fz, 38, len(bytes_), False)
1551
+ m_sddnsbd.assert_called_once_with(fz, 38, len(bytes_), False, method)
1552
+ if repacker._scan_data_descriptor_no_sig_by_decompression(fz, 38, len(bytes_), False, method):
1553
+ m_sddns.assert_not_called()
1554
+ else:
1555
+ m_sddns.assert_called_once_with(fz, 38, len(bytes_), False)
1556
+
1557
+ # return None for data descriptor (unsigned) if `strict_descriptor=True`
1558
+ repacker = zipfile._ZipRepacker(strict_descriptor=True)
1559
+ bytes_ = self._generate_local_file_entry(
1560
+ 'file.txt', b'dummy', compression=method, dd=True, dd_sig=False)
1561
+ fz = io.BytesIO(bytes_)
1562
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1563
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1564
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1565
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1566
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1567
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1568
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1569
+ self.assertEqual(result, None)
1570
+ m_sdd.assert_called_once_with(fz, 38, len(bytes_), False)
1571
+ m_sddnsbd.assert_not_called()
1572
+ m_sddns.assert_not_called()
1573
+
1574
+ @requires_zip64fix()
1575
+ def test_validate_local_file_entry_zip64_stored(self):
1576
+ self._test_validate_local_file_entry_zip64(method=zipfile.ZIP_STORED)
1577
+
1578
+ @requires_zip64fix()
1579
+ @requires_zlib()
1580
+ def test_validate_local_file_entry_zip64_zlib(self):
1581
+ self._test_validate_local_file_entry_zip64(method=zipfile.ZIP_DEFLATED)
1582
+
1583
+ @requires_zip64fix()
1584
+ @requires_bz2()
1585
+ def test_validate_local_file_entry_zip64_bz2(self):
1586
+ self._test_validate_local_file_entry_zip64(method=zipfile.ZIP_BZIP2)
1587
+
1588
+ @requires_zip64fix()
1589
+ @requires_lzma()
1590
+ def test_validate_local_file_entry_zip64_lzma(self):
1591
+ self._test_validate_local_file_entry_zip64(method=zipfile.ZIP_LZMA)
1592
+
1593
+ @requires_zip64fix()
1594
+ @requires_zstd()
1595
+ def test_validate_local_file_entry_zip64_zstd(self):
1596
+ self._test_validate_local_file_entry_zip64(method=zipfile.ZIP_ZSTANDARD)
1597
+
1598
+ def _test_validate_local_file_entry_zip64(self, method):
1599
+ repacker = zipfile._ZipRepacker()
1600
+
1601
+ # zip64
1602
+ bytes_ = self._generate_local_file_entry(
1603
+ 'file.txt', b'dummy', compression=method, force_zip64=True)
1604
+ fz = io.BytesIO(bytes_)
1605
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1606
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1607
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1608
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1609
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1610
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1611
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1612
+ self.assertEqual(result, len(bytes_))
1613
+ m_sdd.assert_not_called()
1614
+ m_sddnsbd.assert_not_called()
1615
+ m_sddns.assert_not_called()
1616
+
1617
+ # data descriptor + zip64
1618
+ bytes_ = self._generate_local_file_entry(
1619
+ 'file.txt', b'dummy', compression=method, force_zip64=True, dd=True)
1620
+ fz = io.BytesIO(bytes_)
1621
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1622
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1623
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1624
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1625
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1626
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1627
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1628
+ self.assertEqual(result, len(bytes_))
1629
+ m_sdd.assert_called_once_with(fz, 58, len(bytes_), True)
1630
+ m_sddnsbd.assert_not_called()
1631
+ m_sddns.assert_not_called()
1632
+
1633
+ # data descriptor (unsigned) + zip64
1634
+ bytes_ = self._generate_local_file_entry(
1635
+ 'file.txt', b'dummy', compression=method, force_zip64=True, dd=True, dd_sig=False)
1636
+ fz = io.BytesIO(bytes_)
1637
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1638
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1639
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1640
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1641
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1642
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1643
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1644
+ self.assertEqual(result, len(bytes_))
1645
+ m_sdd.assert_called_once_with(fz, 58, len(bytes_), True)
1646
+ m_sddnsbd.assert_called_once_with(fz, 58, len(bytes_), True, method)
1647
+ if repacker._scan_data_descriptor_no_sig_by_decompression(fz, 58, len(bytes_), True, method):
1648
+ m_sddns.assert_not_called()
1649
+ else:
1650
+ m_sddns.assert_called_once_with(fz, 58, len(bytes_), True)
1651
+
1652
+ # return None for data descriptor (unsigned) if `strict_descriptor=True`
1653
+ repacker = zipfile._ZipRepacker(strict_descriptor=True)
1654
+ bytes_ = self._generate_local_file_entry(
1655
+ 'file.txt', b'dummy', compression=method, force_zip64=True, dd=True, dd_sig=False)
1656
+ fz = io.BytesIO(bytes_)
1657
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1658
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1659
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1660
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1661
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1662
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1663
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1664
+ self.assertEqual(result, None)
1665
+ m_sdd.assert_called_once_with(fz, 58, len(bytes_), True)
1666
+ m_sddnsbd.assert_not_called()
1667
+ m_sddns.assert_not_called()
1668
+
1669
+ def test_validate_local_file_entry_encrypted(self):
1670
+ repacker = zipfile._ZipRepacker()
1671
+
1672
+ bytes_ = (
1673
+ b'PK\x03\x04'
1674
+ b'\x14\x00'
1675
+ b'\x09\x00'
1676
+ b'\x08\x00'
1677
+ b'\xAB\x28'
1678
+ b'\xD2\x5A'
1679
+ b'\x00\x00\x00\x00'
1680
+ b'\x00\x00\x00\x00'
1681
+ b'\x00\x00\x00\x00'
1682
+ b'\x08\x00'
1683
+ b'\x00\x00'
1684
+ b'file.txt'
1685
+ b'\x97\xF1\x83\x34\x9D\xC4\x8C\xD3\xED\x79\x8C\xA2\xBB\x49\xFF\x1B\x89'
1686
+ b'\x3F\xF2\xF4\x4F'
1687
+ b'\x11\x00\x00\x00'
1688
+ b'\x05\x00\x00\x00'
1689
+ )
1690
+ fz = io.BytesIO(bytes_)
1691
+ with mock.patch.object(repacker, '_scan_data_descriptor',
1692
+ wraps=repacker._scan_data_descriptor) as m_sdd, \
1693
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig_by_decompression',
1694
+ wraps=repacker._scan_data_descriptor_no_sig_by_decompression) as m_sddnsbd, \
1695
+ mock.patch.object(repacker, '_scan_data_descriptor_no_sig',
1696
+ wraps=repacker._scan_data_descriptor_no_sig) as m_sddns:
1697
+ result = repacker._validate_local_file_entry(fz, 0, len(bytes_))
1698
+ self.assertEqual(result, len(bytes_))
1699
+ m_sdd.assert_called_once_with(fz, 38, len(bytes_), False)
1700
+ m_sddnsbd.assert_not_called()
1701
+ m_sddns.assert_called_once_with(fz, 38, len(bytes_), False)
1702
+
1578
1703
  def test_iter_scan_signature(self):
1579
1704
  bytes_ = b'sig__sig__sig__sig'
1580
1705
  ln = len(bytes_)
@@ -1623,136 +1748,177 @@ class ZipRepackerTests(unittest.TestCase):
1623
1748
 
1624
1749
  def test_scan_data_descriptor(self):
1625
1750
  repacker = zipfile._ZipRepacker()
1626
- SIG = zipfile._DD_SIGNATURE
1751
+
1752
+ sig = zipfile._DD_SIGNATURE
1753
+ raw_bytes = comp_bytes = b'dummy'
1754
+ raw_len = comp_len = len(raw_bytes)
1755
+ raw_crc = zipfile.crc32(raw_bytes)
1627
1756
 
1628
1757
  # basic
1629
- bytes_ = b'dummy' + struct.pack('<4L', SIG, 0x4ff4f23f, 5, 5)
1758
+ bytes_ = comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len, raw_len)
1630
1759
  self.assertEqual(
1631
1760
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1632
- (0x4ff4f23f, 5, 5, 16),
1761
+ (raw_crc, comp_len, raw_len, 16),
1633
1762
  )
1634
1763
 
1635
1764
  # return None if no signature
1636
- bytes_ = b'dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1765
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1637
1766
  self.assertEqual(
1638
1767
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1639
1768
  None,
1640
1769
  )
1641
1770
 
1642
- # return None if not unpackable
1643
- bytes_ = struct.pack('<L', SIG)
1771
+ # return None if compressed size does not match
1772
+ bytes_ = comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len + 1, raw_len)
1644
1773
  self.assertEqual(
1645
1774
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1646
1775
  None,
1647
1776
  )
1648
1777
 
1649
- # return None if compressed size not match
1650
- bytes_ = b'dumm' + struct.pack('<4L', SIG, 0x4ff4f23f, 5, 5)
1778
+ bytes_ = comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len - 1, raw_len)
1779
+ self.assertEqual(
1780
+ repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1781
+ None,
1782
+ )
1783
+
1784
+ bytes_ = b'1' + comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len, raw_len)
1785
+ self.assertEqual(
1786
+ repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1787
+ None,
1788
+ )
1789
+
1790
+ bytes_ = comp_bytes[1:] + struct.pack('<4L', sig, raw_crc, comp_len, raw_len)
1651
1791
  self.assertEqual(
1652
1792
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1653
1793
  None,
1654
1794
  )
1655
1795
 
1656
1796
  # zip64
1657
- bytes_ = b'dummy' + struct.pack('<2L2Q', SIG, 0x4ff4f23f, 5, 5)
1797
+ bytes_ = comp_bytes + struct.pack('<2L2Q', sig, raw_crc, comp_len, raw_len)
1658
1798
  self.assertEqual(
1659
1799
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), True),
1660
- (0x4ff4f23f, 5, 5, 24),
1800
+ (raw_crc, comp_len, raw_len, 24),
1661
1801
  )
1662
1802
 
1663
1803
  # offset
1664
- bytes_ = b'dummy' + struct.pack('<4L', SIG, 0x4ff4f23f, 5, 5)
1804
+ bytes_ = comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len, raw_len)
1665
1805
  self.assertEqual(
1666
1806
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 1, len(bytes_), False),
1667
1807
  None,
1668
1808
  )
1669
1809
 
1670
- bytes_ = b'123dummy' + struct.pack('<4L', SIG, 0x4ff4f23f, 5, 5)
1810
+ bytes_ = b'123' + comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len, raw_len)
1671
1811
  self.assertEqual(
1672
1812
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1673
1813
  None,
1674
1814
  )
1675
1815
  self.assertEqual(
1676
1816
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 3, len(bytes_), False),
1677
- (0x4ff4f23f, 5, 5, 16),
1817
+ (raw_crc, comp_len, raw_len, 16),
1678
1818
  )
1679
1819
 
1680
1820
  # end_offset
1681
- bytes_ = b'dummy' + struct.pack('<4L', SIG, 0x4ff4f23f, 5, 5)
1821
+ bytes_ = comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len, raw_len)
1682
1822
  self.assertEqual(
1683
1823
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_) - 1, False),
1684
1824
  None,
1685
1825
  )
1686
1826
 
1687
- bytes_ = b'dummy' + struct.pack('<4L', SIG, 0x4ff4f23f, 5, 5) + b'123'
1827
+ bytes_ = comp_bytes + struct.pack('<4L', sig, raw_crc, comp_len, raw_len) + b'123'
1688
1828
  self.assertEqual(
1689
1829
  repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_) - 3, False),
1690
- (0x4ff4f23f, 5, 5, 16),
1830
+ (raw_crc, comp_len, raw_len, 16),
1831
+ )
1832
+ self.assertEqual(
1833
+ repacker._scan_data_descriptor(io.BytesIO(bytes_), 0, len(bytes_), False),
1834
+ (raw_crc, comp_len, raw_len, 16),
1691
1835
  )
1692
1836
 
1693
1837
  def test_scan_data_descriptor_no_sig(self):
1694
1838
  repacker = zipfile._ZipRepacker()
1695
1839
 
1840
+ raw_bytes = comp_bytes = b'dummy'
1841
+ raw_len = comp_len = len(raw_bytes)
1842
+ raw_crc = zipfile.crc32(raw_bytes)
1843
+
1696
1844
  # basic
1697
- bytes_ = b'dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1845
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1698
1846
  self.assertEqual(
1699
1847
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False),
1700
- (0x4ff4f23f, 5, 5, 12),
1848
+ (raw_crc, comp_len, raw_len, 12),
1701
1849
  )
1702
1850
 
1703
1851
  # return None if compressed size does not match
1704
- bytes_ = b'dumm' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1852
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len + 1, raw_len)
1705
1853
  self.assertEqual(
1706
1854
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False),
1707
1855
  None,
1708
1856
  )
1709
1857
 
1710
- # zip64
1711
- bytes_ = b'dummy' + struct.pack('<L2Q', 0x4ff4f23f, 5, 5)
1858
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len - 1, raw_len)
1712
1859
  self.assertEqual(
1713
- repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), True),
1714
- (0x4ff4f23f, 5, 5, 20),
1860
+ repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False),
1861
+ None,
1715
1862
  )
1716
1863
 
1717
- # offset
1718
- bytes_ = b'dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1864
+ bytes_ = b'1' + comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1719
1865
  self.assertEqual(
1720
- repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 1, len(bytes_), False),
1866
+ repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False),
1721
1867
  None,
1722
1868
  )
1723
1869
 
1724
- bytes_ = b'123dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1870
+ bytes_ = comp_bytes[1:] + struct.pack('<3L', raw_crc, comp_len, raw_len)
1725
1871
  self.assertEqual(
1726
1872
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False),
1727
1873
  None,
1728
1874
  )
1875
+
1876
+ # zip64
1877
+ bytes_ = comp_bytes + struct.pack('<L2Q', raw_crc, comp_len, raw_len)
1878
+ self.assertEqual(
1879
+ repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), True),
1880
+ (raw_crc, comp_len, raw_len, 20),
1881
+ )
1882
+
1883
+ # offset
1884
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1885
+ self.assertEqual(
1886
+ repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 1, len(bytes_), False),
1887
+ None,
1888
+ )
1889
+
1890
+ bytes_ = b'123' + comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1729
1891
  self.assertEqual(
1730
1892
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 3, len(bytes_), False),
1731
- (0x4ff4f23f, 5, 5, 12),
1893
+ (raw_crc, comp_len, raw_len, 12),
1732
1894
  )
1733
1895
 
1734
1896
  # end_offset
1735
- bytes_ = b'dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1897
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1736
1898
  self.assertEqual(
1737
1899
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_) - 1, False),
1738
1900
  None,
1739
1901
  )
1740
1902
 
1741
- bytes_ = b'dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5) + b'123'
1903
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len) + b'123'
1742
1904
  self.assertEqual(
1743
1905
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_) - 3, False),
1744
- (0x4ff4f23f, 5, 5, 12),
1906
+ (raw_crc, comp_len, raw_len, 12),
1907
+ )
1908
+ self.assertEqual(
1909
+ repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False),
1910
+ (raw_crc, comp_len, raw_len, 12),
1745
1911
  )
1746
1912
 
1747
1913
  # chunk_size
1748
- bytes_ = b'dummy' + struct.pack('<3L', 0x4ff4f23f, 5, 5)
1914
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1749
1915
  self.assertEqual(
1750
1916
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False, 12),
1751
- (0x4ff4f23f, 5, 5, 12),
1917
+ (raw_crc, comp_len, raw_len, 12),
1752
1918
  )
1753
1919
  self.assertEqual(
1754
1920
  repacker._scan_data_descriptor_no_sig(io.BytesIO(bytes_), 0, len(bytes_), False, 1),
1755
- (0x4ff4f23f, 5, 5, 12),
1921
+ (raw_crc, comp_len, raw_len, 12),
1756
1922
  )
1757
1923
 
1758
1924
  def test_scan_data_descriptor_no_sig_by_decompression_stored(self):
@@ -1781,44 +1947,40 @@ class ZipRepackerTests(unittest.TestCase):
1781
1947
  def _test_scan_data_descriptor_no_sig_by_decompression(self, method):
1782
1948
  repacker = zipfile._ZipRepacker()
1783
1949
 
1784
- compressor = zipfile._get_compressor(method)
1785
-
1786
1950
  raw_bytes = b'dummy'
1787
1951
  raw_len = len(raw_bytes)
1952
+ raw_crc = zipfile.crc32(raw_bytes)
1953
+
1954
+ compressor = zipfile._get_compressor(method)
1788
1955
  comp_bytes = compressor.compress(raw_bytes)
1789
1956
  comp_bytes += compressor.flush()
1790
1957
  comp_len = len(comp_bytes)
1791
1958
 
1792
1959
  # basic
1793
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len)
1960
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1794
1961
  self.assertEqual(
1795
1962
  repacker._scan_data_descriptor_no_sig_by_decompression(
1796
1963
  io.BytesIO(bytes_), 0, len(bytes_), False, method),
1797
- (0x4ff4f23f, comp_len, raw_len, 12),
1964
+ (raw_crc, comp_len, raw_len, 12),
1798
1965
  )
1799
1966
 
1800
- # return None if insufficient data length
1801
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len)
1967
+ # return None if data length < data descriptor size
1968
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1802
1969
  self.assertEqual(
1803
1970
  repacker._scan_data_descriptor_no_sig_by_decompression(
1804
- io.BytesIO(bytes_), 0, len(bytes_) - 1, False, method),
1971
+ io.BytesIO(bytes_), 0, 11, False, method),
1805
1972
  None,
1806
1973
  )
1807
1974
 
1808
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len)[:-1]
1975
+ # return None if compressed size does not match
1976
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len + 1, raw_len)
1809
1977
  self.assertEqual(
1810
1978
  repacker._scan_data_descriptor_no_sig_by_decompression(
1811
1979
  io.BytesIO(bytes_), 0, len(bytes_), False, method),
1812
1980
  None,
1813
1981
  )
1814
- self.assertEqual(
1815
- repacker._scan_data_descriptor_no_sig_by_decompression(
1816
- io.BytesIO(bytes_), 0, len(bytes_) + 1, False, method),
1817
- None,
1818
- )
1819
1982
 
1820
- # return None if compressed size not match
1821
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len - 1, raw_len)
1983
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len - 1, raw_len)
1822
1984
  self.assertEqual(
1823
1985
  repacker._scan_data_descriptor_no_sig_by_decompression(
1824
1986
  io.BytesIO(bytes_), 0, len(bytes_), False, method),
@@ -1826,41 +1988,46 @@ class ZipRepackerTests(unittest.TestCase):
1826
1988
  )
1827
1989
 
1828
1990
  # zip64
1829
- bytes_ = comp_bytes + struct.pack('<L2Q', 0x4ff4f23f, comp_len, raw_len)
1991
+ bytes_ = comp_bytes + struct.pack('<L2Q', raw_crc, comp_len, raw_len)
1830
1992
  self.assertEqual(
1831
1993
  repacker._scan_data_descriptor_no_sig_by_decompression(
1832
1994
  io.BytesIO(bytes_), 0, len(bytes_), True, method),
1833
- (0x4ff4f23f, comp_len, raw_len, 20),
1995
+ (raw_crc, comp_len, raw_len, 20),
1834
1996
  )
1835
1997
 
1836
1998
  # offset
1837
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len)
1999
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1838
2000
  self.assertEqual(
1839
2001
  repacker._scan_data_descriptor_no_sig_by_decompression(
1840
2002
  io.BytesIO(bytes_), 1, len(bytes_), False, method),
1841
2003
  None,
1842
2004
  )
1843
2005
 
1844
- bytes_ = b'123' + comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len)
2006
+ bytes_ = b'123' + comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1845
2007
  self.assertEqual(
1846
2008
  repacker._scan_data_descriptor_no_sig_by_decompression(
1847
2009
  io.BytesIO(bytes_), 3, len(bytes_), False, method),
1848
- (0x4ff4f23f, comp_len, raw_len, 12),
2010
+ (raw_crc, comp_len, raw_len, 12),
1849
2011
  )
1850
2012
 
1851
2013
  # end_offset
1852
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len)
2014
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len)
1853
2015
  self.assertEqual(
1854
2016
  repacker._scan_data_descriptor_no_sig_by_decompression(
1855
- io.BytesIO(bytes_), 0, len(bytes_) - 2, False, method),
2017
+ io.BytesIO(bytes_), 0, len(bytes_) - 1, False, method),
1856
2018
  None,
1857
2019
  )
1858
2020
 
1859
- bytes_ = comp_bytes + struct.pack('<3L', 0x4ff4f23f, comp_len, raw_len) + b'123'
2021
+ bytes_ = comp_bytes + struct.pack('<3L', raw_crc, comp_len, raw_len) + b'123'
1860
2022
  self.assertEqual(
1861
2023
  repacker._scan_data_descriptor_no_sig_by_decompression(
1862
- io.BytesIO(bytes_), 0, len(bytes_) - 2, False, method),
1863
- (0x4ff4f23f, comp_len, raw_len, 12),
2024
+ io.BytesIO(bytes_), 0, len(bytes_) - 3, False, method),
2025
+ (raw_crc, comp_len, raw_len, 12),
2026
+ )
2027
+ self.assertEqual(
2028
+ repacker._scan_data_descriptor_no_sig_by_decompression(
2029
+ io.BytesIO(bytes_), 0, len(bytes_), False, method),
2030
+ (raw_crc, comp_len, raw_len, 12),
1864
2031
  )
1865
2032
 
1866
2033
  def _test_scan_data_descriptor_no_sig_by_decompression_invalid(self, method):
File without changes
File without changes
File without changes
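
The tests above expect descriptor lengths of 16, 12, 24 and 20 bytes, which follow from the `struct` formats they pack (`<4L`, `<3L`, `<2L2Q`, `<L2Q`). Below is a minimal sketch of that arithmetic. It is not part of the package: `build_data_descriptor` and `DD_SIGNATURE` are hypothetical names introduced for illustration, and the signature value is taken from the ZIP format specification rather than from `zipfile._DD_SIGNATURE`.

```python
import struct
import zlib

# Assumption: optional data descriptor signature from the ZIP specification
# (the bytes "PK\x07\x08" when packed little-endian).
DD_SIGNATURE = 0x08074B50


def build_data_descriptor(raw, comp, *, signed=True, zip64=False):
    """Hypothetical helper: pack a data descriptor for `raw` stored as `comp`.

    Layout: [signature (4 bytes, optional)] CRC-32 (4 bytes),
    compressed size, uncompressed size (4 bytes each, or 8 each for zip64).
    """
    size_fmt = 'Q' if zip64 else 'L'
    fmt = '<' + ('L' if signed else '') + 'L' + size_fmt * 2
    fields = (zlib.crc32(raw), len(comp), len(raw))
    if signed:
        fields = (DD_SIGNATURE,) + fields
    return struct.pack(fmt, *fields)


raw = comp = b'dummy'  # stored (uncompressed) payload, as in the tests
for signed in (True, False):
    for zip64 in (False, True):
        dd = build_data_descriptor(raw, comp, signed=signed, zip64=zip64)
        print(f'signed={signed} zip64={zip64} length={len(dd)}')
```

The four combinations yield the lengths that appear as the last tuple element in the assertions: 16 (signed), 24 (signed, zip64), 12 (unsigned) and 20 (unsigned, zip64).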