zstd-ruby 1.4.5.0 → 1.5.1.1

Files changed (101)
  1. checksums.yaml +4 -4
  2. data/.github/dependabot.yml +8 -0
  3. data/.github/workflows/ruby.yml +35 -0
  4. data/README.md +2 -2
  5. data/ext/zstdruby/extconf.rb +2 -1
  6. data/ext/zstdruby/libzstd/BUCK +5 -7
  7. data/ext/zstdruby/libzstd/Makefile +225 -222
  8. data/ext/zstdruby/libzstd/README.md +43 -5
  9. data/ext/zstdruby/libzstd/common/bitstream.h +46 -22
  10. data/ext/zstdruby/libzstd/common/compiler.h +182 -22
  11. data/ext/zstdruby/libzstd/common/cpu.h +1 -3
  12. data/ext/zstdruby/libzstd/common/debug.c +1 -1
  13. data/ext/zstdruby/libzstd/common/debug.h +12 -19
  14. data/ext/zstdruby/libzstd/common/entropy_common.c +196 -44
  15. data/ext/zstdruby/libzstd/common/error_private.c +2 -1
  16. data/ext/zstdruby/libzstd/common/error_private.h +82 -3
  17. data/ext/zstdruby/libzstd/common/fse.h +41 -12
  18. data/ext/zstdruby/libzstd/common/fse_decompress.c +139 -22
  19. data/ext/zstdruby/libzstd/common/huf.h +47 -23
  20. data/ext/zstdruby/libzstd/common/mem.h +87 -98
  21. data/ext/zstdruby/libzstd/common/pool.c +23 -17
  22. data/ext/zstdruby/libzstd/common/pool.h +2 -2
  23. data/ext/zstdruby/libzstd/common/portability_macros.h +131 -0
  24. data/ext/zstdruby/libzstd/common/threading.c +6 -5
  25. data/ext/zstdruby/libzstd/common/xxhash.c +6 -846
  26. data/ext/zstdruby/libzstd/common/xxhash.h +5568 -167
  27. data/ext/zstdruby/libzstd/common/zstd_common.c +10 -10
  28. data/ext/zstdruby/libzstd/common/zstd_deps.h +111 -0
  29. data/ext/zstdruby/libzstd/common/zstd_internal.h +189 -142
  30. data/ext/zstdruby/libzstd/common/zstd_trace.h +163 -0
  31. data/ext/zstdruby/libzstd/compress/clevels.h +134 -0
  32. data/ext/zstdruby/libzstd/compress/fse_compress.c +89 -46
  33. data/ext/zstdruby/libzstd/compress/hist.c +27 -29
  34. data/ext/zstdruby/libzstd/compress/hist.h +2 -2
  35. data/ext/zstdruby/libzstd/compress/huf_compress.c +770 -198
  36. data/ext/zstdruby/libzstd/compress/zstd_compress.c +2894 -863
  37. data/ext/zstdruby/libzstd/compress/zstd_compress_internal.h +390 -90
  38. data/ext/zstdruby/libzstd/compress/zstd_compress_literals.c +12 -11
  39. data/ext/zstdruby/libzstd/compress/zstd_compress_literals.h +4 -2
  40. data/ext/zstdruby/libzstd/compress/zstd_compress_sequences.c +31 -8
  41. data/ext/zstdruby/libzstd/compress/zstd_compress_sequences.h +1 -1
  42. data/ext/zstdruby/libzstd/compress/zstd_compress_superblock.c +25 -297
  43. data/ext/zstdruby/libzstd/compress/zstd_compress_superblock.h +1 -1
  44. data/ext/zstdruby/libzstd/compress/zstd_cwksp.h +206 -69
  45. data/ext/zstdruby/libzstd/compress/zstd_double_fast.c +307 -132
  46. data/ext/zstdruby/libzstd/compress/zstd_double_fast.h +1 -1
  47. data/ext/zstdruby/libzstd/compress/zstd_fast.c +322 -143
  48. data/ext/zstdruby/libzstd/compress/zstd_fast.h +1 -1
  49. data/ext/zstdruby/libzstd/compress/zstd_lazy.c +1136 -174
  50. data/ext/zstdruby/libzstd/compress/zstd_lazy.h +59 -1
  51. data/ext/zstdruby/libzstd/compress/zstd_ldm.c +316 -213
  52. data/ext/zstdruby/libzstd/compress/zstd_ldm.h +9 -2
  53. data/ext/zstdruby/libzstd/compress/zstd_ldm_geartab.h +106 -0
  54. data/ext/zstdruby/libzstd/compress/zstd_opt.c +373 -150
  55. data/ext/zstdruby/libzstd/compress/zstd_opt.h +1 -1
  56. data/ext/zstdruby/libzstd/compress/zstdmt_compress.c +152 -444
  57. data/ext/zstdruby/libzstd/compress/zstdmt_compress.h +31 -113
  58. data/ext/zstdruby/libzstd/decompress/huf_decompress.c +1044 -403
  59. data/ext/zstdruby/libzstd/decompress/huf_decompress_amd64.S +571 -0
  60. data/ext/zstdruby/libzstd/decompress/zstd_ddict.c +9 -9
  61. data/ext/zstdruby/libzstd/decompress/zstd_ddict.h +2 -2
  62. data/ext/zstdruby/libzstd/decompress/zstd_decompress.c +450 -105
  63. data/ext/zstdruby/libzstd/decompress/zstd_decompress_block.c +913 -273
  64. data/ext/zstdruby/libzstd/decompress/zstd_decompress_block.h +14 -5
  65. data/ext/zstdruby/libzstd/decompress/zstd_decompress_internal.h +59 -12
  66. data/ext/zstdruby/libzstd/deprecated/zbuff.h +1 -1
  67. data/ext/zstdruby/libzstd/deprecated/zbuff_common.c +1 -1
  68. data/ext/zstdruby/libzstd/deprecated/zbuff_compress.c +24 -4
  69. data/ext/zstdruby/libzstd/deprecated/zbuff_decompress.c +1 -1
  70. data/ext/zstdruby/libzstd/dictBuilder/cover.c +55 -38
  71. data/ext/zstdruby/libzstd/dictBuilder/cover.h +7 -6
  72. data/ext/zstdruby/libzstd/dictBuilder/divsufsort.c +1 -1
  73. data/ext/zstdruby/libzstd/dictBuilder/fastcover.c +43 -34
  74. data/ext/zstdruby/libzstd/dictBuilder/zdict.c +128 -58
  75. data/ext/zstdruby/libzstd/dll/example/Makefile +1 -1
  76. data/ext/zstdruby/libzstd/dll/example/README.md +16 -22
  77. data/ext/zstdruby/libzstd/legacy/zstd_legacy.h +1 -1
  78. data/ext/zstdruby/libzstd/legacy/zstd_v01.c +8 -8
  79. data/ext/zstdruby/libzstd/legacy/zstd_v01.h +1 -1
  80. data/ext/zstdruby/libzstd/legacy/zstd_v02.c +9 -9
  81. data/ext/zstdruby/libzstd/legacy/zstd_v02.h +1 -1
  82. data/ext/zstdruby/libzstd/legacy/zstd_v03.c +9 -9
  83. data/ext/zstdruby/libzstd/legacy/zstd_v03.h +1 -1
  84. data/ext/zstdruby/libzstd/legacy/zstd_v04.c +10 -10
  85. data/ext/zstdruby/libzstd/legacy/zstd_v04.h +1 -1
  86. data/ext/zstdruby/libzstd/legacy/zstd_v05.c +13 -13
  87. data/ext/zstdruby/libzstd/legacy/zstd_v05.h +1 -1
  88. data/ext/zstdruby/libzstd/legacy/zstd_v06.c +13 -13
  89. data/ext/zstdruby/libzstd/legacy/zstd_v06.h +1 -1
  90. data/ext/zstdruby/libzstd/legacy/zstd_v07.c +13 -13
  91. data/ext/zstdruby/libzstd/legacy/zstd_v07.h +1 -1
  92. data/ext/zstdruby/libzstd/libzstd.mk +185 -0
  93. data/ext/zstdruby/libzstd/libzstd.pc.in +4 -3
  94. data/ext/zstdruby/libzstd/modulemap/module.modulemap +4 -0
  95. data/ext/zstdruby/libzstd/{dictBuilder/zdict.h → zdict.h} +154 -7
  96. data/ext/zstdruby/libzstd/zstd.h +699 -214
  97. data/ext/zstdruby/libzstd/{common/zstd_errors.h → zstd_errors.h} +2 -1
  98. data/ext/zstdruby/zstdruby.c +2 -2
  99. data/lib/zstd-ruby/version.rb +1 -1
  100. metadata +15 -6
  101. data/.travis.yml +0 -14
data/ext/zstdruby/libzstd/dictBuilder/fastcover.c +43 -34
@@ -1,5 +1,5 @@
1
1
  /*
2
- * Copyright (c) 2018-2020, Facebook, Inc.
2
+ * Copyright (c) Facebook, Inc.
3
3
  * All rights reserved.
4
4
  *
5
5
  * This source code is licensed under both the BSD-style license (found in the
@@ -16,24 +16,33 @@
16
16
  #include <string.h> /* memset */
17
17
  #include <time.h> /* clock */
18
18
 
19
+ #ifndef ZDICT_STATIC_LINKING_ONLY
20
+ # define ZDICT_STATIC_LINKING_ONLY
21
+ #endif
22
+
19
23
  #include "../common/mem.h" /* read */
20
24
  #include "../common/pool.h"
21
25
  #include "../common/threading.h"
22
- #include "cover.h"
23
26
  #include "../common/zstd_internal.h" /* includes zstd.h */
24
- #ifndef ZDICT_STATIC_LINKING_ONLY
25
- #define ZDICT_STATIC_LINKING_ONLY
26
- #endif
27
- #include "zdict.h"
27
+ #include "../compress/zstd_compress_internal.h" /* ZSTD_hash*() */
28
+ #include "../zdict.h"
29
+ #include "cover.h"
28
30
 
29
31
 
30
32
  /*-*************************************
31
33
  * Constants
32
34
  ***************************************/
35
+ /**
36
+ * There are 32bit indexes used to ref samples, so limit samples size to 4GB
37
+ * on 64bit builds.
38
+ * For 32bit builds we choose 1 GB.
39
+ * Most 32bit platforms have 2GB user-mode addressable space and we allocate a large
40
+ * contiguous buffer, so 1GB is already a high limit.
41
+ */
33
42
  #define FASTCOVER_MAX_SAMPLES_SIZE (sizeof(size_t) == 8 ? ((unsigned)-1) : ((unsigned)1 GB))
34
43
  #define FASTCOVER_MAX_F 31
35
44
  #define FASTCOVER_MAX_ACCEL 10
36
- #define DEFAULT_SPLITPOINT 0.75
45
+ #define FASTCOVER_DEFAULT_SPLITPOINT 0.75
37
46
  #define DEFAULT_F 20
38
47
  #define DEFAULT_ACCEL 1
39
48
 
@@ -41,50 +50,50 @@
41
50
  /*-*************************************
42
51
  * Console display
43
52
  ***************************************/
44
- static int g_displayLevel = 2;
53
+ #ifndef LOCALDISPLAYLEVEL
54
+ static int g_displayLevel = 0;
55
+ #endif
56
+ #undef DISPLAY
45
57
  #define DISPLAY(...) \
46
58
  { \
47
59
  fprintf(stderr, __VA_ARGS__); \
48
60
  fflush(stderr); \
49
61
  }
62
+ #undef LOCALDISPLAYLEVEL
50
63
  #define LOCALDISPLAYLEVEL(displayLevel, l, ...) \
51
64
  if (displayLevel >= l) { \
52
65
  DISPLAY(__VA_ARGS__); \
53
66
  } /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */
67
+ #undef DISPLAYLEVEL
54
68
  #define DISPLAYLEVEL(l, ...) LOCALDISPLAYLEVEL(g_displayLevel, l, __VA_ARGS__)
55
69
 
70
+ #ifndef LOCALDISPLAYUPDATE
71
+ static const clock_t g_refreshRate = CLOCKS_PER_SEC * 15 / 100;
72
+ static clock_t g_time = 0;
73
+ #endif
74
+ #undef LOCALDISPLAYUPDATE
56
75
  #define LOCALDISPLAYUPDATE(displayLevel, l, ...) \
57
76
  if (displayLevel >= l) { \
58
- if ((clock() - g_time > refreshRate) || (displayLevel >= 4)) { \
77
+ if ((clock() - g_time > g_refreshRate) || (displayLevel >= 4)) { \
59
78
  g_time = clock(); \
60
79
  DISPLAY(__VA_ARGS__); \
61
80
  } \
62
81
  }
82
+ #undef DISPLAYUPDATE
63
83
  #define DISPLAYUPDATE(l, ...) LOCALDISPLAYUPDATE(g_displayLevel, l, __VA_ARGS__)
64
- static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100;
65
- static clock_t g_time = 0;
66
84
 
67
85
 
68
86
  /*-*************************************
69
87
  * Hash Functions
70
88
  ***************************************/
71
- static const U64 prime6bytes = 227718039650203ULL;
72
- static size_t ZSTD_hash6(U64 u, U32 h) { return (size_t)(((u << (64-48)) * prime6bytes) >> (64-h)) ; }
73
- static size_t ZSTD_hash6Ptr(const void* p, U32 h) { return ZSTD_hash6(MEM_readLE64(p), h); }
74
-
75
- static const U64 prime8bytes = 0xCF1BBCDCB7A56463ULL;
76
- static size_t ZSTD_hash8(U64 u, U32 h) { return (size_t)(((u) * prime8bytes) >> (64-h)) ; }
77
- static size_t ZSTD_hash8Ptr(const void* p, U32 h) { return ZSTD_hash8(MEM_readLE64(p), h); }
78
-
79
-
80
89
  /**
81
- * Hash the d-byte value pointed to by p and mod 2^f
90
+ * Hash the d-byte value pointed to by p and mod 2^f into the frequency vector
82
91
  */
83
- static size_t FASTCOVER_hashPtrToIndex(const void* p, U32 h, unsigned d) {
92
+ static size_t FASTCOVER_hashPtrToIndex(const void* p, U32 f, unsigned d) {
84
93
  if (d == 6) {
85
- return ZSTD_hash6Ptr(p, h) & ((1 << h) - 1);
94
+ return ZSTD_hash6Ptr(p, f);
86
95
  }
87
- return ZSTD_hash8Ptr(p, h) & ((1 << h) - 1);
96
+ return ZSTD_hash8Ptr(p, f);
88
97
  }
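Note on the hunk above: the removed lines hashed with `>> (64-h)` and then masked with `(1 << h) - 1`, while the new lines return the hash directly. The sketch below illustrates why the mask can be dropped without changing the result. It assumes the shared `ZSTD_hash8()` has the same shape as the local copy removed in this diff; `sketch_hash8` and `sketch_hash8_masked` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiplicative constant copied from the removed lines of the diff. */
static const uint64_t kPrime8Bytes = 0xCF1BBCDCB7A56463ULL;

/* Hypothetical stand-in for ZSTD_hash8(): keep the top f bits of the product.
 * The shift is only well defined for 1 <= f <= 63. */
static size_t sketch_hash8(uint64_t u, unsigned f)
{
    assert(f >= 1 && f <= 63);
    return (size_t)((u * kPrime8Bytes) >> (64 - f));
}

/* Old form: the same hash followed by a mask to f bits.
 * Intended for f <= 31, matching FASTCOVER_MAX_F in the diff. */
static size_t sketch_hash8_masked(uint64_t u, unsigned f)
{
    return sketch_hash8(u, f) & (((size_t)1 << f) - 1);
}
```

Shifting a 64-bit product right by `64 - f` leaves at most `f` significant bits, so the result is already below `2^f` and the mask is a no-op. That is consistent with the updated comment, which describes the value as an index into the frequency vector.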
89
98
 
90
99
 
@@ -461,20 +470,20 @@ typedef struct FASTCOVER_tryParameters_data_s {
461
470
  * This function is thread safe if zstd is compiled with multithreaded support.
462
471
  * It takes its parameters as an *OWNING* opaque pointer to support threading.
463
472
  */
464
- static void FASTCOVER_tryParameters(void *opaque)
473
+ static void FASTCOVER_tryParameters(void* opaque)
465
474
  {
466
475
  /* Save parameters as local variables */
467
- FASTCOVER_tryParameters_data_t *const data = (FASTCOVER_tryParameters_data_t *)opaque;
476
+ FASTCOVER_tryParameters_data_t *const data = (FASTCOVER_tryParameters_data_t*)opaque;
468
477
  const FASTCOVER_ctx_t *const ctx = data->ctx;
469
478
  const ZDICT_cover_params_t parameters = data->parameters;
470
479
  size_t dictBufferCapacity = data->dictBufferCapacity;
471
480
  size_t totalCompressedSize = ERROR(GENERIC);
472
481
  /* Initialize array to keep track of frequency of dmer within activeSegment */
473
- U16* segmentFreqs = (U16 *)calloc(((U64)1 << ctx->f), sizeof(U16));
482
+ U16* segmentFreqs = (U16*)calloc(((U64)1 << ctx->f), sizeof(U16));
474
483
  /* Allocate space for hash table, dict, and freqs */
475
- BYTE *const dict = (BYTE * const)malloc(dictBufferCapacity);
484
+ BYTE *const dict = (BYTE*)malloc(dictBufferCapacity);
476
485
  COVER_dictSelection_t selection = COVER_dictSelectionError(ERROR(GENERIC));
477
- U32 *freqs = (U32*) malloc(((U64)1 << ctx->f) * sizeof(U32));
486
+ U32* freqs = (U32*) malloc(((U64)1 << ctx->f) * sizeof(U32));
478
487
  if (!segmentFreqs || !dict || !freqs) {
479
488
  DISPLAYLEVEL(1, "Failed to allocate buffers: out of memory\n");
480
489
  goto _cleanup;
@@ -486,7 +495,7 @@ static void FASTCOVER_tryParameters(void *opaque)
486
495
  parameters, segmentFreqs);
487
496
 
488
497
  const unsigned nbFinalizeSamples = (unsigned)(ctx->nbTrainSamples * ctx->accelParams.finalize / 100);
489
- selection = COVER_selectDict(dict + tail, dictBufferCapacity - tail,
498
+ selection = COVER_selectDict(dict + tail, dictBufferCapacity, dictBufferCapacity - tail,
490
499
  ctx->samples, ctx->samplesSizes, nbFinalizeSamples, ctx->nbTrainSamples, ctx->nbSamples, parameters, ctx->offsets,
491
500
  totalCompressedSize);
492
501
 
@@ -547,7 +556,7 @@ ZDICT_trainFromBuffer_fastCover(void* dictBuffer, size_t dictBufferCapacity,
547
556
  ZDICT_cover_params_t coverParams;
548
557
  FASTCOVER_accel_t accelParams;
549
558
  /* Initialize global data */
550
- g_displayLevel = parameters.zParams.notificationLevel;
559
+ g_displayLevel = (int)parameters.zParams.notificationLevel;
551
560
  /* Assign splitPoint and f if not provided */
552
561
  parameters.splitPoint = 1.0;
553
562
  parameters.f = parameters.f == 0 ? DEFAULT_F : parameters.f;
@@ -617,7 +626,7 @@ ZDICT_optimizeTrainFromBuffer_fastCover(
617
626
  /* constants */
618
627
  const unsigned nbThreads = parameters->nbThreads;
619
628
  const double splitPoint =
620
- parameters->splitPoint <= 0.0 ? DEFAULT_SPLITPOINT : parameters->splitPoint;
629
+ parameters->splitPoint <= 0.0 ? FASTCOVER_DEFAULT_SPLITPOINT : parameters->splitPoint;
621
630
  const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d;
622
631
  const unsigned kMaxD = parameters->d == 0 ? 8 : parameters->d;
623
632
  const unsigned kMinK = parameters->k == 0 ? 50 : parameters->k;
@@ -630,7 +639,7 @@ ZDICT_optimizeTrainFromBuffer_fastCover(
630
639
  const unsigned accel = parameters->accel == 0 ? DEFAULT_ACCEL : parameters->accel;
631
640
  const unsigned shrinkDict = 0;
632
641
  /* Local variables */
633
- const int displayLevel = parameters->zParams.notificationLevel;
642
+ const int displayLevel = (int)parameters->zParams.notificationLevel;
634
643
  unsigned iteration = 1;
635
644
  unsigned d;
636
645
  unsigned k;
@@ -714,7 +723,7 @@ ZDICT_optimizeTrainFromBuffer_fastCover(
714
723
  data->parameters.splitPoint = splitPoint;
715
724
  data->parameters.steps = kSteps;
716
725
  data->parameters.shrinkDict = shrinkDict;
717
- data->parameters.zParams.notificationLevel = g_displayLevel;
726
+ data->parameters.zParams.notificationLevel = (unsigned)g_displayLevel;
718
727
  /* Check the parameters */
719
728
  if (!FASTCOVER_checkParameters(data->parameters, dictBufferCapacity,
720
729
  data->ctx->f, accel)) {
data/ext/zstdruby/libzstd/dictBuilder/zdict.c +128 -58
@@ -1,5 +1,5 @@
1
1
  /*
2
- * Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
2
+ * Copyright (c) Yann Collet, Facebook, Inc.
3
3
  * All rights reserved.
4
4
  *
5
5
  * This source code is licensed under both the BSD-style license (found in the
@@ -23,9 +23,13 @@
23
23
  /* Unix Large Files support (>4GB) */
24
24
  #define _FILE_OFFSET_BITS 64
25
25
  #if (defined(__sun__) && (!defined(__LP64__))) /* Sun Solaris 32-bits requires specific definitions */
26
+ # ifndef _LARGEFILE_SOURCE
26
27
  # define _LARGEFILE_SOURCE
28
+ # endif
27
29
  #elif ! defined(__LP64__) /* No point defining Large file for 64 bit */
30
+ # ifndef _LARGEFILE64_SOURCE
28
31
  # define _LARGEFILE64_SOURCE
32
+ # endif
29
33
  #endif
30
34
 
31
35
 
@@ -37,18 +41,19 @@
37
41
  #include <stdio.h> /* fprintf, fopen, ftello64 */
38
42
  #include <time.h> /* clock */
39
43
 
44
+ #ifndef ZDICT_STATIC_LINKING_ONLY
45
+ # define ZDICT_STATIC_LINKING_ONLY
46
+ #endif
47
+ #define HUF_STATIC_LINKING_ONLY
48
+
40
49
  #include "../common/mem.h" /* read */
41
50
  #include "../common/fse.h" /* FSE_normalizeCount, FSE_writeNCount */
42
- #define HUF_STATIC_LINKING_ONLY
43
51
  #include "../common/huf.h" /* HUF_buildCTable, HUF_writeCTable */
44
52
  #include "../common/zstd_internal.h" /* includes zstd.h */
45
53
  #include "../common/xxhash.h" /* XXH64 */
46
- #include "divsufsort.h"
47
- #ifndef ZDICT_STATIC_LINKING_ONLY
48
- # define ZDICT_STATIC_LINKING_ONLY
49
- #endif
50
- #include "zdict.h"
51
54
  #include "../compress/zstd_compress_internal.h" /* ZSTD_loadCEntropy() */
55
+ #include "../zdict.h"
56
+ #include "divsufsort.h"
52
57
 
53
58
 
54
59
  /*-*************************************
@@ -62,14 +67,15 @@
62
67
 
63
68
  #define NOISELENGTH 32
64
69
 
65
- static const int g_compressionLevel_default = 3;
66
70
  static const U32 g_selectivity_default = 9;
67
71
 
68
72
 
69
73
  /*-*************************************
70
74
  * Console display
71
75
  ***************************************/
76
+ #undef DISPLAY
72
77
  #define DISPLAY(...) { fprintf(stderr, __VA_ARGS__); fflush( stderr ); }
78
+ #undef DISPLAYLEVEL
73
79
  #define DISPLAYLEVEL(l, ...) if (notificationLevel>=l) { DISPLAY(__VA_ARGS__); } /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */
74
80
 
75
81
  static clock_t ZDICT_clockSpan(clock_t nPrevious) { return clock() - nPrevious; }
@@ -105,20 +111,17 @@ size_t ZDICT_getDictHeaderSize(const void* dictBuffer, size_t dictSize)
105
111
  size_t headerSize;
106
112
  if (dictSize <= 8 || MEM_readLE32(dictBuffer) != ZSTD_MAGIC_DICTIONARY) return ERROR(dictionary_corrupted);
107
113
 
108
- { unsigned offcodeMaxValue = MaxOff;
109
- ZSTD_compressedBlockState_t* bs = (ZSTD_compressedBlockState_t*)malloc(sizeof(ZSTD_compressedBlockState_t));
114
+ { ZSTD_compressedBlockState_t* bs = (ZSTD_compressedBlockState_t*)malloc(sizeof(ZSTD_compressedBlockState_t));
110
115
  U32* wksp = (U32*)malloc(HUF_WORKSPACE_SIZE);
111
- short* offcodeNCount = (short*)malloc((MaxOff+1)*sizeof(short));
112
- if (!bs || !wksp || !offcodeNCount) {
116
+ if (!bs || !wksp) {
113
117
  headerSize = ERROR(memory_allocation);
114
118
  } else {
115
119
  ZSTD_reset_compressedBlockState(bs);
116
- headerSize = ZSTD_loadCEntropy(bs, wksp, offcodeNCount, &offcodeMaxValue, dictBuffer, dictSize);
120
+ headerSize = ZSTD_loadCEntropy(bs, wksp, dictBuffer, dictSize);
117
121
  }
118
122
 
119
123
  free(bs);
120
124
  free(wksp);
121
- free(offcodeNCount);
122
125
  }
123
126
 
124
127
  return headerSize;
@@ -132,22 +135,32 @@ static unsigned ZDICT_NbCommonBytes (size_t val)
132
135
  if (MEM_isLittleEndian()) {
133
136
  if (MEM_64bits()) {
134
137
  # if defined(_MSC_VER) && defined(_WIN64)
135
- unsigned long r = 0;
136
- _BitScanForward64( &r, (U64)val );
137
- return (unsigned)(r>>3);
138
+ if (val != 0) {
139
+ unsigned long r;
140
+ _BitScanForward64(&r, (U64)val);
141
+ return (unsigned)(r >> 3);
142
+ } else {
143
+ /* Should not reach this code path */
144
+ __assume(0);
145
+ }
138
146
  # elif defined(__GNUC__) && (__GNUC__ >= 3)
139
- return (__builtin_ctzll((U64)val) >> 3);
147
+ return (unsigned)(__builtin_ctzll((U64)val) >> 3);
140
148
  # else
141
149
  static const int DeBruijnBytePos[64] = { 0, 0, 0, 0, 0, 1, 1, 2, 0, 3, 1, 3, 1, 4, 2, 7, 0, 2, 3, 6, 1, 5, 3, 5, 1, 3, 4, 4, 2, 5, 6, 7, 7, 0, 1, 2, 3, 3, 4, 6, 2, 6, 5, 5, 3, 4, 5, 6, 7, 1, 2, 4, 6, 4, 4, 5, 7, 2, 6, 5, 7, 6, 7, 7 };
142
150
  return DeBruijnBytePos[((U64)((val & -(long long)val) * 0x0218A392CDABBD3FULL)) >> 58];
143
151
  # endif
144
152
  } else { /* 32 bits */
145
153
  # if defined(_MSC_VER)
146
- unsigned long r=0;
147
- _BitScanForward( &r, (U32)val );
148
- return (unsigned)(r>>3);
154
+ if (val != 0) {
155
+ unsigned long r;
156
+ _BitScanForward(&r, (U32)val);
157
+ return (unsigned)(r >> 3);
158
+ } else {
159
+ /* Should not reach this code path */
160
+ __assume(0);
161
+ }
149
162
  # elif defined(__GNUC__) && (__GNUC__ >= 3)
150
- return (__builtin_ctz((U32)val) >> 3);
163
+ return (unsigned)(__builtin_ctz((U32)val) >> 3);
151
164
  # else
152
165
  static const int DeBruijnBytePos[32] = { 0, 0, 3, 0, 3, 1, 3, 0, 3, 2, 2, 1, 3, 2, 0, 1, 3, 3, 1, 2, 2, 2, 2, 0, 3, 1, 2, 0, 1, 0, 1, 1 };
153
166
  return DeBruijnBytePos[((U32)((val & -(S32)val) * 0x077CB531U)) >> 27];
@@ -156,11 +169,16 @@ static unsigned ZDICT_NbCommonBytes (size_t val)
156
169
  } else { /* Big Endian CPU */
157
170
  if (MEM_64bits()) {
158
171
  # if defined(_MSC_VER) && defined(_WIN64)
159
- unsigned long r = 0;
160
- _BitScanReverse64( &r, val );
161
- return (unsigned)(r>>3);
172
+ if (val != 0) {
173
+ unsigned long r;
174
+ _BitScanReverse64(&r, val);
175
+ return (unsigned)(r >> 3);
176
+ } else {
177
+ /* Should not reach this code path */
178
+ __assume(0);
179
+ }
162
180
  # elif defined(__GNUC__) && (__GNUC__ >= 3)
163
- return (__builtin_clzll(val) >> 3);
181
+ return (unsigned)(__builtin_clzll(val) >> 3);
164
182
  # else
165
183
  unsigned r;
166
184
  const unsigned n32 = sizeof(size_t)*4; /* calculate this way due to compiler complaining in 32-bits mode */
@@ -171,11 +189,16 @@ static unsigned ZDICT_NbCommonBytes (size_t val)
171
189
  # endif
172
190
  } else { /* 32 bits */
173
191
  # if defined(_MSC_VER)
174
- unsigned long r = 0;
175
- _BitScanReverse( &r, (unsigned long)val );
176
- return (unsigned)(r>>3);
192
+ if (val != 0) {
193
+ unsigned long r;
194
+ _BitScanReverse(&r, (unsigned long)val);
195
+ return (unsigned)(r >> 3);
196
+ } else {
197
+ /* Should not reach this code path */
198
+ __assume(0);
199
+ }
177
200
  # elif defined(__GNUC__) && (__GNUC__ >= 3)
178
- return (__builtin_clz((U32)val) >> 3);
201
+ return (unsigned)(__builtin_clz((U32)val) >> 3);
179
202
  # else
180
203
  unsigned r;
181
204
  if (!(val>>16)) { r=2; val>>=8; } else { r=0; val>>=24; }
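Note on the `ZDICT_NbCommonBytes` hunks above: the little-endian branches count trailing zero bits and shift right by 3 to turn a bit index into a byte count. The sketch below shows that idea in portable C. It assumes the argument is the XOR of two machine words being compared, which is not shown in this diff; `sketch_ctz64` and `sketch_nb_common_bytes` are hypothetical names, not zstd functions.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count of trailing zero bits. val must be non-zero,
 * otherwise the loop below would never terminate. */
static unsigned sketch_ctz64(uint64_t val)
{
    unsigned n = 0;
    assert(val != 0);
    while ((val & 1) == 0) {
        val >>= 1;
        n++;
    }
    return n;
}

/* Number of low-order bytes that are identical in a and b (requires a != b).
 * On a little-endian CPU the low-order bytes are the first bytes in memory. */
static unsigned sketch_nb_common_bytes(uint64_t a, uint64_t b)
{
    /* The lowest set bit of the XOR marks the first differing bit;
     * dividing its index by 8 gives the number of fully equal bytes. */
    return sketch_ctz64(a ^ b) >> 3;
}
```

The non-zero precondition is also what the new `if (val != 0)` guards in the MSVC branches express: the `_BitScan*` intrinsics do not produce a defined index when no bit is set, so the previous `r = 0` initialisation only hid that case rather than handling it.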
@@ -232,7 +255,7 @@ static dictItem ZDICT_analyzePos(
232
255
  U32 savings[LLIMIT] = {0};
233
256
  const BYTE* b = (const BYTE*)buffer;
234
257
  size_t maxLength = LLIMIT;
235
- size_t pos = suffix[start];
258
+ size_t pos = (size_t)suffix[start];
236
259
  U32 end = start;
237
260
  dictItem solution;
238
261
 
@@ -366,7 +389,7 @@ static dictItem ZDICT_analyzePos(
366
389
  savings[i] = savings[i-1] + (lengthList[i] * (i-3));
367
390
 
368
391
  DISPLAYLEVEL(4, "Selected dict at position %u, of length %u : saves %u (ratio: %.2f) \n",
369
- (unsigned)pos, (unsigned)maxLength, (unsigned)savings[maxLength], (double)savings[maxLength] / maxLength);
392
+ (unsigned)pos, (unsigned)maxLength, (unsigned)savings[maxLength], (double)savings[maxLength] / (double)maxLength);
370
393
 
371
394
  solution.pos = (U32)pos;
372
395
  solution.length = (U32)maxLength;
@@ -376,7 +399,7 @@ static dictItem ZDICT_analyzePos(
376
399
  { U32 id;
377
400
  for (id=start; id<end; id++) {
378
401
  U32 p, pEnd, length;
379
- U32 const testedPos = suffix[id];
402
+ U32 const testedPos = (U32)suffix[id];
380
403
  if (testedPos == pos)
381
404
  length = solution.length;
382
405
  else {
@@ -439,7 +462,7 @@ static U32 ZDICT_tryMerge(dictItem* table, dictItem elt, U32 eltNbToSkip, const
439
462
 
440
463
  if ((table[u].pos + table[u].length >= elt.pos) && (table[u].pos < elt.pos)) { /* overlap, existing < new */
441
464
  /* append */
442
- int const addedLength = (int)eltEnd - (table[u].pos + table[u].length);
465
+ int const addedLength = (int)eltEnd - (int)(table[u].pos + table[u].length);
443
466
  table[u].savings += elt.length / 8; /* rough approx bonus */
444
467
  if (addedLength > 0) { /* otherwise, elt fully included into existing */
445
468
  table[u].length += addedLength;
@@ -532,6 +555,7 @@ static size_t ZDICT_trainBuffer_legacy(dictItem* dictList, U32 dictListSize,
532
555
  clock_t displayClock = 0;
533
556
  clock_t const refreshRate = CLOCKS_PER_SEC * 3 / 10;
534
557
 
558
+ # undef DISPLAYUPDATE
535
559
  # define DISPLAYUPDATE(l, ...) if (notificationLevel>=l) { \
536
560
  if (ZDICT_clockSpan(displayClock) > refreshRate) \
537
561
  { displayClock = clock(); DISPLAY(__VA_ARGS__); \
@@ -706,7 +730,7 @@ static void ZDICT_flatLit(unsigned* countLit)
706
730
 
707
731
  #define OFFCODE_MAX 30 /* only applicable to first block */
708
732
  static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
709
- unsigned compressionLevel,
733
+ int compressionLevel,
710
734
  const void* srcBuffer, const size_t* fileSizes, unsigned nbFiles,
711
735
  const void* dictBuffer, size_t dictBufferSize,
712
736
  unsigned notificationLevel)
@@ -741,7 +765,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
741
765
  memset(repOffset, 0, sizeof(repOffset));
742
766
  repOffset[1] = repOffset[4] = repOffset[8] = 1;
743
767
  memset(bestRepOffset, 0, sizeof(bestRepOffset));
744
- if (compressionLevel==0) compressionLevel = g_compressionLevel_default;
768
+ if (compressionLevel==0) compressionLevel = ZSTD_CLEVEL_DEFAULT;
745
769
  params = ZSTD_getParams(compressionLevel, averageSampleSize, dictBufferSize);
746
770
 
747
771
  esr.dict = ZSTD_createCDict_advanced(dictBuffer, dictBufferSize, ZSTD_dlm_byRef, ZSTD_dct_rawContent, params.cParams, ZSTD_defaultCMem);
@@ -762,6 +786,13 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
762
786
  pos += fileSizes[u];
763
787
  }
764
788
 
789
+ if (notificationLevel >= 4) {
790
+ /* writeStats */
791
+ DISPLAYLEVEL(4, "Offset Code Frequencies : \n");
792
+ for (u=0; u<=offcodeMax; u++) {
793
+ DISPLAYLEVEL(4, "%2u :%7u \n", u, offcodeCount[u]);
794
+ } }
795
+
765
796
  /* analyze, build stats, starting with literals */
766
797
  { size_t maxNbBits = HUF_buildCTable (hufTable, countLit, 255, huffLog);
767
798
  if (HUF_isError(maxNbBits)) {
@@ -786,7 +817,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
786
817
  /* note : the result of this phase should be used to better appreciate the impact on statistics */
787
818
 
788
819
  total=0; for (u=0; u<=offcodeMax; u++) total+=offcodeCount[u];
789
- errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, offcodeMax);
820
+ errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, offcodeMax, /* useLowProbCount */ 1);
790
821
  if (FSE_isError(errorCode)) {
791
822
  eSize = errorCode;
792
823
  DISPLAYLEVEL(1, "FSE_normalizeCount error with offcodeCount \n");
@@ -795,7 +826,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
795
826
  Offlog = (U32)errorCode;
796
827
 
797
828
  total=0; for (u=0; u<=MaxML; u++) total+=matchLengthCount[u];
798
- errorCode = FSE_normalizeCount(matchLengthNCount, mlLog, matchLengthCount, total, MaxML);
829
+ errorCode = FSE_normalizeCount(matchLengthNCount, mlLog, matchLengthCount, total, MaxML, /* useLowProbCount */ 1);
799
830
  if (FSE_isError(errorCode)) {
800
831
  eSize = errorCode;
801
832
  DISPLAYLEVEL(1, "FSE_normalizeCount error with matchLengthCount \n");
@@ -804,7 +835,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
804
835
  mlLog = (U32)errorCode;
805
836
 
806
837
  total=0; for (u=0; u<=MaxLL; u++) total+=litLengthCount[u];
807
- errorCode = FSE_normalizeCount(litLengthNCount, llLog, litLengthCount, total, MaxLL);
838
+ errorCode = FSE_normalizeCount(litLengthNCount, llLog, litLengthCount, total, MaxLL, /* useLowProbCount */ 1);
808
839
  if (FSE_isError(errorCode)) {
809
840
  eSize = errorCode;
810
841
  DISPLAYLEVEL(1, "FSE_normalizeCount error with litLengthCount \n");
@@ -868,7 +899,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
868
899
  MEM_writeLE32(dstPtr+8, bestRepOffset[2].offset);
869
900
  #else
870
901
  /* at this stage, we don't use the result of "most common first offset",
871
- as the impact of statistics is not properly evaluated */
902
+ * as the impact of statistics is not properly evaluated */
872
903
  MEM_writeLE32(dstPtr+0, repStartValue[0]);
873
904
  MEM_writeLE32(dstPtr+4, repStartValue[1]);
874
905
  MEM_writeLE32(dstPtr+8, repStartValue[2]);
@@ -884,6 +915,17 @@ _cleanup:
884
915
  }
885
916
 
886
917
 
918
+ /**
919
+ * @returns the maximum repcode value
920
+ */
921
+ static U32 ZDICT_maxRep(U32 const reps[ZSTD_REP_NUM])
922
+ {
923
+ U32 maxRep = reps[0];
924
+ int r;
925
+ for (r = 1; r < ZSTD_REP_NUM; ++r)
926
+ maxRep = MAX(maxRep, reps[r]);
927
+ return maxRep;
928
+ }
887
929
 
888
930
  size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity,
889
931
  const void* customDictContent, size_t dictContentSize,
@@ -893,13 +935,15 @@ size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity,
893
935
  size_t hSize;
894
936
  #define HBUFFSIZE 256 /* should prove large enough for all entropy headers */
895
937
  BYTE header[HBUFFSIZE];
896
- int const compressionLevel = (params.compressionLevel == 0) ? g_compressionLevel_default : params.compressionLevel;
938
+ int const compressionLevel = (params.compressionLevel == 0) ? ZSTD_CLEVEL_DEFAULT : params.compressionLevel;
897
939
  U32 const notificationLevel = params.notificationLevel;
940
+ /* The final dictionary content must be at least as large as the largest repcode */
941
+ size_t const minContentSize = (size_t)ZDICT_maxRep(repStartValue);
942
+ size_t paddingSize;
898
943
 
899
944
  /* check conditions */
900
945
  DEBUGLOG(4, "ZDICT_finalizeDictionary");
901
946
  if (dictBufferCapacity < dictContentSize) return ERROR(dstSize_tooSmall);
902
- if (dictContentSize < ZDICT_CONTENTSIZE_MIN) return ERROR(srcSize_wrong);
903
947
  if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) return ERROR(dstSize_tooSmall);
904
948
 
905
949
  /* dictionary header */
@@ -923,12 +967,43 @@ size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity,
923
967
  hSize += eSize;
924
968
  }
925
969
 
926
- /* copy elements in final buffer ; note : src and dst buffer can overlap */
927
- if (hSize + dictContentSize > dictBufferCapacity) dictContentSize = dictBufferCapacity - hSize;
928
- { size_t const dictSize = hSize + dictContentSize;
929
- char* dictEnd = (char*)dictBuffer + dictSize;
930
- memmove(dictEnd - dictContentSize, customDictContent, dictContentSize);
931
- memcpy(dictBuffer, header, hSize);
970
+ /* Shrink the content size if it doesn't fit in the buffer */
971
+ if (hSize + dictContentSize > dictBufferCapacity) {
972
+ dictContentSize = dictBufferCapacity - hSize;
973
+ }
974
+
975
+ /* Pad the dictionary content with zeros if it is too small */
976
+ if (dictContentSize < minContentSize) {
977
+ RETURN_ERROR_IF(hSize + minContentSize > dictBufferCapacity, dstSize_tooSmall,
978
+ "dictBufferCapacity too small to fit max repcode");
979
+ paddingSize = minContentSize - dictContentSize;
980
+ } else {
981
+ paddingSize = 0;
982
+ }
983
+
984
+ {
985
+ size_t const dictSize = hSize + paddingSize + dictContentSize;
986
+
987
+ /* The dictionary consists of the header, optional padding, and the content.
988
+ * The padding comes before the content because the "best" position in the
989
+ * dictionary is the last byte.
990
+ */
991
+ BYTE* const outDictHeader = (BYTE*)dictBuffer;
992
+ BYTE* const outDictPadding = outDictHeader + hSize;
993
+ BYTE* const outDictContent = outDictPadding + paddingSize;
994
+
995
+ assert(dictSize <= dictBufferCapacity);
996
+ assert(outDictContent + dictContentSize == (BYTE*)dictBuffer + dictSize);
997
+
998
+ /* First copy the customDictContent into its final location.
999
+ * `customDictContent` and `dictBuffer` may overlap, so we must
1000
+ * do this before any other writes into the output buffer.
1001
+ * Then copy the header & padding into the output buffer.
1002
+ */
1003
+ memmove(outDictContent, customDictContent, dictContentSize);
1004
+ memcpy(outDictHeader, header, hSize);
1005
+ memset(outDictPadding, 0, paddingSize);
1006
+
932
1007
  return dictSize;
933
1008
  }
934
1009
  }
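Note on the `ZDICT_finalizeDictionary` hunk above: the new code lays the output out as header, optional zero padding, then content, and moves the content first because it may overlap the output buffer. The sketch below is a simplified illustration of that ordering. `sketch_layout_dict` is a hypothetical helper, not a zstd function, and unlike the real code it rejects oversized input (returns 0) instead of shrinking the content.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes [header | zero padding | content] into dst.
 * content may alias dst; header must not. Returns the total size written,
 * or 0 if the result would not fit in dstCapacity. */
static size_t sketch_layout_dict(unsigned char* dst, size_t dstCapacity,
                                 const unsigned char* header, size_t hSize,
                                 const unsigned char* content, size_t contentSize,
                                 size_t minContentSize)
{
    /* Pad only when the content is shorter than the required minimum. */
    size_t const paddingSize =
        (contentSize < minContentSize) ? (minContentSize - contentSize) : 0;
    size_t const total = hSize + paddingSize + contentSize;

    if (total > dstCapacity) return 0;

    /* Move the content first: it may live inside dst, so writing the
     * header or the padding before this step could clobber it. */
    memmove(dst + hSize + paddingSize, content, contentSize);
    memcpy(dst, header, hSize);
    memset(dst + hSize, 0, paddingSize);
    return total;
}
```

For example, with a 4-byte header, 3 bytes of content and a minimum content size of 8 (chosen here purely as an example value), the output is 12 bytes: 4 header bytes, 5 zero bytes, then the 3 content bytes at the very end. Keeping the real content last matches the comment in the diff that the "best" position in the dictionary is the last byte.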
@@ -939,7 +1014,7 @@ static size_t ZDICT_addEntropyTablesFromBuffer_advanced(
  const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
  ZDICT_params_t params)
  {
- int const compressionLevel = (params.compressionLevel == 0) ? g_compressionLevel_default : params.compressionLevel;
+ int const compressionLevel = (params.compressionLevel == 0) ? ZSTD_CLEVEL_DEFAULT : params.compressionLevel;
  U32 const notificationLevel = params.notificationLevel;
  size_t hSize = 8;
 
@@ -968,16 +1043,11 @@ static size_t ZDICT_addEntropyTablesFromBuffer_advanced(
  return MIN(dictBufferCapacity, hSize+dictContentSize);
  }
 
- /* Hidden declaration for dbio.c */
- size_t ZDICT_trainFromBuffer_unsafe_legacy(
- void* dictBuffer, size_t maxDictSize,
- const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
- ZDICT_legacy_params_t params);
  /*! ZDICT_trainFromBuffer_unsafe_legacy() :
- * Warning : `samplesBuffer` must be followed by noisy guard band.
+ * Warning : `samplesBuffer` must be followed by noisy guard band !!!
  * @return : size of dictionary, or an error code which can be tested with ZDICT_isError()
  */
- size_t ZDICT_trainFromBuffer_unsafe_legacy(
+ static size_t ZDICT_trainFromBuffer_unsafe_legacy(
  void* dictBuffer, size_t maxDictSize,
  const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
  ZDICT_legacy_params_t params)
@@ -1114,8 +1184,8 @@ size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity,
  memset(&params, 0, sizeof(params));
  params.d = 8;
  params.steps = 4;
- /* Default to level 6 since no compression level information is available */
- params.zParams.compressionLevel = 3;
+ /* Use default level since no compression level information is available */
+ params.zParams.compressionLevel = ZSTD_CLEVEL_DEFAULT;
  #if defined(DEBUGLEVEL) && (DEBUGLEVEL>=1)
  params.zParams.notificationLevel = DEBUGLEVEL;
  #endif
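Note on the two compression-level hunks above: both replace a local default (`g_compressionLevel_default`, and a literal `3` whose comment said "level 6") with `ZSTD_CLEVEL_DEFAULT`. The "0 means default" rule they apply can be sketched as below. `resolve_compression_level` is a hypothetical name, and the fallback `#define` assumes the value 3 from `zstd.h` so that the sketch compiles without the zstd headers.

```c
#include <assert.h>

/* Assumption: ZSTD_CLEVEL_DEFAULT is 3, as defined in zstd.h.
 * It is defined here only when the zstd headers are not included. */
#ifndef ZSTD_CLEVEL_DEFAULT
#  define ZSTD_CLEVEL_DEFAULT 3
#endif

/* Hypothetical helper: treats 0 as "no level requested" and substitutes
 * the library default; any other value, negative ones included, is
 * passed through unchanged. */
int resolve_compression_level(int requested)
{
    return (requested == 0) ? ZSTD_CLEVEL_DEFAULT : requested;
}
```

Using the shared macro in both places keeps the comment, the value, and the library default from drifting apart the way the removed lines had.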
@@ -1,5 +1,5 @@
  # ################################################################
- # Copyright (c) 2016-2020, Yann Collet, Facebook, Inc.
+ # Copyright (c) Yann Collet, Facebook, Inc.
  # All rights reserved.
  #
  # This source code is licensed under both the BSD-style license (found in the
@@ -1,23 +1,21 @@
- ZSTD Windows binary package
- ====================================
+ # ZSTD Windows binary package
 
- #### The package contents
+ ## The package contents
 
- - `zstd.exe` : Command Line Utility, supporting gzip-like arguments
- - `dll\libzstd.dll` : The ZSTD dynamic library (DLL)
- - `dll\libzstd.lib` : The import library of the ZSTD dynamic library (DLL) for Visual C++
- - `example\` : The example of usage of the ZSTD library
- - `include\` : Header files required by the ZSTD library
+ - `zstd.exe` : Command Line Utility, supporting gzip-like arguments
+ - `dll\libzstd.dll` : The ZSTD dynamic library (DLL)
+ - `dll\libzstd.lib` : The import library of the ZSTD dynamic library (DLL) for Visual C++
+ - `example\` : The example of usage of the ZSTD library
+ - `include\` : Header files required by the ZSTD library
  - `static\libzstd_static.lib` : The static ZSTD library (LIB)
 
-
- #### Usage of Command Line Interface
+ ## Usage of Command Line Interface
 
  Command Line Interface (CLI) supports gzip-like arguments.
  By default CLI takes an input file and compresses it to an output file:
- ```
+
  Usage: zstd [arg] [input] [output]
- ```
+
  The full list of commands for CLI can be obtained with `-h` or `-H`. The ratio can
  be improved with commands from `-3` to `-16` but higher levels also have slower
  compression. CLI includes in-memory compression benchmark module with compression
@@ -25,36 +23,32 @@ levels starting from `-b` and ending with `-e` with iteration time of `-i` secon
  CLI supports aggregation of parameters i.e. `-b1`, `-e18`, and `-i1` can be joined
  into `-b1e18i1`.
 
-
- #### The example of usage of static and dynamic ZSTD libraries with gcc/MinGW
+ ## The example of usage of static and dynamic ZSTD libraries with gcc/MinGW
 
  Use `cd example` and `make` to build `fullbench-dll` and `fullbench-lib`.
  `fullbench-dll` uses a dynamic ZSTD library from the `dll` directory.
  `fullbench-lib` uses a static ZSTD library from the `lib` directory.
 
-
- #### Using ZSTD DLL with gcc/MinGW
+ ## Using ZSTD DLL with gcc/MinGW
 
  The header files from `include\` and the dynamic library `dll\libzstd.dll`
  are required to compile a project using gcc/MinGW.
  The dynamic library has to be added to linking options.
  It means that if a project that uses ZSTD consists of a single `test-dll.c`
  file it should be linked with `dll\libzstd.dll`. For example:
- ```
+
  gcc $(CFLAGS) -Iinclude\ test-dll.c -o test-dll dll\libzstd.dll
- ```
- The compiled executable will require ZSTD DLL which is available at `dll\libzstd.dll`.
 
+ The compiled executable will require ZSTD DLL which is available at `dll\libzstd.dll`.
 
- #### The example of usage of static and dynamic ZSTD libraries with Visual C++
+ ## The example of usage of static and dynamic ZSTD libraries with Visual C++
 
  Open `example\fullbench-dll.sln` to compile `fullbench-dll` that uses a
  dynamic ZSTD library from the `dll` directory. The solution works with Visual C++
  2010 or newer. When one will open the solution with Visual C++ newer than 2010
  then the solution will upgraded to the current version.
 
-
- #### Using ZSTD DLL with Visual C++
+ ## Using ZSTD DLL with Visual C++
 
  The header files from `include\` and the import library `dll\libzstd.lib`
  are required to compile a project using Visual C++.