zstd-ruby 1.4.1.0 → 1.5.0.0
- checksums.yaml +4 -4
- data/.github/dependabot.yml +8 -0
- data/.github/workflows/ruby.yml +35 -0
- data/README.md +2 -2
- data/ext/zstdruby/libzstd/BUCK +5 -7
- data/ext/zstdruby/libzstd/Makefile +304 -113
- data/ext/zstdruby/libzstd/README.md +83 -20
- data/ext/zstdruby/libzstd/common/bitstream.h +59 -51
- data/ext/zstdruby/libzstd/common/compiler.h +150 -8
- data/ext/zstdruby/libzstd/common/cpu.h +1 -3
- data/ext/zstdruby/libzstd/common/debug.c +11 -31
- data/ext/zstdruby/libzstd/common/debug.h +22 -49
- data/ext/zstdruby/libzstd/common/entropy_common.c +201 -75
- data/ext/zstdruby/libzstd/common/error_private.c +3 -1
- data/ext/zstdruby/libzstd/common/error_private.h +8 -4
- data/ext/zstdruby/libzstd/common/fse.h +50 -42
- data/ext/zstdruby/libzstd/common/fse_decompress.c +149 -55
- data/ext/zstdruby/libzstd/common/huf.h +43 -39
- data/ext/zstdruby/libzstd/common/mem.h +69 -25
- data/ext/zstdruby/libzstd/common/pool.c +30 -20
- data/ext/zstdruby/libzstd/common/pool.h +3 -3
- data/ext/zstdruby/libzstd/common/threading.c +51 -4
- data/ext/zstdruby/libzstd/common/threading.h +36 -4
- data/ext/zstdruby/libzstd/common/xxhash.c +40 -92
- data/ext/zstdruby/libzstd/common/xxhash.h +12 -32
- data/ext/zstdruby/libzstd/common/zstd_common.c +10 -10
- data/ext/zstdruby/libzstd/common/zstd_deps.h +111 -0
- data/ext/zstdruby/libzstd/common/zstd_internal.h +230 -111
- data/ext/zstdruby/libzstd/common/zstd_trace.h +154 -0
- data/ext/zstdruby/libzstd/compress/fse_compress.c +47 -63
- data/ext/zstdruby/libzstd/compress/hist.c +41 -63
- data/ext/zstdruby/libzstd/compress/hist.h +13 -33
- data/ext/zstdruby/libzstd/compress/huf_compress.c +332 -193
- data/ext/zstdruby/libzstd/compress/zstd_compress.c +3614 -1696
- data/ext/zstdruby/libzstd/compress/zstd_compress_internal.h +546 -86
- data/ext/zstdruby/libzstd/compress/zstd_compress_literals.c +158 -0
- data/ext/zstdruby/libzstd/compress/zstd_compress_literals.h +29 -0
- data/ext/zstdruby/libzstd/compress/zstd_compress_sequences.c +441 -0
- data/ext/zstdruby/libzstd/compress/zstd_compress_sequences.h +54 -0
- data/ext/zstdruby/libzstd/compress/zstd_compress_superblock.c +572 -0
- data/ext/zstdruby/libzstd/compress/zstd_compress_superblock.h +32 -0
- data/ext/zstdruby/libzstd/compress/zstd_cwksp.h +662 -0
- data/ext/zstdruby/libzstd/compress/zstd_double_fast.c +43 -41
- data/ext/zstdruby/libzstd/compress/zstd_double_fast.h +2 -2
- data/ext/zstdruby/libzstd/compress/zstd_fast.c +85 -80
- data/ext/zstdruby/libzstd/compress/zstd_fast.h +2 -2
- data/ext/zstdruby/libzstd/compress/zstd_lazy.c +1184 -111
- data/ext/zstdruby/libzstd/compress/zstd_lazy.h +59 -1
- data/ext/zstdruby/libzstd/compress/zstd_ldm.c +333 -208
- data/ext/zstdruby/libzstd/compress/zstd_ldm.h +15 -3
- data/ext/zstdruby/libzstd/compress/zstd_ldm_geartab.h +103 -0
- data/ext/zstdruby/libzstd/compress/zstd_opt.c +228 -129
- data/ext/zstdruby/libzstd/compress/zstd_opt.h +1 -1
- data/ext/zstdruby/libzstd/compress/zstdmt_compress.c +151 -440
- data/ext/zstdruby/libzstd/compress/zstdmt_compress.h +32 -114
- data/ext/zstdruby/libzstd/decompress/huf_decompress.c +395 -276
- data/ext/zstdruby/libzstd/decompress/zstd_ddict.c +20 -16
- data/ext/zstdruby/libzstd/decompress/zstd_ddict.h +3 -3
- data/ext/zstdruby/libzstd/decompress/zstd_decompress.c +630 -231
- data/ext/zstdruby/libzstd/decompress/zstd_decompress_block.c +606 -380
- data/ext/zstdruby/libzstd/decompress/zstd_decompress_block.h +8 -5
- data/ext/zstdruby/libzstd/decompress/zstd_decompress_internal.h +39 -9
- data/ext/zstdruby/libzstd/deprecated/zbuff.h +9 -8
- data/ext/zstdruby/libzstd/deprecated/zbuff_common.c +2 -2
- data/ext/zstdruby/libzstd/deprecated/zbuff_compress.c +1 -1
- data/ext/zstdruby/libzstd/deprecated/zbuff_decompress.c +1 -1
- data/ext/zstdruby/libzstd/dictBuilder/cover.c +55 -46
- data/ext/zstdruby/libzstd/dictBuilder/cover.h +20 -9
- data/ext/zstdruby/libzstd/dictBuilder/divsufsort.c +1 -1
- data/ext/zstdruby/libzstd/dictBuilder/fastcover.c +43 -31
- data/ext/zstdruby/libzstd/dictBuilder/zdict.c +53 -30
- data/ext/zstdruby/libzstd/dll/example/Makefile +2 -1
- data/ext/zstdruby/libzstd/dll/example/README.md +16 -22
- data/ext/zstdruby/libzstd/legacy/zstd_legacy.h +4 -4
- data/ext/zstdruby/libzstd/legacy/zstd_v01.c +24 -14
- data/ext/zstdruby/libzstd/legacy/zstd_v01.h +1 -1
- data/ext/zstdruby/libzstd/legacy/zstd_v02.c +17 -8
- data/ext/zstdruby/libzstd/legacy/zstd_v02.h +1 -1
- data/ext/zstdruby/libzstd/legacy/zstd_v03.c +17 -8
- data/ext/zstdruby/libzstd/legacy/zstd_v03.h +1 -1
- data/ext/zstdruby/libzstd/legacy/zstd_v04.c +25 -11
- data/ext/zstdruby/libzstd/legacy/zstd_v04.h +1 -1
- data/ext/zstdruby/libzstd/legacy/zstd_v05.c +43 -32
- data/ext/zstdruby/libzstd/legacy/zstd_v05.h +2 -2
- data/ext/zstdruby/libzstd/legacy/zstd_v06.c +27 -19
- data/ext/zstdruby/libzstd/legacy/zstd_v06.h +1 -1
- data/ext/zstdruby/libzstd/legacy/zstd_v07.c +32 -20
- data/ext/zstdruby/libzstd/legacy/zstd_v07.h +1 -1
- data/ext/zstdruby/libzstd/libzstd.pc.in +2 -1
- data/ext/zstdruby/libzstd/{dictBuilder/zdict.h → zdict.h} +201 -31
- data/ext/zstdruby/libzstd/zstd.h +740 -153
- data/ext/zstdruby/libzstd/{common/zstd_errors.h → zstd_errors.h} +3 -1
- data/lib/zstd-ruby/version.rb +1 -1
- data/zstd-ruby.gemspec +1 -1
- metadata +21 -10
- data/.travis.yml +0 -14
|
@@ -1,15 +1,26 @@
|
|
|
1
|
+
/*
|
|
2
|
+
* Copyright (c) Facebook, Inc.
|
|
3
|
+
* All rights reserved.
|
|
4
|
+
*
|
|
5
|
+
* This source code is licensed under both the BSD-style license (found in the
|
|
6
|
+
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
|
7
|
+
* in the COPYING file in the root directory of this source tree).
|
|
8
|
+
* You may select, at your option, one of the above-listed licenses.
|
|
9
|
+
*/
|
|
10
|
+
|
|
11
|
+
#ifndef ZDICT_STATIC_LINKING_ONLY
|
|
12
|
+
# define ZDICT_STATIC_LINKING_ONLY
|
|
13
|
+
#endif
|
|
14
|
+
|
|
1
15
|
#include <stdio.h> /* fprintf */
|
|
2
16
|
#include <stdlib.h> /* malloc, free, qsort */
|
|
3
17
|
#include <string.h> /* memset */
|
|
4
18
|
#include <time.h> /* clock */
|
|
5
|
-
#include "mem.h" /* read */
|
|
6
|
-
#include "pool.h"
|
|
7
|
-
#include "threading.h"
|
|
8
|
-
#include "zstd_internal.h" /* includes zstd.h */
|
|
9
|
-
#
|
|
10
|
-
#define ZDICT_STATIC_LINKING_ONLY
|
|
11
|
-
#endif
|
|
12
|
-
#include "zdict.h"
|
|
19
|
+
#include "../common/mem.h" /* read */
|
|
20
|
+
#include "../common/pool.h"
|
|
21
|
+
#include "../common/threading.h"
|
|
22
|
+
#include "../common/zstd_internal.h" /* includes zstd.h */
|
|
23
|
+
#include "../zdict.h"
|
|
13
24
|
|
|
14
25
|
/**
|
|
15
26
|
* COVER_best_t is used for two purposes:
|
|
@@ -142,6 +153,6 @@ void COVER_dictSelectionFree(COVER_dictSelection_t selection);
|
|
|
142
153
|
* smallest dictionary within a specified regression of the compressed size
|
|
143
154
|
* from the largest dictionary.
|
|
144
155
|
*/
|
|
145
|
-
COVER_dictSelection_t COVER_selectDict(BYTE* customDictContent,
|
|
156
|
+
COVER_dictSelection_t COVER_selectDict(BYTE* customDictContent, size_t dictBufferCapacity,
|
|
146
157
|
size_t dictContentSize, const BYTE* samplesBuffer, const size_t* samplesSizes, unsigned nbFinalizeSamples,
|
|
147
158
|
size_t nbCheckSamples, size_t nbSamples, ZDICT_cover_params_t params, size_t* offsets, size_t totalCompressedSize);
|
|
@@ -1576,7 +1576,7 @@ note:
|
|
|
1576
1576
|
/* Construct the inverse suffix array of type B* suffixes using trsort. */
|
|
1577
1577
|
trsort(ISAb, SA, m, 1);
|
|
1578
1578
|
|
|
1579
|
-
/* Set the sorted order of
|
|
1579
|
+
/* Set the sorted order of type B* suffixes. */
|
|
1580
1580
|
for(i = n - 1, j = m, c0 = T[n - 1]; 0 <= i;) {
|
|
1581
1581
|
for(--i, c1 = c0; (0 <= i) && ((c0 = T[i]) >= c1); --i, c1 = c0) { }
|
|
1582
1582
|
if(0 <= i) {
|
|
@@ -1,3 +1,13 @@
|
|
|
1
|
+
/*
|
|
2
|
+
* Copyright (c) Facebook, Inc.
|
|
3
|
+
* All rights reserved.
|
|
4
|
+
*
|
|
5
|
+
* This source code is licensed under both the BSD-style license (found in the
|
|
6
|
+
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
|
7
|
+
* in the COPYING file in the root directory of this source tree).
|
|
8
|
+
* You may select, at your option, one of the above-listed licenses.
|
|
9
|
+
*/
|
|
10
|
+
|
|
1
11
|
/*-*************************************
|
|
2
12
|
* Dependencies
|
|
3
13
|
***************************************/
|
|
@@ -6,15 +16,17 @@
|
|
|
6
16
|
#include <string.h> /* memset */
|
|
7
17
|
#include <time.h> /* clock */
|
|
8
18
|
|
|
9
|
-
#include "mem.h" /* read */
|
|
10
|
-
#include "pool.h"
|
|
11
|
-
#include "threading.h"
|
|
12
|
-
#include "cover.h"
|
|
13
|
-
#include "zstd_internal.h" /* includes zstd.h */
|
|
14
19
|
#ifndef ZDICT_STATIC_LINKING_ONLY
|
|
15
|
-
#define ZDICT_STATIC_LINKING_ONLY
|
|
20
|
+
# define ZDICT_STATIC_LINKING_ONLY
|
|
16
21
|
#endif
|
|
17
|
-
|
|
22
|
+
|
|
23
|
+
#include "../common/mem.h" /* read */
|
|
24
|
+
#include "../common/pool.h"
|
|
25
|
+
#include "../common/threading.h"
|
|
26
|
+
#include "../common/zstd_internal.h" /* includes zstd.h */
|
|
27
|
+
#include "../compress/zstd_compress_internal.h" /* ZSTD_hash*() */
|
|
28
|
+
#include "../zdict.h"
|
|
29
|
+
#include "cover.h"
|
|
18
30
|
|
|
19
31
|
|
|
20
32
|
/*-*************************************
|
|
@@ -23,7 +35,7 @@
|
|
|
23
35
|
#define FASTCOVER_MAX_SAMPLES_SIZE (sizeof(size_t) == 8 ? ((unsigned)-1) : ((unsigned)1 GB))
|
|
24
36
|
#define FASTCOVER_MAX_F 31
|
|
25
37
|
#define FASTCOVER_MAX_ACCEL 10
|
|
26
|
-
#define
|
|
38
|
+
#define FASTCOVER_DEFAULT_SPLITPOINT 0.75
|
|
27
39
|
#define DEFAULT_F 20
|
|
28
40
|
#define DEFAULT_ACCEL 1
|
|
29
41
|
|
|
@@ -31,50 +43,50 @@
|
|
|
31
43
|
/*-*************************************
|
|
32
44
|
* Console display
|
|
33
45
|
***************************************/
|
|
46
|
+
#ifndef LOCALDISPLAYLEVEL
|
|
34
47
|
static int g_displayLevel = 2;
|
|
48
|
+
#endif
|
|
49
|
+
#undef DISPLAY
|
|
35
50
|
#define DISPLAY(...) \
|
|
36
51
|
{ \
|
|
37
52
|
fprintf(stderr, __VA_ARGS__); \
|
|
38
53
|
fflush(stderr); \
|
|
39
54
|
}
|
|
55
|
+
#undef LOCALDISPLAYLEVEL
|
|
40
56
|
#define LOCALDISPLAYLEVEL(displayLevel, l, ...) \
|
|
41
57
|
if (displayLevel >= l) { \
|
|
42
58
|
DISPLAY(__VA_ARGS__); \
|
|
43
59
|
} /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */
|
|
60
|
+
#undef DISPLAYLEVEL
|
|
44
61
|
#define DISPLAYLEVEL(l, ...) LOCALDISPLAYLEVEL(g_displayLevel, l, __VA_ARGS__)
|
|
45
62
|
|
|
63
|
+
#ifndef LOCALDISPLAYUPDATE
|
|
64
|
+
static const clock_t g_refreshRate = CLOCKS_PER_SEC * 15 / 100;
|
|
65
|
+
static clock_t g_time = 0;
|
|
66
|
+
#endif
|
|
67
|
+
#undef LOCALDISPLAYUPDATE
|
|
46
68
|
#define LOCALDISPLAYUPDATE(displayLevel, l, ...) \
|
|
47
69
|
if (displayLevel >= l) { \
|
|
48
|
-
if ((clock() - g_time >
|
|
70
|
+
if ((clock() - g_time > g_refreshRate) || (displayLevel >= 4)) { \
|
|
49
71
|
g_time = clock(); \
|
|
50
72
|
DISPLAY(__VA_ARGS__); \
|
|
51
73
|
} \
|
|
52
74
|
}
|
|
75
|
+
#undef DISPLAYUPDATE
|
|
53
76
|
#define DISPLAYUPDATE(l, ...) LOCALDISPLAYUPDATE(g_displayLevel, l, __VA_ARGS__)
|
|
54
|
-
static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100;
|
|
55
|
-
static clock_t g_time = 0;
|
|
56
77
|
|
|
57
78
|
|
|
58
79
|
/*-*************************************
|
|
59
80
|
* Hash Functions
|
|
60
81
|
***************************************/
|
|
61
|
-
static const U64 prime6bytes = 227718039650203ULL;
|
|
62
|
-
static size_t ZSTD_hash6(U64 u, U32 h) { return (size_t)(((u << (64-48)) * prime6bytes) >> (64-h)) ; }
|
|
63
|
-
static size_t ZSTD_hash6Ptr(const void* p, U32 h) { return ZSTD_hash6(MEM_readLE64(p), h); }
|
|
64
|
-
|
|
65
|
-
static const U64 prime8bytes = 0xCF1BBCDCB7A56463ULL;
|
|
66
|
-
static size_t ZSTD_hash8(U64 u, U32 h) { return (size_t)(((u) * prime8bytes) >> (64-h)) ; }
|
|
67
|
-
static size_t ZSTD_hash8Ptr(const void* p, U32 h) { return ZSTD_hash8(MEM_readLE64(p), h); }
|
|
68
|
-
|
|
69
|
-
|
|
70
82
|
/**
|
|
71
|
-
* Hash the d-byte value pointed to by p and mod 2^f
|
|
83
|
+
* Hash the d-byte value pointed to by p and mod 2^f into the frequency vector
|
|
72
84
|
*/
|
|
73
|
-
static size_t FASTCOVER_hashPtrToIndex(const void* p, U32
|
|
85
|
+
static size_t FASTCOVER_hashPtrToIndex(const void* p, U32 f, unsigned d) {
|
|
74
86
|
if (d == 6) {
|
|
75
|
-
return ZSTD_hash6Ptr(p,
|
|
87
|
+
return ZSTD_hash6Ptr(p, f);
|
|
76
88
|
}
|
|
77
|
-
return ZSTD_hash8Ptr(p,
|
|
89
|
+
return ZSTD_hash8Ptr(p, f);
|
|
78
90
|
}
|
|
79
91
|
|
|
80
92
|
|
|
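The hash helpers removed from this file in the hunk above can be illustrated with a standalone sketch. It is not the library's code: `sketch_read_le64` and `sketch_hash_to_index` are hypothetical names standing in for `MEM_readLE64` and `FASTCOVER_hashPtrToIndex`, and only the two prime constants and the shift arithmetic are taken from the removed lines. It assumes `f` lies between 1 and 31, in line with `FASTCOVER_MAX_F`.

```c
#include <stddef.h>
#include <stdint.h>

/* Read 8 bytes as a little-endian 64-bit value, independent of host byte order.
 * Hypothetical stand-in for MEM_readLE64. */
static uint64_t sketch_read_le64(const void* p) {
    const unsigned char* b = (const unsigned char*)p;
    uint64_t v = 0;
    int i;
    for (i = 7; i >= 0; --i) v = (v << 8) | b[i];
    return v;
}

/* Constants copied from the removed lines of the diff. */
static const uint64_t kSketchPrime6 = 227718039650203ULL;
static const uint64_t kSketchPrime8 = 0xCF1BBCDCB7A56463ULL;

/* Keep the low 6 bytes (shift the top 16 bits out), multiply, keep the top f bits. */
static size_t sketch_hash6(uint64_t u, unsigned f) {
    return (size_t)(((u << (64 - 48)) * kSketchPrime6) >> (64 - f));
}

/* Use all 8 bytes, multiply, keep the top f bits. */
static size_t sketch_hash8(uint64_t u, unsigned f) {
    return (size_t)((u * kSketchPrime8) >> (64 - f));
}

/* Map the d-byte value at p to an index in [0, 2^f). Assumes 1 <= f <= 31
 * and that 8 bytes are readable at p, even when d == 6. */
size_t sketch_hash_to_index(const void* p, unsigned f, unsigned d) {
    uint64_t const u = sketch_read_le64(p);
    return (d == 6) ? sketch_hash6(u, f) : sketch_hash8(u, f);
}
```

With `d == 6` the two most significant bytes of the 64-bit read are shifted out before the multiplication, so only the first six bytes at `p` influence the index.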
@@ -451,20 +463,20 @@ typedef struct FASTCOVER_tryParameters_data_s {
|
|
|
451
463
|
* This function is thread safe if zstd is compiled with multithreaded support.
|
|
452
464
|
* It takes its parameters as an *OWNING* opaque pointer to support threading.
|
|
453
465
|
*/
|
|
454
|
-
static void FASTCOVER_tryParameters(void
|
|
466
|
+
static void FASTCOVER_tryParameters(void* opaque)
|
|
455
467
|
{
|
|
456
468
|
/* Save parameters as local variables */
|
|
457
|
-
FASTCOVER_tryParameters_data_t *const data = (FASTCOVER_tryParameters_data_t
|
|
469
|
+
FASTCOVER_tryParameters_data_t *const data = (FASTCOVER_tryParameters_data_t*)opaque;
|
|
458
470
|
const FASTCOVER_ctx_t *const ctx = data->ctx;
|
|
459
471
|
const ZDICT_cover_params_t parameters = data->parameters;
|
|
460
472
|
size_t dictBufferCapacity = data->dictBufferCapacity;
|
|
461
473
|
size_t totalCompressedSize = ERROR(GENERIC);
|
|
462
474
|
/* Initialize array to keep track of frequency of dmer within activeSegment */
|
|
463
|
-
U16* segmentFreqs = (U16
|
|
475
|
+
U16* segmentFreqs = (U16*)calloc(((U64)1 << ctx->f), sizeof(U16));
|
|
464
476
|
/* Allocate space for hash table, dict, and freqs */
|
|
465
|
-
BYTE *const dict = (BYTE
|
|
477
|
+
BYTE *const dict = (BYTE*)malloc(dictBufferCapacity);
|
|
466
478
|
COVER_dictSelection_t selection = COVER_dictSelectionError(ERROR(GENERIC));
|
|
467
|
-
U32
|
|
479
|
+
U32* freqs = (U32*) malloc(((U64)1 << ctx->f) * sizeof(U32));
|
|
468
480
|
if (!segmentFreqs || !dict || !freqs) {
|
|
469
481
|
DISPLAYLEVEL(1, "Failed to allocate buffers: out of memory\n");
|
|
470
482
|
goto _cleanup;
|
|
@@ -476,7 +488,7 @@ static void FASTCOVER_tryParameters(void *opaque)
|
|
|
476
488
|
parameters, segmentFreqs);
|
|
477
489
|
|
|
478
490
|
const unsigned nbFinalizeSamples = (unsigned)(ctx->nbTrainSamples * ctx->accelParams.finalize / 100);
|
|
479
|
-
selection = COVER_selectDict(dict + tail, dictBufferCapacity - tail,
|
|
491
|
+
selection = COVER_selectDict(dict + tail, dictBufferCapacity, dictBufferCapacity - tail,
|
|
480
492
|
ctx->samples, ctx->samplesSizes, nbFinalizeSamples, ctx->nbTrainSamples, ctx->nbSamples, parameters, ctx->offsets,
|
|
481
493
|
totalCompressedSize);
|
|
482
494
|
|
|
@@ -607,7 +619,7 @@ ZDICT_optimizeTrainFromBuffer_fastCover(
|
|
|
607
619
|
/* constants */
|
|
608
620
|
const unsigned nbThreads = parameters->nbThreads;
|
|
609
621
|
const double splitPoint =
|
|
610
|
-
parameters->splitPoint <= 0.0 ?
|
|
622
|
+
parameters->splitPoint <= 0.0 ? FASTCOVER_DEFAULT_SPLITPOINT : parameters->splitPoint;
|
|
611
623
|
const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d;
|
|
612
624
|
const unsigned kMaxD = parameters->d == 0 ? 8 : parameters->d;
|
|
613
625
|
const unsigned kMinK = parameters->k == 0 ? 50 : parameters->k;
|
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
/*
|
|
2
|
-
* Copyright (c)
|
|
2
|
+
* Copyright (c) Yann Collet, Facebook, Inc.
|
|
3
3
|
* All rights reserved.
|
|
4
4
|
*
|
|
5
5
|
* This source code is licensed under both the BSD-style license (found in the
|
|
@@ -23,9 +23,13 @@
|
|
|
23
23
|
/* Unix Large Files support (>4GB) */
|
|
24
24
|
#define _FILE_OFFSET_BITS 64
|
|
25
25
|
#if (defined(__sun__) && (!defined(__LP64__))) /* Sun Solaris 32-bits requires specific definitions */
|
|
26
|
+
# ifndef _LARGEFILE_SOURCE
|
|
26
27
|
# define _LARGEFILE_SOURCE
|
|
28
|
+
# endif
|
|
27
29
|
#elif ! defined(__LP64__) /* No point defining Large file for 64 bit */
|
|
30
|
+
# ifndef _LARGEFILE64_SOURCE
|
|
28
31
|
# define _LARGEFILE64_SOURCE
|
|
32
|
+
# endif
|
|
29
33
|
#endif
|
|
30
34
|
|
|
31
35
|
|
|
@@ -37,17 +41,19 @@
|
|
|
37
41
|
#include <stdio.h> /* fprintf, fopen, ftello64 */
|
|
38
42
|
#include <time.h> /* clock */
|
|
39
43
|
|
|
40
|
-
#include "mem.h" /* read */
|
|
41
|
-
#include "fse.h" /* FSE_normalizeCount, FSE_writeNCount */
|
|
42
|
-
#define HUF_STATIC_LINKING_ONLY
|
|
43
|
-
#include "huf.h" /* HUF_buildCTable, HUF_writeCTable */
|
|
44
|
-
#include "zstd_internal.h" /* includes zstd.h */
|
|
45
|
-
#include "xxhash.h" /* XXH64 */
|
|
46
|
-
#include "divsufsort.h"
|
|
47
44
|
#ifndef ZDICT_STATIC_LINKING_ONLY
|
|
48
45
|
# define ZDICT_STATIC_LINKING_ONLY
|
|
49
46
|
#endif
|
|
50
|
-
#
|
|
47
|
+
#define HUF_STATIC_LINKING_ONLY
|
|
48
|
+
|
|
49
|
+
#include "../common/mem.h" /* read */
|
|
50
|
+
#include "../common/fse.h" /* FSE_normalizeCount, FSE_writeNCount */
|
|
51
|
+
#include "../common/huf.h" /* HUF_buildCTable, HUF_writeCTable */
|
|
52
|
+
#include "../common/zstd_internal.h" /* includes zstd.h */
|
|
53
|
+
#include "../common/xxhash.h" /* XXH64 */
|
|
54
|
+
#include "../compress/zstd_compress_internal.h" /* ZSTD_loadCEntropy() */
|
|
55
|
+
#include "../zdict.h"
|
|
56
|
+
#include "divsufsort.h"
|
|
51
57
|
|
|
52
58
|
|
|
53
59
|
/*-*************************************
|
|
@@ -61,14 +67,15 @@
|
|
|
61
67
|
|
|
62
68
|
#define NOISELENGTH 32
|
|
63
69
|
|
|
64
|
-
static const int g_compressionLevel_default = 3;
|
|
65
70
|
static const U32 g_selectivity_default = 9;
|
|
66
71
|
|
|
67
72
|
|
|
68
73
|
/*-*************************************
|
|
69
74
|
* Console display
|
|
70
75
|
***************************************/
|
|
76
|
+
#undef DISPLAY
|
|
71
77
|
#define DISPLAY(...) { fprintf(stderr, __VA_ARGS__); fflush( stderr ); }
|
|
78
|
+
#undef DISPLAYLEVEL
|
|
72
79
|
#define DISPLAYLEVEL(l, ...) if (notificationLevel>=l) { DISPLAY(__VA_ARGS__); } /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */
|
|
73
80
|
|
|
74
81
|
static clock_t ZDICT_clockSpan(clock_t nPrevious) { return clock() - nPrevious; }
|
|
@@ -99,6 +106,26 @@ unsigned ZDICT_getDictID(const void* dictBuffer, size_t dictSize)
|
|
|
99
106
|
return MEM_readLE32((const char*)dictBuffer + 4);
|
|
100
107
|
}
|
|
101
108
|
|
|
109
|
+
size_t ZDICT_getDictHeaderSize(const void* dictBuffer, size_t dictSize)
|
|
110
|
+
{
|
|
111
|
+
size_t headerSize;
|
|
112
|
+
if (dictSize <= 8 || MEM_readLE32(dictBuffer) != ZSTD_MAGIC_DICTIONARY) return ERROR(dictionary_corrupted);
|
|
113
|
+
|
|
114
|
+
{ ZSTD_compressedBlockState_t* bs = (ZSTD_compressedBlockState_t*)malloc(sizeof(ZSTD_compressedBlockState_t));
|
|
115
|
+
U32* wksp = (U32*)malloc(HUF_WORKSPACE_SIZE);
|
|
116
|
+
if (!bs || !wksp) {
|
|
117
|
+
headerSize = ERROR(memory_allocation);
|
|
118
|
+
} else {
|
|
119
|
+
ZSTD_reset_compressedBlockState(bs);
|
|
120
|
+
headerSize = ZSTD_loadCEntropy(bs, wksp, dictBuffer, dictSize);
|
|
121
|
+
}
|
|
122
|
+
|
|
123
|
+
free(bs);
|
|
124
|
+
free(wksp);
|
|
125
|
+
}
|
|
126
|
+
|
|
127
|
+
return headerSize;
|
|
128
|
+
}
|
|
102
129
|
|
|
103
130
|
/*-********************************************************
|
|
104
131
|
* Dictionary training functions
|
|
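The guard at the top of the new `ZDICT_getDictHeaderSize()` rejects buffers that are too short or do not start with the dictionary magic number. A minimal standalone sketch of that check follows. The names `sketch_read_le32` and `sketch_has_dict_header` are hypothetical, and the magic value is written out on the assumption that it matches `ZSTD_MAGIC_DICTIONARY` in `zstd.h`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match ZSTD_MAGIC_DICTIONARY in zstd.h. */
#define SKETCH_MAGIC_DICTIONARY 0xEC30A437U

/* Read 4 bytes as a little-endian 32-bit value, independent of host byte order.
 * Hypothetical stand-in for MEM_readLE32. */
static uint32_t sketch_read_le32(const void* p) {
    const unsigned char* b = (const unsigned char*)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Returns 1 when the buffer is longer than 8 bytes and starts with the
 * dictionary magic number, 0 otherwise. Mirrors only the guard, not the
 * entropy-table parsing that follows it in the real function. */
int sketch_has_dict_header(const void* dictBuffer, size_t dictSize) {
    if (dictBuffer == NULL || dictSize <= 8) return 0;
    return sketch_read_le32(dictBuffer) == SKETCH_MAGIC_DICTIONARY;
}
```

The function in the diff then allocates two scratch buffers and frees both unconditionally on the way out, which is safe because `free(NULL)` is a no-op in standard C.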
@@ -508,6 +535,7 @@ static size_t ZDICT_trainBuffer_legacy(dictItem* dictList, U32 dictListSize,
|
|
|
508
535
|
clock_t displayClock = 0;
|
|
509
536
|
clock_t const refreshRate = CLOCKS_PER_SEC * 3 / 10;
|
|
510
537
|
|
|
538
|
+
# undef DISPLAYUPDATE
|
|
511
539
|
# define DISPLAYUPDATE(l, ...) if (notificationLevel>=l) { \
|
|
512
540
|
if (ZDICT_clockSpan(displayClock) > refreshRate) \
|
|
513
541
|
{ displayClock = clock(); DISPLAY(__VA_ARGS__); \
|
|
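The `DISPLAYUPDATE` macro in the hunk above throttles progress output: it prints only when more than `refreshRate` clock ticks have passed since the last print. The sketch below expresses the same idea as a plain function so that it can be exercised without a real clock. `sketch_should_display` is a hypothetical name, and taking `now` as a parameter is a testing convenience, not something the library does.

```c
#include <time.h>

/* Returns 1 and records `now` in *lastDisplay when more than `refreshRate`
 * ticks have elapsed since the last recorded display; returns 0 otherwise.
 * In the macro, `now` would be the result of clock(). */
int sketch_should_display(clock_t now, clock_t* lastDisplay, clock_t refreshRate) {
    if (now - *lastDisplay > refreshRate) {
        *lastDisplay = now;
        return 1;
    }
    return 0;
}
```

A caller would pass `clock()` as `now` and something like `CLOCKS_PER_SEC * 3 / 10` as `refreshRate`, the value used by `ZDICT_trainBuffer_legacy()` in this hunk.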
@@ -571,7 +599,7 @@ static void ZDICT_fillNoise(void* buffer, size_t length)
|
|
|
571
599
|
unsigned const prime1 = 2654435761U;
|
|
572
600
|
unsigned const prime2 = 2246822519U;
|
|
573
601
|
unsigned acc = prime1;
|
|
574
|
-
size_t p=0
|
|
602
|
+
size_t p=0;
|
|
575
603
|
for (p=0; p<length; p++) {
|
|
576
604
|
acc *= prime2;
|
|
577
605
|
((unsigned char*)buffer)[p] = (unsigned char)(acc >> 21);
|
|
@@ -588,12 +616,12 @@ typedef struct
|
|
|
588
616
|
|
|
589
617
|
#define MAXREPOFFSET 1024
|
|
590
618
|
|
|
591
|
-
static void ZDICT_countEStats(EStats_ress_t esr, ZSTD_parameters params,
|
|
619
|
+
static void ZDICT_countEStats(EStats_ress_t esr, const ZSTD_parameters* params,
|
|
592
620
|
unsigned* countLit, unsigned* offsetcodeCount, unsigned* matchlengthCount, unsigned* litlengthCount, U32* repOffsets,
|
|
593
621
|
const void* src, size_t srcSize,
|
|
594
622
|
U32 notificationLevel)
|
|
595
623
|
{
|
|
596
|
-
size_t const blockSizeMax = MIN (ZSTD_BLOCKSIZE_MAX, 1 << params
|
|
624
|
+
size_t const blockSizeMax = MIN (ZSTD_BLOCKSIZE_MAX, 1 << params->cParams.windowLog);
|
|
597
625
|
size_t cSize;
|
|
598
626
|
|
|
599
627
|
if (srcSize > blockSizeMax) srcSize = blockSizeMax; /* protection vs large samples */
|
|
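The hunk above changes `ZDICT_countEStats()` to take its parameters by `const` pointer and derives the block size limit from `windowLog`. A reduced sketch of that computation is shown below. The `sketch_*` types and functions are hypothetical simplifications of `ZSTD_parameters`, and the 128 KB limit is written out on the assumption that it equals `ZSTD_BLOCKSIZE_MAX`. The sketch also assumes `windowLog` is smaller than the bit width of `size_t`.

```c
#include <stddef.h>

/* Assumed to equal ZSTD_BLOCKSIZE_MAX (128 KB). */
#define SKETCH_BLOCKSIZE_MAX ((size_t)1 << 17)

/* Hypothetical, reduced versions of the zstd parameter structs. */
typedef struct { unsigned windowLog; } sketch_cparams;
typedef struct { sketch_cparams cParams; } sketch_params;

/* Smaller of the global block limit and the window size, read through a
 * const pointer so the struct is not copied on every call. */
size_t sketch_block_size_max(const sketch_params* params) {
    size_t const windowSize = (size_t)1 << params->cParams.windowLog;
    return windowSize < SKETCH_BLOCKSIZE_MAX ? windowSize : SKETCH_BLOCKSIZE_MAX;
}

/* Clamp a sample size to the block limit ("protection vs large samples"). */
size_t sketch_clamp_sample(size_t srcSize, const sketch_params* params) {
    size_t const blockSizeMax = sketch_block_size_max(params);
    return srcSize > blockSizeMax ? blockSizeMax : srcSize;
}
```

Passing the struct by pointer changes only the calling convention; the clamping behavior is the same as in the by-value version.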
@@ -682,7 +710,7 @@ static void ZDICT_flatLit(unsigned* countLit)
|
|
|
682
710
|
|
|
683
711
|
#define OFFCODE_MAX 30 /* only applicable to first block */
|
|
684
712
|
static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
|
|
685
|
-
|
|
713
|
+
int compressionLevel,
|
|
686
714
|
const void* srcBuffer, const size_t* fileSizes, unsigned nbFiles,
|
|
687
715
|
const void* dictBuffer, size_t dictBufferSize,
|
|
688
716
|
unsigned notificationLevel)
|
|
@@ -717,7 +745,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
|
|
|
717
745
|
memset(repOffset, 0, sizeof(repOffset));
|
|
718
746
|
repOffset[1] = repOffset[4] = repOffset[8] = 1;
|
|
719
747
|
memset(bestRepOffset, 0, sizeof(bestRepOffset));
|
|
720
|
-
if (compressionLevel==0) compressionLevel =
|
|
748
|
+
if (compressionLevel==0) compressionLevel = ZSTD_CLEVEL_DEFAULT;
|
|
721
749
|
params = ZSTD_getParams(compressionLevel, averageSampleSize, dictBufferSize);
|
|
722
750
|
|
|
723
751
|
esr.dict = ZSTD_createCDict_advanced(dictBuffer, dictBufferSize, ZSTD_dlm_byRef, ZSTD_dct_rawContent, params.cParams, ZSTD_defaultCMem);
|
|
@@ -731,7 +759,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
|
|
|
731
759
|
|
|
732
760
|
/* collect stats on all samples */
|
|
733
761
|
for (u=0; u<nbFiles; u++) {
|
|
734
|
-
ZDICT_countEStats(esr, params,
|
|
762
|
+
ZDICT_countEStats(esr, &params,
|
|
735
763
|
countLit, offcodeCount, matchLengthCount, litLengthCount, repOffset,
|
|
736
764
|
(const char*)srcBuffer + pos, fileSizes[u],
|
|
737
765
|
notificationLevel);
|
|
@@ -762,7 +790,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
|
|
|
762
790
|
/* note : the result of this phase should be used to better appreciate the impact on statistics */
|
|
763
791
|
|
|
764
792
|
total=0; for (u=0; u<=offcodeMax; u++) total+=offcodeCount[u];
|
|
765
|
-
errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, offcodeMax);
|
|
793
|
+
errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, offcodeMax, /* useLowProbCount */ 1);
|
|
766
794
|
if (FSE_isError(errorCode)) {
|
|
767
795
|
eSize = errorCode;
|
|
768
796
|
DISPLAYLEVEL(1, "FSE_normalizeCount error with offcodeCount \n");
|
|
@@ -771,7 +799,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
|
|
|
771
799
|
Offlog = (U32)errorCode;
|
|
772
800
|
|
|
773
801
|
total=0; for (u=0; u<=MaxML; u++) total+=matchLengthCount[u];
|
|
774
|
-
errorCode = FSE_normalizeCount(matchLengthNCount, mlLog, matchLengthCount, total, MaxML);
|
|
802
|
+
errorCode = FSE_normalizeCount(matchLengthNCount, mlLog, matchLengthCount, total, MaxML, /* useLowProbCount */ 1);
|
|
775
803
|
if (FSE_isError(errorCode)) {
|
|
776
804
|
eSize = errorCode;
|
|
777
805
|
DISPLAYLEVEL(1, "FSE_normalizeCount error with matchLengthCount \n");
|
|
@@ -780,7 +808,7 @@ static size_t ZDICT_analyzeEntropy(void* dstBuffer, size_t maxDstSize,
|
|
|
780
808
|
mlLog = (U32)errorCode;
|
|
781
809
|
|
|
782
810
|
total=0; for (u=0; u<=MaxLL; u++) total+=litLengthCount[u];
|
|
783
|
-
errorCode = FSE_normalizeCount(litLengthNCount, llLog, litLengthCount, total, MaxLL);
|
|
811
|
+
errorCode = FSE_normalizeCount(litLengthNCount, llLog, litLengthCount, total, MaxLL, /* useLowProbCount */ 1);
|
|
784
812
|
if (FSE_isError(errorCode)) {
|
|
785
813
|
eSize = errorCode;
|
|
786
814
|
DISPLAYLEVEL(1, "FSE_normalizeCount error with litLengthCount \n");
|
|
@@ -869,7 +897,7 @@ size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity,
|
|
|
869
897
|
size_t hSize;
|
|
870
898
|
#define HBUFFSIZE 256 /* should prove large enough for all entropy headers */
|
|
871
899
|
BYTE header[HBUFFSIZE];
|
|
872
|
-
int const compressionLevel = (params.compressionLevel == 0) ?
|
|
900
|
+
int const compressionLevel = (params.compressionLevel == 0) ? ZSTD_CLEVEL_DEFAULT : params.compressionLevel;
|
|
873
901
|
U32 const notificationLevel = params.notificationLevel;
|
|
874
902
|
|
|
875
903
|
/* check conditions */
|
|
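Several added lines in this file resolve a compression level of 0 to `ZSTD_CLEVEL_DEFAULT`, and the file-local constant `g_compressionLevel_default` (value 3) is removed earlier in the diff. The fallback itself is a one-line rule, sketched here with a hypothetical helper name and a locally defined constant that is assumed to match `ZSTD_CLEVEL_DEFAULT`.

```c
/* Assumed to match ZSTD_CLEVEL_DEFAULT in zstd.h. */
#define SKETCH_CLEVEL_DEFAULT 3

/* A requested level of 0 means "use the default"; every other value,
 * including negative levels, is passed through unchanged. */
int sketch_resolve_level(int requested) {
    return requested == 0 ? SKETCH_CLEVEL_DEFAULT : requested;
}
```

Only the exact value 0 triggers the fallback, so negative levels are not rewritten.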
@@ -915,7 +943,7 @@ static size_t ZDICT_addEntropyTablesFromBuffer_advanced(
|
|
|
915
943
|
const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
|
|
916
944
|
ZDICT_params_t params)
|
|
917
945
|
{
|
|
918
|
-
int const compressionLevel = (params.compressionLevel == 0) ?
|
|
946
|
+
int const compressionLevel = (params.compressionLevel == 0) ? ZSTD_CLEVEL_DEFAULT : params.compressionLevel;
|
|
919
947
|
U32 const notificationLevel = params.notificationLevel;
|
|
920
948
|
size_t hSize = 8;
|
|
921
949
|
|
|
@@ -944,16 +972,11 @@ static size_t ZDICT_addEntropyTablesFromBuffer_advanced(
|
|
|
944
972
|
return MIN(dictBufferCapacity, hSize+dictContentSize);
|
|
945
973
|
}
|
|
946
974
|
|
|
947
|
-
/* Hidden declaration for dbio.c */
|
|
948
|
-
size_t ZDICT_trainFromBuffer_unsafe_legacy(
|
|
949
|
-
void* dictBuffer, size_t maxDictSize,
|
|
950
|
-
const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
|
|
951
|
-
ZDICT_legacy_params_t params);
|
|
952
975
|
/*! ZDICT_trainFromBuffer_unsafe_legacy() :
|
|
953
|
-
* Warning : `samplesBuffer` must be followed by noisy guard band
|
|
976
|
+
* Warning : `samplesBuffer` must be followed by noisy guard band !!!
|
|
954
977
|
* @return : size of dictionary, or an error code which can be tested with ZDICT_isError()
|
|
955
978
|
*/
|
|
956
|
-
size_t ZDICT_trainFromBuffer_unsafe_legacy(
|
|
979
|
+
static size_t ZDICT_trainFromBuffer_unsafe_legacy(
|
|
957
980
|
void* dictBuffer, size_t maxDictSize,
|
|
958
981
|
const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
|
|
959
982
|
ZDICT_legacy_params_t params)
|
|
@@ -1090,8 +1113,8 @@ size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity,
|
|
|
1090
1113
|
memset(&params, 0, sizeof(params));
|
|
1091
1114
|
params.d = 8;
|
|
1092
1115
|
params.steps = 4;
|
|
1093
|
-
/*
|
|
1094
|
-
params.zParams.compressionLevel =
|
|
1116
|
+
/* Use default level since no compression level information is available */
|
|
1117
|
+
params.zParams.compressionLevel = ZSTD_CLEVEL_DEFAULT;
|
|
1095
1118
|
#if defined(DEBUGLEVEL) && (DEBUGLEVEL>=1)
|
|
1096
1119
|
params.zParams.notificationLevel = DEBUGLEVEL;
|
|
1097
1120
|
#endif
|
|
@@ -1,10 +1,11 @@
|
|
|
1
1
|
# ################################################################
|
|
2
|
-
# Copyright (c)
|
|
2
|
+
# Copyright (c) Yann Collet, Facebook, Inc.
|
|
3
3
|
# All rights reserved.
|
|
4
4
|
#
|
|
5
5
|
# This source code is licensed under both the BSD-style license (found in the
|
|
6
6
|
# LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
|
7
7
|
# in the COPYING file in the root directory of this source tree).
|
|
8
|
+
# You may select, at your option, one of the above-listed licenses.
|
|
8
9
|
# ################################################################
|
|
9
10
|
|
|
10
11
|
VOID := /dev/null
|
|
@@ -1,23 +1,21 @@
|
|
|
1
|
-
ZSTD Windows binary package
|
|
2
|
-
====================================
|
|
1
|
+
# ZSTD Windows binary package
|
|
3
2
|
|
|
4
|
-
|
|
3
|
+
## The package contents
|
|
5
4
|
|
|
6
|
-
- `zstd.exe`
|
|
7
|
-
- `dll\libzstd.dll`
|
|
8
|
-
- `dll\libzstd.lib`
|
|
9
|
-
- `example\`
|
|
10
|
-
- `include\`
|
|
5
|
+
- `zstd.exe` : Command Line Utility, supporting gzip-like arguments
|
|
6
|
+
- `dll\libzstd.dll` : The ZSTD dynamic library (DLL)
|
|
7
|
+
- `dll\libzstd.lib` : The import library of the ZSTD dynamic library (DLL) for Visual C++
|
|
8
|
+
- `example\` : The example of usage of the ZSTD library
|
|
9
|
+
- `include\` : Header files required by the ZSTD library
|
|
11
10
|
- `static\libzstd_static.lib` : The static ZSTD library (LIB)
|
|
12
11
|
|
|
13
|
-
|
|
14
|
-
#### Usage of Command Line Interface
|
|
12
|
+
## Usage of Command Line Interface
|
|
15
13
|
|
|
16
14
|
Command Line Interface (CLI) supports gzip-like arguments.
|
|
17
15
|
By default CLI takes an input file and compresses it to an output file:
|
|
18
|
-
|
|
16
|
+
|
|
19
17
|
Usage: zstd [arg] [input] [output]
|
|
20
|
-
|
|
18
|
+
|
|
21
19
|
The full list of commands for CLI can be obtained with `-h` or `-H`. The ratio can
|
|
22
20
|
be improved with commands from `-3` to `-16` but higher levels also have slower
|
|
23
21
|
compression. CLI includes in-memory compression benchmark module with compression
|
|
@@ -25,36 +23,32 @@ levels starting from `-b` and ending with `-e` with iteration time of `-i` secon
|
|
|
25
23
|
CLI supports aggregation of parameters i.e. `-b1`, `-e18`, and `-i1` can be joined
|
|
26
24
|
into `-b1e18i1`.
|
|
27
25
|
|
|
28
|
-
|
|
29
|
-
#### The example of usage of static and dynamic ZSTD libraries with gcc/MinGW
|
|
26
|
+
## The example of usage of static and dynamic ZSTD libraries with gcc/MinGW
|
|
30
27
|
|
|
31
28
|
Use `cd example` and `make` to build `fullbench-dll` and `fullbench-lib`.
|
|
32
29
|
`fullbench-dll` uses a dynamic ZSTD library from the `dll` directory.
|
|
33
30
|
`fullbench-lib` uses a static ZSTD library from the `lib` directory.
|
|
34
31
|
|
|
35
|
-
|
|
36
|
-
#### Using ZSTD DLL with gcc/MinGW
|
|
32
|
+
## Using ZSTD DLL with gcc/MinGW
|
|
37
33
|
|
|
38
34
|
The header files from `include\` and the dynamic library `dll\libzstd.dll`
|
|
39
35
|
are required to compile a project using gcc/MinGW.
|
|
40
36
|
The dynamic library has to be added to linking options.
|
|
41
37
|
It means that if a project that uses ZSTD consists of a single `test-dll.c`
|
|
42
38
|
file it should be linked with `dll\libzstd.dll`. For example:
|
|
43
|
-
|
|
39
|
+
|
|
44
40
|
gcc $(CFLAGS) -Iinclude\ test-dll.c -o test-dll dll\libzstd.dll
|
|
45
|
-
```
|
|
46
|
-
The compiled executable will require ZSTD DLL which is available at `dll\libzstd.dll`.
|
|
47
41
|
|
|
42
|
+
The compiled executable will require ZSTD DLL which is available at `dll\libzstd.dll`.
|
|
48
43
|
|
|
49
|
-
|
|
44
|
+
## The example of usage of static and dynamic ZSTD libraries with Visual C++
|
|
50
45
|
|
|
51
46
|
Open `example\fullbench-dll.sln` to compile `fullbench-dll` that uses a
|
|
52
47
|
dynamic ZSTD library from the `dll` directory. The solution works with Visual C++
|
|
53
48
|
2010 or newer. When one will open the solution with Visual C++ newer than 2010
|
|
54
49
|
then the solution will upgraded to the current version.
|
|
55
50
|
|
|
56
|
-
|
|
57
|
-
#### Using ZSTD DLL with Visual C++
|
|
51
|
+
## Using ZSTD DLL with Visual C++
|
|
58
52
|
|
|
59
53
|
The header files from `include\` and the import library `dll\libzstd.lib`
|
|
60
54
|
are required to compile a project using Visual C++.
|