snappy 0.0.10-java → 0.0.11-java
- checksums.yaml +4 -4
- data/.gitmodules +3 -0
- data/Rakefile +12 -13
- data/ext/extconf.rb +22 -31
- data/lib/snappy/reader.rb +10 -7
- data/lib/snappy/version.rb +1 -1
- data/snappy.gemspec +24 -0
- data/test/test-snappy-reader.rb +16 -0
- data/vendor/snappy/AUTHORS +1 -0
- data/vendor/snappy/COPYING +54 -0
- data/vendor/snappy/ChangeLog +1916 -0
- data/vendor/snappy/Makefile.am +23 -0
- data/vendor/snappy/NEWS +128 -0
- data/vendor/snappy/README +135 -0
- data/vendor/snappy/autogen.sh +7 -0
- data/vendor/snappy/configure.ac +133 -0
- data/vendor/snappy/format_description.txt +110 -0
- data/vendor/snappy/framing_format.txt +135 -0
- data/vendor/snappy/m4/gtest.m4 +74 -0
- data/vendor/snappy/snappy-c.cc +90 -0
- data/vendor/snappy/snappy-c.h +138 -0
- data/vendor/snappy/snappy-internal.h +150 -0
- data/vendor/snappy/snappy-sinksource.cc +71 -0
- data/vendor/snappy/snappy-sinksource.h +137 -0
- data/vendor/snappy/snappy-stubs-internal.cc +42 -0
- data/vendor/snappy/snappy-stubs-internal.h +491 -0
- data/vendor/snappy/snappy-stubs-public.h.in +98 -0
- data/vendor/snappy/snappy-test.cc +606 -0
- data/vendor/snappy/snappy-test.h +582 -0
- data/vendor/snappy/snappy.cc +1306 -0
- data/vendor/snappy/snappy.h +184 -0
- data/vendor/snappy/snappy_unittest.cc +1355 -0
- data/vendor/snappy/testdata/alice29.txt +3609 -0
- data/vendor/snappy/testdata/asyoulik.txt +4122 -0
- data/vendor/snappy/testdata/baddata1.snappy +0 -0
- data/vendor/snappy/testdata/baddata2.snappy +0 -0
- data/vendor/snappy/testdata/baddata3.snappy +0 -0
- data/vendor/snappy/testdata/fireworks.jpeg +0 -0
- data/vendor/snappy/testdata/geo.protodata +0 -0
- data/vendor/snappy/testdata/html +1 -0
- data/vendor/snappy/testdata/html_x_4 +1 -0
- data/vendor/snappy/testdata/kppkn.gtb +0 -0
- data/vendor/snappy/testdata/lcet10.txt +7519 -0
- data/vendor/snappy/testdata/paper-100k.pdf +600 -2
- data/vendor/snappy/testdata/plrabn12.txt +10699 -0
- data/vendor/snappy/testdata/urls.10K +10000 -0
- metadata +57 -18
data/vendor/snappy/Makefile.am
ADDED
@@ -0,0 +1,23 @@
+ACLOCAL_AMFLAGS = -I m4
+
+# Library.
+lib_LTLIBRARIES = libsnappy.la
+libsnappy_la_SOURCES = snappy.cc snappy-sinksource.cc snappy-stubs-internal.cc snappy-c.cc
+libsnappy_la_LDFLAGS = -version-info $(SNAPPY_LTVERSION)
+
+include_HEADERS = snappy.h snappy-sinksource.h snappy-stubs-public.h snappy-c.h
+noinst_HEADERS = snappy-internal.h snappy-stubs-internal.h snappy-test.h
+
+# Unit tests and benchmarks.
+snappy_unittest_CPPFLAGS = $(gflags_CFLAGS) $(GTEST_CPPFLAGS)
+snappy_unittest_SOURCES = snappy_unittest.cc snappy-test.cc
+snappy_unittest_LDFLAGS = $(GTEST_LDFLAGS)
+snappy_unittest_LDADD = libsnappy.la $(UNITTEST_LIBS) $(gflags_LIBS) $(GTEST_LIBS)
+TESTS = snappy_unittest
+noinst_PROGRAMS = $(TESTS)
+
+EXTRA_DIST = autogen.sh testdata/alice29.txt testdata/asyoulik.txt testdata/baddata1.snappy testdata/baddata2.snappy testdata/baddata3.snappy testdata/geo.protodata testdata/fireworks.jpeg testdata/html testdata/html_x_4 testdata/kppkn.gtb testdata/lcet10.txt testdata/paper-100k.pdf testdata/plrabn12.txt testdata/urls.10K
+dist_doc_DATA = ChangeLog COPYING INSTALL NEWS README format_description.txt framing_format.txt
+
+libtool: $(LIBTOOL_DEPS)
+	$(SHELL) ./config.status --recheck
data/vendor/snappy/NEWS
ADDED
@@ -0,0 +1,128 @@
+Snappy v1.1.2, February 28th 2014:
+
+This is a maintenance release with no changes to the actual library
+source code.
+
+* Stop distributing benchmark data files that have unclear
+or unsuitable licensing.
+
+* Add support for padding chunks in the framing format.
+
+
+Snappy v1.1.1, October 15th 2013:
+
+* Add support for uncompressing to iovecs (scatter I/O).
+The bulk of this patch was contributed by Mohit Aron.
+
+* Speed up decompression by ~2%; much more so (~13-20%) on
+a few benchmarks on given compilers and CPUs.
+
+* Fix a few issues with MSVC compilation.
+
+* Support truncated test data in the benchmark.
+
+
+Snappy v1.1.0, January 18th 2013:
+
+* Snappy now uses 64 kB block size instead of 32 kB. On average,
+this means it compresses about 3% denser (more so for some
+inputs), at the same or better speeds.
+
+* libsnappy no longer depends on iostream.
+
+* Some small performance improvements in compression on x86
+(0.5–1%).
+
+* Various portability fixes for ARM-based platforms, for MSVC,
+and for GNU/Hurd.
+
+
+Snappy v1.0.5, February 24th 2012:
+
+* More speed improvements. Exactly how big will depend on
+the architecture:
+
+- 3–10% faster decompression for the base case (x86-64).
+
+- ARMv7 and higher can now use unaligned accesses,
+and will see about 30% faster decompression and
+20–40% faster compression.
+
+- 32-bit platforms (ARM and 32-bit x86) will see 2–5%
+faster compression.
+
+These are all cumulative (e.g., ARM gets all three speedups).
+
+* Fixed an issue where the unit test would crash on system
+with less than 256 MB address space available,
+e.g. some embedded platforms.
+
+* Added a framing format description, for use over e.g. HTTP,
+or for a command-line compressor. We do not have any
+implementations of this at the current point, but there seems
+to be enough of a general interest in the topic.
+Also make the format description slightly clearer.
+
+* Remove some compile-time warnings in -Wall
+(mostly signed/unsigned comparisons), for easier embedding
+into projects that use -Wall -Werror.
+
+
+Snappy v1.0.4, September 15th 2011:
+
+* Speeded up the decompressor somewhat; typically about 2–8%
+for Core i7, in 64-bit mode (comparable for Opteron).
+Somewhat more for some tests, almost no gain for others.
+
+* Make Snappy compile on certain platforms it didn't before
+(Solaris with SunPro C++, HP-UX, AIX).
+
+* Correct some minor errors in the format description.
+
+
+Snappy v1.0.3, June 2nd 2011:
+
+* Speeded up the decompressor somewhat; about 3-6% for Core 2,
+6-13% for Core i7, and 5-12% for Opteron (all in 64-bit mode).
+
+* Added compressed format documentation. This text is new,
+but an earlier version from Zeev Tarantov was used as reference.
+
+* Only link snappy_unittest against -lz and other autodetected
+libraries, not libsnappy.so (which doesn't need any such dependency).
+
+* Fixed some display issues in the microbenchmarks, one of which would
+frequently make the test crash on GNU/Hurd.
+
+
+Snappy v1.0.2, April 29th 2011:
+
+* Relicense to a BSD-type license.
+
+* Added C bindings, contributed by Martin Gieseking.
+
+* More Win32 fixes, in particular for MSVC.
+
+* Replace geo.protodata with a newer version.
+
+* Fix timing inaccuracies in the unit test when comparing Snappy
+to other algorithms.
+
+
+Snappy v1.0.1, March 25th 2011:
+
+This is a maintenance release, mostly containing minor fixes.
+There is no new functionality. The most important fixes include:
+
+* The COPYING file and all licensing headers now correctly state that
+Snappy is licensed under the Apache 2.0 license.
+
+* snappy_unittest should now compile natively under Windows,
+as well as on embedded systems with no mmap().
+
+* Various autotools nits have been fixed.
+
+
+Snappy v1.0, March 17th 2011:
+
+* Initial version.
data/vendor/snappy/README
ADDED
@@ -0,0 +1,135 @@
+Snappy, a fast compressor/decompressor.
+
+
+Introduction
+============
+
+Snappy is a compression/decompression library. It does not aim for maximum
+compression, or compatibility with any other compression library; instead,
+it aims for very high speeds and reasonable compression. For instance,
+compared to the fastest mode of zlib, Snappy is an order of magnitude faster
+for most inputs, but the resulting compressed files are anywhere from 20% to
+100% bigger. (For more information, see "Performance", below.)
+
+Snappy has the following properties:
+
+* Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code.
+See "Performance" below.
+* Stable: Over the last few years, Snappy has compressed and decompressed
+petabytes of data in Google's production environment. The Snappy bitstream
+format is stable and will not change between versions.
+* Robust: The Snappy decompressor is designed not to crash in the face of
+corrupted or malicious input.
+* Free and open source software: Snappy is licensed under a BSD-type license.
+For more information, see the included COPYING file.
+
+Snappy has previously been called "Zippy" in some Google presentations
+and the like.
+
+
+Performance
+===========
+
+Snappy is intended to be fast. On a single core of a Core i7 processor
+in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at
+about 500 MB/sec or more. (These numbers are for the slowest inputs in our
+benchmark suite; others are much faster.) In our tests, Snappy usually
+is faster than algorithms in the same class (e.g. LZO, LZF, FastLZ, QuickLZ,
+etc.) while achieving comparable compression ratios.
+
+Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x
+for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and
+other already-compressed data. Similar numbers for zlib in its fastest mode
+are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are
+capable of achieving yet higher compression rates, although usually at the
+expense of speed. Of course, compression ratio will vary significantly with
+the input.
+
+Although Snappy should be fairly portable, it is primarily optimized
+for 64-bit x86-compatible processors, and may run slower in other environments.
+In particular:
+
+- Snappy uses 64-bit operations in several places to process more data at
+once than would otherwise be possible.
+- Snappy assumes unaligned 32- and 64-bit loads and stores are cheap.
+On some platforms, these must be emulated with single-byte loads
+and stores, which is much slower.
+- Snappy assumes little-endian throughout, and needs to byte-swap data in
+several places if running on a big-endian platform.
+
+Experience has shown that even heavily tuned code can be improved.
+Performance optimizations, whether for 64-bit x86 or other platforms,
+are of course most welcome; see "Contact", below.
+
+
+Usage
+=====
+
+Note that Snappy, both the implementation and the main interface,
+is written in C++. However, several third-party bindings to other languages
+are available; see the Google Code page at http://code.google.com/p/snappy/
+for more information. Also, if you want to use Snappy from C code, you can
+use the included C bindings in snappy-c.h.
+
+To use Snappy from your own C++ program, include the file "snappy.h" from
+your calling file, and link against the compiled library.
+
+There are many ways to call Snappy, but the simplest possible is
+
+snappy::Compress(input.data(), input.size(), &output);
+
+and similarly
+
+snappy::Uncompress(input.data(), input.size(), &output);
+
+where "input" and "output" are both instances of std::string.
+
+There are other interfaces that are more flexible in various ways, including
+support for custom (non-array) input sources. See the header file for more
+information.
+
+
+Tests and benchmarks
+====================
+
+When you compile Snappy, snappy_unittest is compiled in addition to the
+library itself. You do not need it to use the compressor from your own library,
+but it contains several useful components for Snappy development.
+
+First of all, it contains unit tests, verifying correctness on your machine in
+various scenarios. If you want to change or optimize Snappy, please run the
+tests to verify you have not broken anything. Note that if you have the
+Google Test library installed, unit test behavior (especially failures) will be
+significantly more user-friendly. You can find Google Test at
+
+http://code.google.com/p/googletest/
+
+You probably also want the gflags library for handling of command-line flags;
+you can find it at
+
+http://code.google.com/p/google-gflags/
+
+In addition to the unit tests, snappy contains microbenchmarks used to
+tune compression and decompression performance. These are automatically run
+before the unit tests, but you can disable them using the flag
+--run_microbenchmarks=false if you have gflags installed (otherwise you will
+need to edit the source).
+
+Finally, snappy can benchmark Snappy against a few other compression libraries
+(zlib, LZO, LZF, FastLZ and QuickLZ), if they were detected at configure time.
+To benchmark using a given file, give the compression algorithm you want to test
+Snappy against (e.g. --zlib) and then a list of one or more file names on the
+command line. The testdata/ directory contains the files used by the
+microbenchmark, which should provide a reasonably balanced starting point for
+benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they
+are used to verify correctness in the presence of corrupted data in the unit
+test.)
+
+
+Contact
+=======
+
+Snappy is distributed through Google Code. For the latest version, a bug tracker,
+and other information, see
+
+http://code.google.com/p/snappy/
data/vendor/snappy/configure.ac
ADDED
@@ -0,0 +1,133 @@
+m4_define([snappy_major], [1])
+m4_define([snappy_minor], [1])
+m4_define([snappy_patchlevel], [2])
+
+# Libtool shared library interface versions (current:revision:age)
+# Update this value for every release! (A:B:C will map to foo.so.(A-C).C.B)
+# http://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
+m4_define([snappy_ltversion], [3:1:2])
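Annotation (not part of the vendored file): the comment above gives the rule "A:B:C will map to foo.so.(A-C).C.B". The sketch below only restates that arithmetic in Python so the value 3:1:2 can be checked by hand. The helper name `soname_suffix` is hypothetical, and the mapping is assumed to apply to the Linux-style naming the comment describes; other platforms name shared libraries differently.

```python
def soname_suffix(version_info: str) -> str:
    """Map a libtool 'current:revision:age' triple to the '(A-C).C.B' suffix."""
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not exceed current")
    # A:B:C -> (A-C).C.B, exactly as stated in the configure.ac comment.
    return f"{current - age}.{age}.{revision}"


print("libsnappy.so." + soname_suffix("3:1:2"))  # libsnappy.so.1.2.1
```

Under that assumption, the 3:1:2 value set here corresponds to a shared object suffix of 1.2.1.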
+
+AC_INIT([snappy], [snappy_major.snappy_minor.snappy_patchlevel])
+AC_CONFIG_MACRO_DIR([m4])
+
+# These are flags passed to automake (though they look like gcc flags!)
+AM_INIT_AUTOMAKE([-Wall])
+
+LT_INIT
+AC_SUBST([LIBTOOL_DEPS])
+AC_PROG_CXX
+AC_LANG([C++])
+AC_C_BIGENDIAN
+AC_TYPE_SIZE_T
+AC_TYPE_SSIZE_T
+AC_CHECK_HEADERS([stdint.h stddef.h sys/mman.h sys/resource.h windows.h byteswap.h sys/byteswap.h sys/endian.h sys/time.h])
+
+# Don't use AC_FUNC_MMAP, as it checks for mappings of already-mapped memory,
+# which we don't need (and does not exist on Windows).
+AC_CHECK_FUNC([mmap])
+
+GTEST_LIB_CHECK([], [true], [true # Ignore; we can live without it.])
+
+AC_ARG_WITH([gflags],
+[AS_HELP_STRING(
+[--with-gflags],
+[use Google Flags package to enhance the unit test @<:@default=check@:>@])],
+[],
+[with_gflags=check])
+
+if test "x$with_gflags" != "xno"; then
+PKG_CHECK_MODULES(
+[gflags],
+[libgflags],
+[AC_DEFINE([HAVE_GFLAGS], [1], [Use the gflags package for command-line parsing.])],
+[if test "x$with_gflags" != "xcheck"; then
+AC_MSG_FAILURE([--with-gflags was given, but test for gflags failed])
+fi])
+fi
+
+# See if we have __builtin_expect.
+# TODO: Use AC_CACHE.
+AC_MSG_CHECKING([if the compiler supports __builtin_expect])
+
+AC_TRY_COMPILE(, [
+return __builtin_expect(1, 1) ? 1 : 0
+], [
+snappy_have_builtin_expect=yes
+AC_MSG_RESULT([yes])
+], [
+snappy_have_builtin_expect=no
+AC_MSG_RESULT([no])
+])
+if test x$snappy_have_builtin_expect = xyes ; then
+AC_DEFINE([HAVE_BUILTIN_EXPECT], [1], [Define to 1 if the compiler supports __builtin_expect.])
+fi
+
+# See if we have working count-trailing-zeros intrinsics.
+# TODO: Use AC_CACHE.
+AC_MSG_CHECKING([if the compiler supports __builtin_ctzll])
+
+AC_TRY_COMPILE(, [
+return (__builtin_ctzll(0x100000000LL) == 32) ? 1 : 0
+], [
+snappy_have_builtin_ctz=yes
+AC_MSG_RESULT([yes])
+], [
+snappy_have_builtin_ctz=no
+AC_MSG_RESULT([no])
+])
+if test x$snappy_have_builtin_ctz = xyes ; then
+AC_DEFINE([HAVE_BUILTIN_CTZ], [1], [Define to 1 if the compiler supports __builtin_ctz and friends.])
+fi
+
+# Other compression libraries; the unit test can use these for comparison
+# if they are available. If they are not found, just ignore.
+UNITTEST_LIBS=""
+AC_DEFUN([CHECK_EXT_COMPRESSION_LIB], [
+AH_CHECK_LIB([$1])
+AC_CHECK_LIB(
+[$1],
+[$2],
+[
+AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_LIB$1))
+UNITTEST_LIBS="-l$1 $UNITTEST_LIBS"
+],
+[true]
+)
+])
+CHECK_EXT_COMPRESSION_LIB([z], [zlibVersion])
+CHECK_EXT_COMPRESSION_LIB([lzo2], [lzo1x_1_15_compress])
+CHECK_EXT_COMPRESSION_LIB([lzf], [lzf_compress])
+CHECK_EXT_COMPRESSION_LIB([fastlz], [fastlz_compress])
+CHECK_EXT_COMPRESSION_LIB([quicklz], [qlz_compress])
+AC_SUBST([UNITTEST_LIBS])
+
+# These are used by snappy-stubs-public.h.in.
+if test "$ac_cv_header_stdint_h" = "yes"; then
+AC_SUBST([ac_cv_have_stdint_h], [1])
+else
+AC_SUBST([ac_cv_have_stdint_h], [0])
+fi
+if test "$ac_cv_header_stddef_h" = "yes"; then
+AC_SUBST([ac_cv_have_stddef_h], [1])
+else
+AC_SUBST([ac_cv_have_stddef_h], [0])
+fi
+if test "$ac_cv_header_sys_uio_h" = "yes"; then
+AC_SUBST([ac_cv_have_sys_uio_h], [1])
+else
+AC_SUBST([ac_cv_have_sys_uio_h], [0])
+fi
+
+# Export the version to snappy-stubs-public.h.
+SNAPPY_MAJOR="snappy_major"
+SNAPPY_MINOR="snappy_minor"
+SNAPPY_PATCHLEVEL="snappy_patchlevel"
+
+AC_SUBST([SNAPPY_MAJOR])
+AC_SUBST([SNAPPY_MINOR])
+AC_SUBST([SNAPPY_PATCHLEVEL])
+AC_SUBST([SNAPPY_LTVERSION], snappy_ltversion)
+
+AC_CONFIG_HEADERS([config.h])
+AC_CONFIG_FILES([Makefile snappy-stubs-public.h])
+AC_OUTPUT
data/vendor/snappy/format_description.txt
ADDED
@@ -0,0 +1,110 @@
+Snappy compressed format description
+Last revised: 2011-10-05
+
+
+This is not a formal specification, but should suffice to explain most
+relevant parts of how the Snappy format works. It is originally based on
+text by Zeev Tarantov.
+
+Snappy is a LZ77-type compressor with a fixed, byte-oriented encoding.
+There is no entropy encoder backend nor framing layer -- the latter is
+assumed to be handled by other parts of the system.
+
+This document only describes the format, not how the Snappy compressor nor
+decompressor actually works. The correctness of the decompressor should not
+depend on implementation details of the compressor, and vice versa.
+
+
+1. Preamble
+
+The stream starts with the uncompressed length (up to a maximum of 2^32 - 1),
+stored as a little-endian varint. Varints consist of a series of bytes,
+where the lower 7 bits are data and the upper bit is set iff there are
+more bytes to be read. In other words, an uncompressed length of 64 would
+be stored as 0x40, and an uncompressed length of 2097150 (0x1FFFFE)
+would be stored as 0xFE 0xFF 0x7F.
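Annotation (not part of the vendored file): the varint rule in the preamble can be sketched in a few lines of Python. This is an illustration of the description above, not the library's code; the function names `encode_varint` and `decode_varint` are hypothetical. It reproduces the two examples given in the text.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode an uncompressed length as a little-endian varint."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("length must fit in 32 bits")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # lower 7 bits carry data
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # upper bit set: more bytes follow
        else:
            out.append(low7)         # upper bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, position just past the varint)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
        if shift > 28:               # a 32-bit value needs at most 5 bytes
            raise ValueError("varint too long for a 32-bit length")
    if result > 0xFFFFFFFF:
        raise ValueError("length exceeds 2^32 - 1")
    return result, pos


print(encode_varint(64).hex())       # 40
print(encode_varint(2097150).hex())  # feff7f
```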
+
+
+2. The compressed stream itself
+
+There are two types of elements in a Snappy stream: Literals and
+copies (backreferences). There is no restriction on the order of elements,
+except that the stream naturally cannot start with a copy. (Having
+two literals in a row is never optimal from a compression point of
+view, but nevertheless fully permitted.) Each element starts with a tag byte,
+and the lower two bits of this tag byte signal what type of element will
+follow:
+
+00: Literal
+01: Copy with 1-byte offset
+10: Copy with 2-byte offset
+11: Copy with 4-byte offset
+
+The interpretation of the upper six bits are element-dependent.
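Annotation (not part of the vendored file): splitting a tag byte into its element type and its upper six bits, as the table above describes, might look like the following sketch. The names `ELEMENT_TYPES` and `split_tag` are hypothetical and exist only for this illustration.

```python
from typing import Tuple

# Lower two bits of the tag byte, as listed in the format description.
ELEMENT_TYPES = {
    0b00: "literal",
    0b01: "copy with 1-byte offset",
    0b10: "copy with 2-byte offset",
    0b11: "copy with 4-byte offset",
}


def split_tag(tag: int) -> Tuple[str, int]:
    """Return (element type, upper six bits) for one tag byte."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must be a single byte")
    return ELEMENT_TYPES[tag & 0b11], tag >> 2


print(split_tag(0x08))  # ('literal', 2)
```

How the upper six bits are interpreted then depends on the element type, as the following subsections explain.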
+
+
+2.1. Literals (00)
+
+Literals are uncompressed data stored directly in the byte stream.
+The literal length is stored differently depending on the length
+of the literal:
+
+- For literals up to and including 60 bytes in length, the upper
+six bits of the tag byte contain (len-1). The literal follows
+immediately thereafter in the bytestream.
+- For longer literals, the (len-1) value is stored after the tag byte,
+little-endian. The upper six bits of the tag byte describe how
+many bytes are used for the length; 60, 61, 62 or 63 for
+1-4 bytes, respectively. The literal itself follows after the
+length.
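Annotation (not part of the vendored file): the two literal cases above can be sketched as an encoder and decoder pair. This is a minimal illustration under the rules just stated, not the Snappy implementation; `encode_literal` and `decode_literal` are hypothetical names, and the encoder simply picks the smallest length field that fits.

```python
from typing import Tuple


def encode_literal(data: bytes) -> bytes:
    """Emit one literal element: tag byte, optional length bytes, payload."""
    if not data:
        raise ValueError("a literal must contain at least one byte")
    n = len(data) - 1                      # the format stores len-1
    if n < 60:                             # up to and including 60 bytes
        return bytes([n << 2]) + data      # lower two tag bits stay 00
    for extra in (1, 2, 3, 4):
        if n < 1 << (8 * extra):
            tag = (59 + extra) << 2        # 60..63 means 1..4 length bytes
            return bytes([tag]) + n.to_bytes(extra, "little") + data
    raise ValueError("literal too long")


def decode_literal(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Return (payload, position just past the literal element)."""
    tag = buf[pos]
    if tag & 0b11 != 0b00:
        raise ValueError("not a literal tag")
    field = tag >> 2
    start = pos + 1
    if field < 60:
        length = field + 1
    else:
        extra = field - 59
        if start + extra > len(buf):
            raise ValueError("truncated literal length")
        length = int.from_bytes(buf[start:start + extra], "little") + 1
        start += extra
    if start + length > len(buf):
        raise ValueError("truncated literal payload")
    return buf[start:start + length], start + length


print(encode_literal(b"xab").hex())  # 08786162
```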
+
+
+2.2. Copies
+
+Copies are references back into previous decompressed data, telling
+the decompressor to reuse data it has previously decoded.
+They encode two values: The _offset_, saying how many bytes back
+from the current position to read, and the _length_, how many bytes
+to copy. Offsets of zero can be encoded, but are not legal;
+similarly, it is possible to encode backreferences that would
+go past the end of the block (offset > current decompressed position),
+which is also nonsensical and thus not allowed.
+
+As in most LZ77-based compressors, the length can be larger than the offset,
+yielding a form of run-length encoding (RLE). For instance,
+"xababab" could be encoded as
+
+<literal: "xab"> <copy: offset=2 length=4>
+
+Note that since the current Snappy compressor works in 32 kB
+blocks and does not do matching across blocks, it will never produce
+a bitstream with offsets larger than about 32768. However, the
+decompressor should not rely on this, as it may change in the future.
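Annotation (not part of the vendored file): the "xababab" example works only if the copy is applied so that bytes written earlier in the same copy can be read again. A byte-at-a-time sketch shows this, together with the two illegal cases the text names (offset zero, and an offset reaching before the start of the output). The name `apply_copy` is hypothetical.

```python
def apply_copy(output: bytearray, offset: int, length: int) -> None:
    """Append `length` bytes read `offset` bytes back from the end of output."""
    if offset == 0:
        raise ValueError("offset zero can be encoded but is not legal")
    if offset > len(output):
        raise ValueError("offset reaches before the start of the output")
    start = len(output) - offset
    # Copy one byte at a time so that length > offset re-reads bytes
    # appended earlier in this same loop (the RLE-like case).
    for i in range(length):
        output.append(output[start + i])


out = bytearray(b"xab")               # <literal: "xab">
apply_copy(out, offset=2, length=4)   # <copy: offset=2 length=4>
print(out.decode())                   # xababab
```

A single slice copy of the existing buffer would not be enough here, because the last two bytes to copy do not exist until the first two have been appended.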
+
+There are several different kinds of copy elements, depending on
+the amount of bytes to be copied (length), and how far back the
+data to be copied is (offset).
+
+
+2.2.1. Copy with 1-byte offset (01)
+
+These elements can encode lengths between [4..11] bytes and offsets
+between [0..2047] bytes. (len-4) occupies three bits and is stored
+in bits [2..4] of the tag byte. The offset occupies 11 bits, of which the
+upper three are stored in the upper three bits ([5..7]) of the tag byte,
+and the lower eight are stored in a byte following the tag byte.
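Annotation (not part of the vendored file): the bit layout of the 1-byte-offset copy can be checked with a small pack/unpack sketch. The names `encode_copy1` and `decode_copy1` are hypothetical. The encoder rejects offset zero because the format description says it is encodable but not legal.

```python
from typing import Tuple


def encode_copy1(offset: int, length: int) -> bytes:
    """Pack a copy with 1-byte offset into its tag byte plus one offset byte."""
    if not 4 <= length <= 11:
        raise ValueError("length must be in [4..11]")
    if not 1 <= offset <= 2047:
        raise ValueError("offset must be in [1..2047]")
    tag = 0b01                      # bits [0..1]: element type
    tag |= (length - 4) << 2        # bits [2..4]: len-4
    tag |= (offset >> 8) << 5       # bits [5..7]: upper three offset bits
    return bytes([tag, offset & 0xFF])


def decode_copy1(tag: int, next_byte: int) -> Tuple[int, int]:
    """Return (offset, length) for a 1-byte-offset copy element."""
    if tag & 0b11 != 0b01:
        raise ValueError("not a 1-byte-offset copy tag")
    length = ((tag >> 2) & 0b111) + 4
    offset = ((tag >> 5) << 8) | next_byte
    return offset, length


print(encode_copy1(offset=2, length=4).hex())  # 0102
```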
+
+
+2.2.2. Copy with 2-byte offset (10)
+
+These elements can encode lengths between [1..64] and offsets from
+[0..65535]. (len-1) occupies six bits and is stored in the upper
+six bits ([2..7]) of the tag byte. The offset is stored as a
+little-endian 16-bit integer in the two bytes following the tag byte.
+
+
+2.2.3. Copy with 4-byte offset (11)
+
+These are like the copies with 2-byte offsets (see previous subsection),
+except that the offset is stored as a 32-bit integer instead of a
+16-bit integer (and thus will occupy four bytes).
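Annotation (not part of the vendored file): the 2-byte and 4-byte offset copies differ only in the width of the offset field, so one decoder sketch can cover both. The name `decode_copy_wide` is hypothetical. The sketch assumes the 32-bit offset is little-endian like the 16-bit one, which the text implies ("like the copies with 2-byte offsets") but does not state outright.

```python
from typing import Tuple


def decode_copy_wide(buf: bytes, pos: int = 0) -> Tuple[int, int, int]:
    """Return (offset, length, position just past the element)."""
    tag = buf[pos]
    kind = tag & 0b11
    if kind == 0b10:
        width = 2                   # 16-bit offset
    elif kind == 0b11:
        width = 4                   # 32-bit offset (assumed little-endian)
    else:
        raise ValueError("not a 2-byte or 4-byte offset copy tag")
    length = (tag >> 2) + 1         # upper six bits hold len-1
    end = pos + 1 + width
    if end > len(buf):
        raise ValueError("truncated copy element")
    offset = int.from_bytes(buf[pos + 1:end], "little")
    return offset, length, end


print(decode_copy_wide(bytes([0xFE, 0x00, 0x10])))  # (4096, 64, 3)
```

A complete decoder would still have to reject offset zero and offsets larger than the amount of data decompressed so far, as section 2.2 requires.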