snappy 0.0.13 → 0.0.14

Files changed (42)
  1. checksums.yaml +4 -4
  2. data/Gemfile +1 -1
  3. data/lib/snappy/version.rb +1 -1
  4. data/vendor/snappy/AUTHORS +1 -0
  5. data/vendor/snappy/COPYING +54 -0
  6. data/vendor/snappy/ChangeLog +1916 -0
  7. data/vendor/snappy/Makefile.am +23 -0
  8. data/vendor/snappy/NEWS +128 -0
  9. data/vendor/snappy/README +135 -0
  10. data/vendor/snappy/autogen.sh +7 -0
  11. data/vendor/snappy/configure.ac +133 -0
  12. data/vendor/snappy/format_description.txt +110 -0
  13. data/vendor/snappy/framing_format.txt +135 -0
  14. data/vendor/snappy/m4/gtest.m4 +74 -0
  15. data/vendor/snappy/snappy-c.cc +90 -0
  16. data/vendor/snappy/snappy-c.h +138 -0
  17. data/vendor/snappy/snappy-internal.h +150 -0
  18. data/vendor/snappy/snappy-sinksource.cc +71 -0
  19. data/vendor/snappy/snappy-sinksource.h +137 -0
  20. data/vendor/snappy/snappy-stubs-internal.cc +42 -0
  21. data/vendor/snappy/snappy-stubs-internal.h +491 -0
  22. data/vendor/snappy/snappy-stubs-public.h.in +98 -0
  23. data/vendor/snappy/snappy-test.cc +606 -0
  24. data/vendor/snappy/snappy-test.h +582 -0
  25. data/vendor/snappy/snappy.cc +1306 -0
  26. data/vendor/snappy/snappy.h +184 -0
  27. data/vendor/snappy/snappy_unittest.cc +1355 -0
  28. data/vendor/snappy/testdata/alice29.txt +3609 -0
  29. data/vendor/snappy/testdata/asyoulik.txt +4122 -0
  30. data/vendor/snappy/testdata/baddata1.snappy +0 -0
  31. data/vendor/snappy/testdata/baddata2.snappy +0 -0
  32. data/vendor/snappy/testdata/baddata3.snappy +0 -0
  33. data/vendor/snappy/testdata/fireworks.jpeg +0 -0
  34. data/vendor/snappy/testdata/geo.protodata +0 -0
  35. data/vendor/snappy/testdata/html +1 -0
  36. data/vendor/snappy/testdata/html_x_4 +1 -0
  37. data/vendor/snappy/testdata/kppkn.gtb +0 -0
  38. data/vendor/snappy/testdata/lcet10.txt +7519 -0
  39. data/vendor/snappy/testdata/paper-100k.pdf +600 -2
  40. data/vendor/snappy/testdata/plrabn12.txt +10699 -0
  41. data/vendor/snappy/testdata/urls.10K +10000 -0
  42. metadata +40 -2
@@ -0,0 +1,23 @@
1
+ ACLOCAL_AMFLAGS = -I m4
2
+
3
+ # Library.
4
+ lib_LTLIBRARIES = libsnappy.la
5
+ libsnappy_la_SOURCES = snappy.cc snappy-sinksource.cc snappy-stubs-internal.cc snappy-c.cc
6
+ libsnappy_la_LDFLAGS = -version-info $(SNAPPY_LTVERSION)
7
+
8
+ include_HEADERS = snappy.h snappy-sinksource.h snappy-stubs-public.h snappy-c.h
9
+ noinst_HEADERS = snappy-internal.h snappy-stubs-internal.h snappy-test.h
10
+
11
+ # Unit tests and benchmarks.
12
+ snappy_unittest_CPPFLAGS = $(gflags_CFLAGS) $(GTEST_CPPFLAGS)
13
+ snappy_unittest_SOURCES = snappy_unittest.cc snappy-test.cc
14
+ snappy_unittest_LDFLAGS = $(GTEST_LDFLAGS)
15
+ snappy_unittest_LDADD = libsnappy.la $(UNITTEST_LIBS) $(gflags_LIBS) $(GTEST_LIBS)
16
+ TESTS = snappy_unittest
17
+ noinst_PROGRAMS = $(TESTS)
18
+
19
+ EXTRA_DIST = autogen.sh testdata/alice29.txt testdata/asyoulik.txt testdata/baddata1.snappy testdata/baddata2.snappy testdata/baddata3.snappy testdata/geo.protodata testdata/fireworks.jpeg testdata/html testdata/html_x_4 testdata/kppkn.gtb testdata/lcet10.txt testdata/paper-100k.pdf testdata/plrabn12.txt testdata/urls.10K
20
+ dist_doc_DATA = ChangeLog COPYING INSTALL NEWS README format_description.txt framing_format.txt
21
+
22
+ libtool: $(LIBTOOL_DEPS)
23
+ $(SHELL) ./config.status --recheck
@@ -0,0 +1,128 @@
1
+ Snappy v1.1.2, February 28th 2014:
2
+
3
+ This is a maintenance release with no changes to the actual library
4
+ source code.
5
+
6
+ * Stop distributing benchmark data files that have unclear
7
+ or unsuitable licensing.
8
+
9
+ * Add support for padding chunks in the framing format.
10
+
11
+
12
+ Snappy v1.1.1, October 15th 2013:
13
+
14
+ * Add support for uncompressing to iovecs (scatter I/O).
15
+ The bulk of this patch was contributed by Mohit Aron.
16
+
17
+ * Speed up decompression by ~2%; much more so (~13-20%) on
18
+ a few benchmarks on given compilers and CPUs.
19
+
20
+ * Fix a few issues with MSVC compilation.
21
+
22
+ * Support truncated test data in the benchmark.
23
+
24
+
25
+ Snappy v1.1.0, January 18th 2013:
26
+
27
+ * Snappy now uses 64 kB block size instead of 32 kB. On average,
28
+ this means it compresses about 3% denser (more so for some
29
+ inputs), at the same or better speeds.
30
+
31
+ * libsnappy no longer depends on iostream.
32
+
33
+ * Some small performance improvements in compression on x86
34
+ (0.5–1%).
35
+
36
+ * Various portability fixes for ARM-based platforms, for MSVC,
37
+ and for GNU/Hurd.
38
+
39
+
40
+ Snappy v1.0.5, February 24th 2012:
41
+
42
+ * More speed improvements. Exactly how big will depend on
43
+ the architecture:
44
+
45
+ - 3–10% faster decompression for the base case (x86-64).
46
+
47
+ - ARMv7 and higher can now use unaligned accesses,
48
+ and will see about 30% faster decompression and
49
+ 20–40% faster compression.
50
+
51
+ - 32-bit platforms (ARM and 32-bit x86) will see 2–5%
52
+ faster compression.
53
+
54
+ These are all cumulative (e.g., ARM gets all three speedups).
55
+
56
+ * Fixed an issue where the unit test would crash on systems
57
+ with less than 256 MB address space available,
58
+ e.g. some embedded platforms.
59
+
60
+ * Added a framing format description, for use over e.g. HTTP,
61
+ or for a command-line compressor. We do not have any
62
+ implementations of this at the current point, but there seems
63
+ to be enough of a general interest in the topic.
64
+ Also make the format description slightly clearer.
65
+
66
+ * Remove some compile-time warnings in -Wall
67
+ (mostly signed/unsigned comparisons), for easier embedding
68
+ into projects that use -Wall -Werror.
69
+
70
+
71
+ Snappy v1.0.4, September 15th 2011:
72
+
73
+ * Speeded up the decompressor somewhat; typically about 2–8%
74
+ for Core i7, in 64-bit mode (comparable for Opteron).
75
+ Somewhat more for some tests, almost no gain for others.
76
+
77
+ * Make Snappy compile on certain platforms it didn't before
78
+ (Solaris with SunPro C++, HP-UX, AIX).
79
+
80
+ * Correct some minor errors in the format description.
81
+
82
+
83
+ Snappy v1.0.3, June 2nd 2011:
84
+
85
+ * Speeded up the decompressor somewhat; about 3-6% for Core 2,
86
+ 6-13% for Core i7, and 5-12% for Opteron (all in 64-bit mode).
87
+
88
+ * Added compressed format documentation. This text is new,
89
+ but an earlier version from Zeev Tarantov was used as reference.
90
+
91
+ * Only link snappy_unittest against -lz and other autodetected
92
+ libraries, not libsnappy.so (which doesn't need any such dependency).
93
+
94
+ * Fixed some display issues in the microbenchmarks, one of which would
95
+ frequently make the test crash on GNU/Hurd.
96
+
97
+
98
+ Snappy v1.0.2, April 29th 2011:
99
+
100
+ * Relicense to a BSD-type license.
101
+
102
+ * Added C bindings, contributed by Martin Gieseking.
103
+
104
+ * More Win32 fixes, in particular for MSVC.
105
+
106
+ * Replace geo.protodata with a newer version.
107
+
108
+ * Fix timing inaccuracies in the unit test when comparing Snappy
109
+ to other algorithms.
110
+
111
+
112
+ Snappy v1.0.1, March 25th 2011:
113
+
114
+ This is a maintenance release, mostly containing minor fixes.
115
+ There is no new functionality. The most important fixes include:
116
+
117
+ * The COPYING file and all licensing headers now correctly state that
118
+ Snappy is licensed under the Apache 2.0 license.
119
+
120
+ * snappy_unittest should now compile natively under Windows,
121
+ as well as on embedded systems with no mmap().
122
+
123
+ * Various autotools nits have been fixed.
124
+
125
+
126
+ Snappy v1.0, March 17th 2011:
127
+
128
+ * Initial version.
@@ -0,0 +1,135 @@
1
+ Snappy, a fast compressor/decompressor.
2
+
3
+
4
+ Introduction
5
+ ============
6
+
7
+ Snappy is a compression/decompression library. It does not aim for maximum
8
+ compression, or compatibility with any other compression library; instead,
9
+ it aims for very high speeds and reasonable compression. For instance,
10
+ compared to the fastest mode of zlib, Snappy is an order of magnitude faster
11
+ for most inputs, but the resulting compressed files are anywhere from 20% to
12
+ 100% bigger. (For more information, see "Performance", below.)
13
+
14
+ Snappy has the following properties:
15
+
16
+ * Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code.
17
+ See "Performance" below.
18
+ * Stable: Over the last few years, Snappy has compressed and decompressed
19
+ petabytes of data in Google's production environment. The Snappy bitstream
20
+ format is stable and will not change between versions.
21
+ * Robust: The Snappy decompressor is designed not to crash in the face of
22
+ corrupted or malicious input.
23
+ * Free and open source software: Snappy is licensed under a BSD-type license.
24
+ For more information, see the included COPYING file.
25
+
26
+ Snappy has previously been called "Zippy" in some Google presentations
27
+ and the like.
28
+
29
+
30
+ Performance
31
+ ===========
32
+
33
+ Snappy is intended to be fast. On a single core of a Core i7 processor
34
+ in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at
35
+ about 500 MB/sec or more. (These numbers are for the slowest inputs in our
36
+ benchmark suite; others are much faster.) In our tests, Snappy usually
37
+ is faster than algorithms in the same class (e.g. LZO, LZF, FastLZ, QuickLZ,
38
+ etc.) while achieving comparable compression ratios.
39
+
40
+ Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x
41
+ for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and
42
+ other already-compressed data. Similar numbers for zlib in its fastest mode
43
+ are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are
44
+ capable of achieving yet higher compression rates, although usually at the
45
+ expense of speed. Of course, compression ratio will vary significantly with
46
+ the input.
47
+
48
+ Although Snappy should be fairly portable, it is primarily optimized
49
+ for 64-bit x86-compatible processors, and may run slower in other environments.
50
+ In particular:
51
+
52
+ - Snappy uses 64-bit operations in several places to process more data at
53
+ once than would otherwise be possible.
54
+ - Snappy assumes unaligned 32- and 64-bit loads and stores are cheap.
55
+ On some platforms, these must be emulated with single-byte loads
56
+ and stores, which is much slower.
57
+ - Snappy assumes little-endian throughout, and needs to byte-swap data in
58
+ several places if running on a big-endian platform.
59
+
60
+ Experience has shown that even heavily tuned code can be improved.
61
+ Performance optimizations, whether for 64-bit x86 or other platforms,
62
+ are of course most welcome; see "Contact", below.
63
+
64
+
65
+ Usage
66
+ =====
67
+
68
+ Note that Snappy, both the implementation and the main interface,
69
+ is written in C++. However, several third-party bindings to other languages
70
+ are available; see the Google Code page at http://code.google.com/p/snappy/
71
+ for more information. Also, if you want to use Snappy from C code, you can
72
+ use the included C bindings in snappy-c.h.
73
+
74
+ To use Snappy from your own C++ program, include the file "snappy.h" from
75
+ your calling file, and link against the compiled library.
76
+
77
+ There are many ways to call Snappy, but the simplest possible is
78
+
79
+ snappy::Compress(input.data(), input.size(), &output);
80
+
81
+ and similarly
82
+
83
+ snappy::Uncompress(input.data(), input.size(), &output);
84
+
85
+ where "input" and "output" are both instances of std::string.
86
+
87
+ There are other interfaces that are more flexible in various ways, including
88
+ support for custom (non-array) input sources. See the header file for more
89
+ information.
90
+
91
+
92
+ Tests and benchmarks
93
+ ====================
94
+
95
+ When you compile Snappy, snappy_unittest is compiled in addition to the
96
+ library itself. You do not need it to use the compressor from your own library,
97
+ but it contains several useful components for Snappy development.
98
+
99
+ First of all, it contains unit tests, verifying correctness on your machine in
100
+ various scenarios. If you want to change or optimize Snappy, please run the
101
+ tests to verify you have not broken anything. Note that if you have the
102
+ Google Test library installed, unit test behavior (especially failures) will be
103
+ significantly more user-friendly. You can find Google Test at
104
+
105
+ http://code.google.com/p/googletest/
106
+
107
+ You probably also want the gflags library for handling of command-line flags;
108
+ you can find it at
109
+
110
+ http://code.google.com/p/google-gflags/
111
+
112
+ In addition to the unit tests, snappy contains microbenchmarks used to
113
+ tune compression and decompression performance. These are automatically run
114
+ before the unit tests, but you can disable them using the flag
115
+ --run_microbenchmarks=false if you have gflags installed (otherwise you will
116
+ need to edit the source).
117
+
118
+ Finally, snappy_unittest can benchmark Snappy against a few other compression libraries
119
+ (zlib, LZO, LZF, FastLZ and QuickLZ), if they were detected at configure time.
120
+ To benchmark using a given file, give the compression algorithm you want to test
121
+ Snappy against (e.g. --zlib) and then a list of one or more file names on the
122
+ command line. The testdata/ directory contains the files used by the
123
+ microbenchmark, which should provide a reasonably balanced starting point for
124
+ benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they
125
+ are used to verify correctness in the presence of corrupted data in the unit
126
+ test.)
127
+
128
+
129
+ Contact
130
+ =======
131
+
132
+ Snappy is distributed through Google Code. For the latest version, a bug tracker,
133
+ and other information, see
134
+
135
+ http://code.google.com/p/snappy/
@@ -0,0 +1,7 @@
1
+ #! /bin/sh -e
2
+ rm -rf autom4te.cache
3
+ aclocal -I m4
4
+ autoheader
5
+ libtoolize --copy
6
+ automake --add-missing --copy
7
+ autoconf
@@ -0,0 +1,133 @@
1
+ m4_define([snappy_major], [1])
2
+ m4_define([snappy_minor], [1])
3
+ m4_define([snappy_patchlevel], [2])
4
+
5
+ # Libtool shared library interface versions (current:revision:age)
6
+ # Update this value for every release! (A:B:C will map to foo.so.(A-C).C.B)
7
+ # http://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
8
+ m4_define([snappy_ltversion], [3:1:2])
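As a worked example of the mapping stated in the comment above (A:B:C maps to foo.so.(A-C).C.B), the Python sketch below applies that rule to the value 3:1:2. The function name so_suffix is hypothetical, and the rule is taken from the comment as written; the file name libtool actually produces varies by platform.

```python
def so_suffix(version_info):
    """Apply the comment's rule: current:revision:age -> (current-age).age.revision."""
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not exceed current")
    return f"{current - age}.{age}.{revision}"


if __name__ == "__main__":
    # For the snappy_ltversion value defined above.
    print("libsnappy.so." + so_suffix("3:1:2"))
```

Under that rule, 3:1:2 gives the suffix 1.2.1.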
9
+
10
+ AC_INIT([snappy], [snappy_major.snappy_minor.snappy_patchlevel])
11
+ AC_CONFIG_MACRO_DIR([m4])
12
+
13
+ # These are flags passed to automake (though they look like gcc flags!)
14
+ AM_INIT_AUTOMAKE([-Wall])
15
+
16
+ LT_INIT
17
+ AC_SUBST([LIBTOOL_DEPS])
18
+ AC_PROG_CXX
19
+ AC_LANG([C++])
20
+ AC_C_BIGENDIAN
21
+ AC_TYPE_SIZE_T
22
+ AC_TYPE_SSIZE_T
23
+ AC_CHECK_HEADERS([stdint.h stddef.h sys/mman.h sys/resource.h windows.h byteswap.h sys/byteswap.h sys/endian.h sys/time.h])
24
+
25
+ # Don't use AC_FUNC_MMAP, as it checks for mappings of already-mapped memory,
26
+ # which we don't need (and does not exist on Windows).
27
+ AC_CHECK_FUNC([mmap])
28
+
29
+ GTEST_LIB_CHECK([], [true], [true # Ignore; we can live without it.])
30
+
31
+ AC_ARG_WITH([gflags],
32
+ [AS_HELP_STRING(
33
+ [--with-gflags],
34
+ [use Google Flags package to enhance the unit test @<:@default=check@:>@])],
35
+ [],
36
+ [with_gflags=check])
37
+
38
+ if test "x$with_gflags" != "xno"; then
39
+ PKG_CHECK_MODULES(
40
+ [gflags],
41
+ [libgflags],
42
+ [AC_DEFINE([HAVE_GFLAGS], [1], [Use the gflags package for command-line parsing.])],
43
+ [if test "x$with_gflags" != "xcheck"; then
44
+ AC_MSG_FAILURE([--with-gflags was given, but test for gflags failed])
45
+ fi])
46
+ fi
47
+
48
+ # See if we have __builtin_expect.
49
+ # TODO: Use AC_CACHE.
50
+ AC_MSG_CHECKING([if the compiler supports __builtin_expect])
51
+
52
+ AC_TRY_COMPILE(, [
53
+ return __builtin_expect(1, 1) ? 1 : 0
54
+ ], [
55
+ snappy_have_builtin_expect=yes
56
+ AC_MSG_RESULT([yes])
57
+ ], [
58
+ snappy_have_builtin_expect=no
59
+ AC_MSG_RESULT([no])
60
+ ])
61
+ if test x$snappy_have_builtin_expect = xyes ; then
62
+ AC_DEFINE([HAVE_BUILTIN_EXPECT], [1], [Define to 1 if the compiler supports __builtin_expect.])
63
+ fi
64
+
65
+ # See if we have working count-trailing-zeros intrinsics.
66
+ # TODO: Use AC_CACHE.
67
+ AC_MSG_CHECKING([if the compiler supports __builtin_ctzll])
68
+
69
+ AC_TRY_COMPILE(, [
70
+ return (__builtin_ctzll(0x100000000LL) == 32) ? 1 : 0
71
+ ], [
72
+ snappy_have_builtin_ctz=yes
73
+ AC_MSG_RESULT([yes])
74
+ ], [
75
+ snappy_have_builtin_ctz=no
76
+ AC_MSG_RESULT([no])
77
+ ])
78
+ if test x$snappy_have_builtin_ctz = xyes ; then
79
+ AC_DEFINE([HAVE_BUILTIN_CTZ], [1], [Define to 1 if the compiler supports __builtin_ctz and friends.])
80
+ fi
81
+
82
+ # Other compression libraries; the unit test can use these for comparison
83
+ # if they are available. If they are not found, just ignore.
84
+ UNITTEST_LIBS=""
85
+ AC_DEFUN([CHECK_EXT_COMPRESSION_LIB], [
86
+ AH_CHECK_LIB([$1])
87
+ AC_CHECK_LIB(
88
+ [$1],
89
+ [$2],
90
+ [
91
+ AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_LIB$1))
92
+ UNITTEST_LIBS="-l$1 $UNITTEST_LIBS"
93
+ ],
94
+ [true]
95
+ )
96
+ ])
97
+ CHECK_EXT_COMPRESSION_LIB([z], [zlibVersion])
98
+ CHECK_EXT_COMPRESSION_LIB([lzo2], [lzo1x_1_15_compress])
99
+ CHECK_EXT_COMPRESSION_LIB([lzf], [lzf_compress])
100
+ CHECK_EXT_COMPRESSION_LIB([fastlz], [fastlz_compress])
101
+ CHECK_EXT_COMPRESSION_LIB([quicklz], [qlz_compress])
102
+ AC_SUBST([UNITTEST_LIBS])
103
+
104
+ # These are used by snappy-stubs-public.h.in.
105
+ if test "$ac_cv_header_stdint_h" = "yes"; then
106
+ AC_SUBST([ac_cv_have_stdint_h], [1])
107
+ else
108
+ AC_SUBST([ac_cv_have_stdint_h], [0])
109
+ fi
110
+ if test "$ac_cv_header_stddef_h" = "yes"; then
111
+ AC_SUBST([ac_cv_have_stddef_h], [1])
112
+ else
113
+ AC_SUBST([ac_cv_have_stddef_h], [0])
114
+ fi
115
+ if test "$ac_cv_header_sys_uio_h" = "yes"; then
116
+ AC_SUBST([ac_cv_have_sys_uio_h], [1])
117
+ else
118
+ AC_SUBST([ac_cv_have_sys_uio_h], [0])
119
+ fi
120
+
121
+ # Export the version to snappy-stubs-public.h.
122
+ SNAPPY_MAJOR="snappy_major"
123
+ SNAPPY_MINOR="snappy_minor"
124
+ SNAPPY_PATCHLEVEL="snappy_patchlevel"
125
+
126
+ AC_SUBST([SNAPPY_MAJOR])
127
+ AC_SUBST([SNAPPY_MINOR])
128
+ AC_SUBST([SNAPPY_PATCHLEVEL])
129
+ AC_SUBST([SNAPPY_LTVERSION], snappy_ltversion)
130
+
131
+ AC_CONFIG_HEADERS([config.h])
132
+ AC_CONFIG_FILES([Makefile snappy-stubs-public.h])
133
+ AC_OUTPUT
@@ -0,0 +1,110 @@
1
+ Snappy compressed format description
2
+ Last revised: 2011-10-05
3
+
4
+
5
+ This is not a formal specification, but should suffice to explain most
6
+ relevant parts of how the Snappy format works. It is originally based on
7
+ text by Zeev Tarantov.
8
+
9
+ Snappy is an LZ77-type compressor with a fixed, byte-oriented encoding.
10
+ There is no entropy encoder backend nor framing layer -- the latter is
11
+ assumed to be handled by other parts of the system.
12
+
13
+ This document only describes the format, not how the Snappy compressor or
14
+ decompressor actually works. The correctness of the decompressor should not
15
+ depend on implementation details of the compressor, and vice versa.
16
+
17
+
18
+ 1. Preamble
19
+
20
+ The stream starts with the uncompressed length (up to a maximum of 2^32 - 1),
21
+ stored as a little-endian varint. Varints consist of a series of bytes,
22
+ where the lower 7 bits are data and the upper bit is set iff there are
23
+ more bytes to be read. In other words, an uncompressed length of 64 would
24
+ be stored as 0x40, and an uncompressed length of 2097150 (0x1FFFFE)
25
+ would be stored as 0xFE 0xFF 0x7F.
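The varint rule in the preamble can be sketched in a few lines of Python. This is an illustration of the encoding as described, not code from the Snappy library; the function names encode_varint and decode_varint are hypothetical.

```python
def encode_varint(n):
    """Encode n as a little-endian varint: 7 data bits per byte, low bits first."""
    if not 0 <= n <= 2**32 - 1:
        raise ValueError("uncompressed length must fit in 32 bits")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # upper bit set: more bytes follow
        else:
            out.append(low7)         # upper bit clear: last byte
            return bytes(out)


def decode_varint(buf, pos=0):
    """Decode a varint starting at buf[pos]; return (value, next_position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7
```

With these definitions, encode_varint(64) yields the single byte 0x40 and encode_varint(2097150) yields 0xFE 0xFF 0x7F, matching the two cases given in the text.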
26
+
27
+
28
+ 2. The compressed stream itself
29
+
30
+ There are two types of elements in a Snappy stream: Literals and
31
+ copies (backreferences). There is no restriction on the order of elements,
32
+ except that the stream naturally cannot start with a copy. (Having
33
+ two literals in a row is never optimal from a compression point of
34
+ view, but nevertheless fully permitted.) Each element starts with a tag byte,
35
+ and the lower two bits of this tag byte signal what type of element will
36
+ follow:
37
+
38
+ 00: Literal
39
+ 01: Copy with 1-byte offset
40
+ 10: Copy with 2-byte offset
41
+ 11: Copy with 4-byte offset
42
+
43
+ The interpretation of the upper six bits is element-dependent.
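Splitting a tag byte into its two fields is a mask and a shift. The Python sketch below assumes nothing beyond the table above; the names ELEMENT_TYPES and split_tag are hypothetical.

```python
# Element type keyed by the lower two bits of the tag byte.
ELEMENT_TYPES = {
    0b00: "literal",
    0b01: "copy with 1-byte offset",
    0b10: "copy with 2-byte offset",
    0b11: "copy with 4-byte offset",
}


def split_tag(tag):
    """Return (element type, upper six bits) for one tag byte."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must be a single byte")
    return ELEMENT_TYPES[tag & 0x03], tag >> 2
```

For example, the tag byte 0x08 has lower bits 00 (a literal) and upper six bits equal to 2.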
44
+
45
+
46
+ 2.1. Literals (00)
47
+
48
+ Literals are uncompressed data stored directly in the byte stream.
49
+ The literal length is stored differently depending on the length
50
+ of the literal:
51
+
52
+ - For literals up to and including 60 bytes in length, the upper
53
+ six bits of the tag byte contain (len-1). The literal follows
54
+ immediately thereafter in the bytestream.
55
+ - For longer literals, the (len-1) value is stored after the tag byte,
56
+ little-endian. The upper six bits of the tag byte describe how
57
+ many bytes are used for the length; 60, 61, 62 or 63 for
58
+ 1-4 bytes, respectively. The literal itself follows after the
59
+ length.
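The two literal encodings can be sketched as follows in Python. This is a minimal illustration of the rules above, not the compressor's actual code; encode_literal is a hypothetical name.

```python
def encode_literal(data):
    """Encode one literal element: tag byte, optional length bytes, then the data."""
    if not data:
        raise ValueError("a literal must contain at least one byte")
    n = len(data) - 1                      # the format stores len-1
    if n < 60:
        # Short form: (len-1) sits in the upper six bits; lower two bits are 00.
        return bytes([n << 2]) + bytes(data)
    nbytes = (n.bit_length() + 7) // 8     # bytes needed for len-1
    if nbytes > 4:
        raise ValueError("literal too long for a 4-byte length")
    tag = (59 + nbytes) << 2               # 60, 61, 62 or 63 in the upper six bits
    return bytes([tag]) + n.to_bytes(nbytes, "little") + bytes(data)
```

A 3-byte literal such as "xab" gets the tag 0x08 (2 in the upper six bits), while a 61-byte literal gets the tag 0xF0 (60 in the upper six bits) followed by the length byte 0x3C.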
60
+
61
+
62
+ 2.2. Copies
63
+
64
+ Copies are references back into previous decompressed data, telling
65
+ the decompressor to reuse data it has previously decoded.
66
+ They encode two values: The _offset_, saying how many bytes back
67
+ from the current position to read, and the _length_, how many bytes
68
+ to copy. Offsets of zero can be encoded, but are not legal;
69
+ similarly, it is possible to encode backreferences that would
70
+ go past the end of the block (offset > current decompressed position),
71
+ which is also nonsensical and thus not allowed.
72
+
73
+ As in most LZ77-based compressors, the length can be larger than the offset,
74
+ yielding a form of run-length encoding (RLE). For instance,
75
+ "xababab" could be encoded as
76
+
77
+ <literal: "xab"> <copy: offset=2 length=4>
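The reason a length larger than the offset works is that the copy proceeds byte by byte, so later bytes of the copy read bytes the same copy has just written. A minimal Python sketch of that behaviour, with apply_copy as a hypothetical helper name:

```python
def apply_copy(out, offset, length):
    """Append `length` bytes to `out`, each read `offset` bytes back from the end."""
    if offset == 0 or offset > len(out):
        raise ValueError("offset must be in 1..len(out)")
    for _ in range(length):
        # out grows on every step, so out[-offset] can be a byte
        # appended earlier in this same copy.
        out.append(out[-offset])


if __name__ == "__main__":
    out = bytearray(b"xab")      # the literal "xab"
    apply_copy(out, 2, 4)        # copy: offset=2 length=4
    print(out.decode("ascii"))
```

Starting from the literal "xab", the copy appends "a", "b", "a", "b" in turn, which reproduces "xababab".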
78
+
79
+ Note that since the current Snappy compressor works in 32 kB
80
+ blocks and does not do matching across blocks, it will never produce
81
+ a bitstream with offsets larger than about 32768. However, the
82
+ decompressor should not rely on this, as it may change in the future.
83
+
84
+ There are several different kinds of copy elements, depending on
85
+ the number of bytes to be copied (length), and how far back the
86
+ data to be copied is (offset).
87
+
88
+
89
+ 2.2.1. Copy with 1-byte offset (01)
90
+
91
+ These elements can encode lengths between [4..11] bytes and offsets
92
+ between [0..2047] bytes. (len-4) occupies three bits and is stored
93
+ in bits [2..4] of the tag byte. The offset occupies 11 bits, of which the
94
+ upper three are stored in the upper three bits ([5..7]) of the tag byte,
95
+ and the lower eight are stored in a byte following the tag byte.
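The bit layout of this element can be illustrated with a short Python sketch. It follows the description above and is not taken from the Snappy sources; encode_copy1 and decode_copy1 are hypothetical names, and offset 0 is rejected because section 2.2 declares it illegal.

```python
def encode_copy1(offset, length):
    """Pack a copy with 1-byte offset into its two bytes."""
    if not 4 <= length <= 11:
        raise ValueError("length must be in 4..11")
    if not 1 <= offset <= 2047:
        raise ValueError("offset must be in 1..2047")
    tag = 0b01                      # bits [0..1]: element type
    tag |= (length - 4) << 2        # bits [2..4]: len-4
    tag |= (offset >> 8) << 5       # bits [5..7]: upper three offset bits
    return bytes([tag, offset & 0xFF])


def decode_copy1(tag, extra):
    """Unpack (offset, length) from the tag byte and the byte that follows it."""
    if tag & 0x03 != 0b01:
        raise ValueError("not a copy with 1-byte offset")
    length = ((tag >> 2) & 0x07) + 4
    offset = ((tag >> 5) << 8) | extra
    return offset, length
```

For instance, offset=2 with length=4 packs into the bytes 0x01 0x02, and the largest case (offset=2047, length=11) packs into 0xFD 0xFF.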
96
+
97
+
98
+ 2.2.2. Copy with 2-byte offset (10)
99
+
100
+ These elements can encode lengths between [1..64] and offsets from
101
+ [0..65535]. (len-1) occupies six bits and is stored in the upper
102
+ six bits ([2..7]) of the tag byte. The offset is stored as a
103
+ little-endian 16-bit integer in the two bytes following the tag byte.
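A corresponding Python sketch for this element, using the standard struct module for the little-endian 16-bit offset. The names encode_copy2 and decode_copy2 are hypothetical, and offset 0 is rejected as illegal per section 2.2.

```python
import struct


def encode_copy2(offset, length):
    """Pack a copy with 2-byte offset into its three bytes."""
    if not 1 <= length <= 64:
        raise ValueError("length must be in 1..64")
    if not 1 <= offset <= 65535:
        raise ValueError("offset must be in 1..65535")
    tag = 0b10 | ((length - 1) << 2)          # bits [2..7]: len-1
    return bytes([tag]) + struct.pack("<H", offset)


def decode_copy2(element):
    """Unpack (offset, length) from a three-byte element."""
    tag = element[0]
    if tag & 0x03 != 0b10:
        raise ValueError("not a copy with 2-byte offset")
    (offset,) = struct.unpack("<H", element[1:3])
    return offset, (tag >> 2) + 1
```

Here offset=2 with length=4 becomes 0x0E 0x02 0x00: the tag holds 3 (len-1) in its upper six bits and 10 in its lower two.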
104
+
105
+
106
+ 2.2.3. Copy with 4-byte offset (11)
107
+
108
+ These are like the copies with 2-byte offsets (see previous subsection),
109
+ except that the offset is stored as a 32-bit integer instead of a
110
+ 16-bit integer (and thus will occupy four bytes).
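Putting sections 1 through 2.2.3 together, a complete decoder for this format fits in a short Python function. The sketch below is a straightforward reading of this description, not the library's decompressor. The name decompress is hypothetical, and the 4-byte offset is assumed to be little-endian like the 2-byte one, which the text above does not state explicitly.

```python
def decompress(buf):
    """Decode a raw Snappy stream (no framing) as described in this document."""
    # Section 1: the preamble is the uncompressed length as a varint.
    pos = expected = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated preamble")
        byte = buf[pos]
        pos += 1
        expected |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7

    out = bytearray()
    while pos < len(buf):
        tag = buf[pos]
        pos += 1
        kind, upper = tag & 0x03, tag >> 2

        if kind == 0:  # Section 2.1: literal
            if upper < 60:
                length = upper + 1
            else:
                nbytes = upper - 59           # 60..63 means 1..4 length bytes
                if pos + nbytes > len(buf):
                    raise ValueError("truncated literal length")
                length = int.from_bytes(buf[pos:pos + nbytes], "little") + 1
                pos += nbytes
            if pos + length > len(buf):
                raise ValueError("truncated literal")
            out += buf[pos:pos + length]
            pos += length
            continue

        # Section 2.2: copies carry 1, 2 or 4 extra bytes after the tag.
        nbytes = (1, 2, 4)[kind - 1]
        if pos + nbytes > len(buf):
            raise ValueError("truncated copy")
        extra = int.from_bytes(buf[pos:pos + nbytes], "little")
        pos += nbytes
        if kind == 1:
            length = (upper & 0x07) + 4
            offset = ((upper >> 3) << 8) | extra
        else:
            length = upper + 1
            offset = extra
        if offset == 0 or offset > len(out):
            raise ValueError("invalid copy offset")
        for _ in range(length):               # byte by byte, so overlap works
            out.append(out[-offset])

    if len(out) != expected:
        raise ValueError("length does not match preamble")
    return bytes(out)
```

As a usage example, the seven-byte stream 0x07 0x08 "xab" 0x01 0x02 (length 7, a three-byte literal, then a copy with offset=2 and length=4) decodes to "xababab". The sketch favours clarity over speed; a production decoder would copy in larger units and bound memory use before allocating.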