snappy 0.0.10-java → 0.0.11-java

Files changed (47)
  1. checksums.yaml +4 -4
  2. data/.gitmodules +3 -0
  3. data/Rakefile +12 -13
  4. data/ext/extconf.rb +22 -31
  5. data/lib/snappy/reader.rb +10 -7
  6. data/lib/snappy/version.rb +1 -1
  7. data/snappy.gemspec +24 -0
  8. data/test/test-snappy-reader.rb +16 -0
  9. data/vendor/snappy/AUTHORS +1 -0
  10. data/vendor/snappy/COPYING +54 -0
  11. data/vendor/snappy/ChangeLog +1916 -0
  12. data/vendor/snappy/Makefile.am +23 -0
  13. data/vendor/snappy/NEWS +128 -0
  14. data/vendor/snappy/README +135 -0
  15. data/vendor/snappy/autogen.sh +7 -0
  16. data/vendor/snappy/configure.ac +133 -0
  17. data/vendor/snappy/format_description.txt +110 -0
  18. data/vendor/snappy/framing_format.txt +135 -0
  19. data/vendor/snappy/m4/gtest.m4 +74 -0
  20. data/vendor/snappy/snappy-c.cc +90 -0
  21. data/vendor/snappy/snappy-c.h +138 -0
  22. data/vendor/snappy/snappy-internal.h +150 -0
  23. data/vendor/snappy/snappy-sinksource.cc +71 -0
  24. data/vendor/snappy/snappy-sinksource.h +137 -0
  25. data/vendor/snappy/snappy-stubs-internal.cc +42 -0
  26. data/vendor/snappy/snappy-stubs-internal.h +491 -0
  27. data/vendor/snappy/snappy-stubs-public.h.in +98 -0
  28. data/vendor/snappy/snappy-test.cc +606 -0
  29. data/vendor/snappy/snappy-test.h +582 -0
  30. data/vendor/snappy/snappy.cc +1306 -0
  31. data/vendor/snappy/snappy.h +184 -0
  32. data/vendor/snappy/snappy_unittest.cc +1355 -0
  33. data/vendor/snappy/testdata/alice29.txt +3609 -0
  34. data/vendor/snappy/testdata/asyoulik.txt +4122 -0
  35. data/vendor/snappy/testdata/baddata1.snappy +0 -0
  36. data/vendor/snappy/testdata/baddata2.snappy +0 -0
  37. data/vendor/snappy/testdata/baddata3.snappy +0 -0
  38. data/vendor/snappy/testdata/fireworks.jpeg +0 -0
  39. data/vendor/snappy/testdata/geo.protodata +0 -0
  40. data/vendor/snappy/testdata/html +1 -0
  41. data/vendor/snappy/testdata/html_x_4 +1 -0
  42. data/vendor/snappy/testdata/kppkn.gtb +0 -0
  43. data/vendor/snappy/testdata/lcet10.txt +7519 -0
  44. data/vendor/snappy/testdata/paper-100k.pdf +600 -2
  45. data/vendor/snappy/testdata/plrabn12.txt +10699 -0
  46. data/vendor/snappy/testdata/urls.10K +10000 -0
  47. metadata +57 -18
data/vendor/snappy/Makefile.am
@@ -0,0 +1,23 @@
1
+ ACLOCAL_AMFLAGS = -I m4
2
+
3
+ # Library.
4
+ lib_LTLIBRARIES = libsnappy.la
5
+ libsnappy_la_SOURCES = snappy.cc snappy-sinksource.cc snappy-stubs-internal.cc snappy-c.cc
6
+ libsnappy_la_LDFLAGS = -version-info $(SNAPPY_LTVERSION)
7
+
8
+ include_HEADERS = snappy.h snappy-sinksource.h snappy-stubs-public.h snappy-c.h
9
+ noinst_HEADERS = snappy-internal.h snappy-stubs-internal.h snappy-test.h
10
+
11
+ # Unit tests and benchmarks.
12
+ snappy_unittest_CPPFLAGS = $(gflags_CFLAGS) $(GTEST_CPPFLAGS)
13
+ snappy_unittest_SOURCES = snappy_unittest.cc snappy-test.cc
14
+ snappy_unittest_LDFLAGS = $(GTEST_LDFLAGS)
15
+ snappy_unittest_LDADD = libsnappy.la $(UNITTEST_LIBS) $(gflags_LIBS) $(GTEST_LIBS)
16
+ TESTS = snappy_unittest
17
+ noinst_PROGRAMS = $(TESTS)
18
+
19
+ EXTRA_DIST = autogen.sh testdata/alice29.txt testdata/asyoulik.txt testdata/baddata1.snappy testdata/baddata2.snappy testdata/baddata3.snappy testdata/geo.protodata testdata/fireworks.jpeg testdata/html testdata/html_x_4 testdata/kppkn.gtb testdata/lcet10.txt testdata/paper-100k.pdf testdata/plrabn12.txt testdata/urls.10K
20
+ dist_doc_DATA = ChangeLog COPYING INSTALL NEWS README format_description.txt framing_format.txt
21
+
22
+ libtool: $(LIBTOOL_DEPS)
23
+ 	$(SHELL) ./config.status --recheck
data/vendor/snappy/NEWS
@@ -0,0 +1,128 @@
1
+ Snappy v1.1.2, February 28th 2014:
2
+
3
+ This is a maintenance release with no changes to the actual library
4
+ source code.
5
+
6
+ * Stop distributing benchmark data files that have unclear
7
+ or unsuitable licensing.
8
+
9
+ * Add support for padding chunks in the framing format.
10
+
11
+
12
+ Snappy v1.1.1, October 15th 2013:
13
+
14
+ * Add support for uncompressing to iovecs (scatter I/O).
15
+ The bulk of this patch was contributed by Mohit Aron.
16
+
17
+ * Speed up decompression by ~2%; much more so (~13-20%) on
18
+ a few benchmarks on given compilers and CPUs.
19
+
20
+ * Fix a few issues with MSVC compilation.
21
+
22
+ * Support truncated test data in the benchmark.
23
+
24
+
25
+ Snappy v1.1.0, January 18th 2013:
26
+
27
+ * Snappy now uses 64 kB block size instead of 32 kB. On average,
28
+ this means it compresses about 3% denser (more so for some
29
+ inputs), at the same or better speeds.
30
+
31
+ * libsnappy no longer depends on iostream.
32
+
33
+ * Some small performance improvements in compression on x86
34
+ (0.5–1%).
35
+
36
+ * Various portability fixes for ARM-based platforms, for MSVC,
37
+ and for GNU/Hurd.
38
+
39
+
40
+ Snappy v1.0.5, February 24th 2012:
41
+
42
+ * More speed improvements. Exactly how big will depend on
43
+ the architecture:
44
+
45
+ - 3–10% faster decompression for the base case (x86-64).
46
+
47
+ - ARMv7 and higher can now use unaligned accesses,
48
+ and will see about 30% faster decompression and
49
+ 20–40% faster compression.
50
+
51
+ - 32-bit platforms (ARM and 32-bit x86) will see 2–5%
52
+ faster compression.
53
+
54
+ These are all cumulative (e.g., ARM gets all three speedups).
55
+
56
+ * Fixed an issue where the unit test would crash on systems
57
+ with less than 256 MB address space available,
58
+ e.g. some embedded platforms.
59
+
60
+ * Added a framing format description, for use over e.g. HTTP,
61
+ or for a command-line compressor. We do not have any
62
+ implementations of this at the current point, but there seems
63
+ to be enough of a general interest in the topic.
64
+ Also make the format description slightly clearer.
65
+
66
+ * Remove some compile-time warnings in -Wall
67
+ (mostly signed/unsigned comparisons), for easier embedding
68
+ into projects that use -Wall -Werror.
69
+
70
+
71
+ Snappy v1.0.4, September 15th 2011:
72
+
73
+ * Speeded up the decompressor somewhat; typically about 2–8%
74
+ for Core i7, in 64-bit mode (comparable for Opteron).
75
+ Somewhat more for some tests, almost no gain for others.
76
+
77
+ * Make Snappy compile on certain platforms it didn't before
78
+ (Solaris with SunPro C++, HP-UX, AIX).
79
+
80
+ * Correct some minor errors in the format description.
81
+
82
+
83
+ Snappy v1.0.3, June 2nd 2011:
84
+
85
+ * Speeded up the decompressor somewhat; about 3-6% for Core 2,
86
+ 6-13% for Core i7, and 5-12% for Opteron (all in 64-bit mode).
87
+
88
+ * Added compressed format documentation. This text is new,
89
+ but an earlier version from Zeev Tarantov was used as reference.
90
+
91
+ * Only link snappy_unittest against -lz and other autodetected
92
+ libraries, not libsnappy.so (which doesn't need any such dependency).
93
+
94
+ * Fixed some display issues in the microbenchmarks, one of which would
95
+ frequently make the test crash on GNU/Hurd.
96
+
97
+
98
+ Snappy v1.0.2, April 29th 2011:
99
+
100
+ * Relicense to a BSD-type license.
101
+
102
+ * Added C bindings, contributed by Martin Gieseking.
103
+
104
+ * More Win32 fixes, in particular for MSVC.
105
+
106
+ * Replace geo.protodata with a newer version.
107
+
108
+ * Fix timing inaccuracies in the unit test when comparing Snappy
109
+ to other algorithms.
110
+
111
+
112
+ Snappy v1.0.1, March 25th 2011:
113
+
114
+ This is a maintenance release, mostly containing minor fixes.
115
+ There is no new functionality. The most important fixes include:
116
+
117
+ * The COPYING file and all licensing headers now correctly state that
118
+ Snappy is licensed under the Apache 2.0 license.
119
+
120
+ * snappy_unittest should now compile natively under Windows,
121
+ as well as on embedded systems with no mmap().
122
+
123
+ * Various autotools nits have been fixed.
124
+
125
+
126
+ Snappy v1.0, March 17th 2011:
127
+
128
+ * Initial version.
data/vendor/snappy/README
@@ -0,0 +1,135 @@
1
+ Snappy, a fast compressor/decompressor.
2
+
3
+
4
+ Introduction
5
+ ============
6
+
7
+ Snappy is a compression/decompression library. It does not aim for maximum
8
+ compression, or compatibility with any other compression library; instead,
9
+ it aims for very high speeds and reasonable compression. For instance,
10
+ compared to the fastest mode of zlib, Snappy is an order of magnitude faster
11
+ for most inputs, but the resulting compressed files are anywhere from 20% to
12
+ 100% bigger. (For more information, see "Performance", below.)
13
+
14
+ Snappy has the following properties:
15
+
16
+ * Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code.
17
+ See "Performance" below.
18
+ * Stable: Over the last few years, Snappy has compressed and decompressed
19
+ petabytes of data in Google's production environment. The Snappy bitstream
20
+ format is stable and will not change between versions.
21
+ * Robust: The Snappy decompressor is designed not to crash in the face of
22
+ corrupted or malicious input.
23
+ * Free and open source software: Snappy is licensed under a BSD-type license.
24
+ For more information, see the included COPYING file.
25
+
26
+ Snappy has previously been called "Zippy" in some Google presentations
27
+ and the like.
28
+
29
+
30
+ Performance
31
+ ===========
32
+
33
+ Snappy is intended to be fast. On a single core of a Core i7 processor
34
+ in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at
35
+ about 500 MB/sec or more. (These numbers are for the slowest inputs in our
36
+ benchmark suite; others are much faster.) In our tests, Snappy usually
37
+ is faster than algorithms in the same class (e.g. LZO, LZF, FastLZ, QuickLZ,
38
+ etc.) while achieving comparable compression ratios.
39
+
40
+ Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x
41
+ for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and
42
+ other already-compressed data. Similar numbers for zlib in its fastest mode
43
+ are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are
44
+ capable of achieving yet higher compression rates, although usually at the
45
+ expense of speed. Of course, compression ratio will vary significantly with
46
+ the input.
47
+
48
+ Although Snappy should be fairly portable, it is primarily optimized
49
+ for 64-bit x86-compatible processors, and may run slower in other environments.
50
+ In particular:
51
+
52
+ - Snappy uses 64-bit operations in several places to process more data at
53
+ once than would otherwise be possible.
54
+ - Snappy assumes unaligned 32- and 64-bit loads and stores are cheap.
55
+ On some platforms, these must be emulated with single-byte loads
56
+ and stores, which is much slower.
57
+ - Snappy assumes little-endian throughout, and needs to byte-swap data in
58
+ several places if running on a big-endian platform.
59
+
60
+ Experience has shown that even heavily tuned code can be improved.
61
+ Performance optimizations, whether for 64-bit x86 or other platforms,
62
+ are of course most welcome; see "Contact", below.
63
+
64
+
65
+ Usage
66
+ =====
67
+
68
+ Note that Snappy, both the implementation and the main interface,
69
+ is written in C++. However, several third-party bindings to other languages
70
+ are available; see the Google Code page at http://code.google.com/p/snappy/
71
+ for more information. Also, if you want to use Snappy from C code, you can
72
+ use the included C bindings in snappy-c.h.
73
+
74
+ To use Snappy from your own C++ program, include the file "snappy.h" from
75
+ your calling file, and link against the compiled library.
76
+
77
+ There are many ways to call Snappy, but the simplest possible is
78
+
79
+ snappy::Compress(input.data(), input.size(), &output);
80
+
81
+ and similarly
82
+
83
+ snappy::Uncompress(input.data(), input.size(), &output);
84
+
85
+ where "input" and "output" are both instances of std::string.
86
+
87
+ There are other interfaces that are more flexible in various ways, including
88
+ support for custom (non-array) input sources. See the header file for more
89
+ information.
90
+
91
+
92
+ Tests and benchmarks
93
+ ====================
94
+
95
+ When you compile Snappy, snappy_unittest is compiled in addition to the
96
+ library itself. You do not need it to use the compressor from your own library,
97
+ but it contains several useful components for Snappy development.
98
+
99
+ First of all, it contains unit tests, verifying correctness on your machine in
100
+ various scenarios. If you want to change or optimize Snappy, please run the
101
+ tests to verify you have not broken anything. Note that if you have the
102
+ Google Test library installed, unit test behavior (especially failures) will be
103
+ significantly more user-friendly. You can find Google Test at
104
+
105
+ http://code.google.com/p/googletest/
106
+
107
+ You probably also want the gflags library for handling of command-line flags;
108
+ you can find it at
109
+
110
+ http://code.google.com/p/google-gflags/
111
+
112
+ In addition to the unit tests, snappy contains microbenchmarks used to
113
+ tune compression and decompression performance. These are automatically run
114
+ before the unit tests, but you can disable them using the flag
115
+ --run_microbenchmarks=false if you have gflags installed (otherwise you will
116
+ need to edit the source).
117
+
118
+ Finally, snappy can benchmark Snappy against a few other compression libraries
119
+ (zlib, LZO, LZF, FastLZ and QuickLZ), if they were detected at configure time.
120
+ To benchmark using a given file, give the compression algorithm you want to test
121
+ Snappy against (e.g. --zlib) and then a list of one or more file names on the
122
+ command line. The testdata/ directory contains the files used by the
123
+ microbenchmark, which should provide a reasonably balanced starting point for
124
+ benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they
125
+ are used to verify correctness in the presence of corrupted data in the unit
126
+ test.)
127
+
128
+
129
+ Contact
130
+ =======
131
+
132
+ Snappy is distributed through Google Code. For the latest version, a bug tracker,
133
+ and other information, see
134
+
135
+ http://code.google.com/p/snappy/
data/vendor/snappy/autogen.sh
@@ -0,0 +1,7 @@
1
+ #! /bin/sh -e
2
+ rm -rf autom4te.cache
3
+ aclocal -I m4
4
+ autoheader
5
+ libtoolize --copy
6
+ automake --add-missing --copy
7
+ autoconf
data/vendor/snappy/configure.ac
@@ -0,0 +1,133 @@
1
+ m4_define([snappy_major], [1])
2
+ m4_define([snappy_minor], [1])
3
+ m4_define([snappy_patchlevel], [2])
4
+
5
+ # Libtool shared library interface versions (current:revision:age)
6
+ # Update this value for every release! (A:B:C will map to foo.so.(A-C).C.B)
7
+ # http://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
8
+ m4_define([snappy_ltversion], [3:1:2])
9
+
10
+ AC_INIT([snappy], [snappy_major.snappy_minor.snappy_patchlevel])
11
+ AC_CONFIG_MACRO_DIR([m4])
12
+
13
+ # These are flags passed to automake (though they look like gcc flags!)
14
+ AM_INIT_AUTOMAKE([-Wall])
15
+
16
+ LT_INIT
17
+ AC_SUBST([LIBTOOL_DEPS])
18
+ AC_PROG_CXX
19
+ AC_LANG([C++])
20
+ AC_C_BIGENDIAN
21
+ AC_TYPE_SIZE_T
22
+ AC_TYPE_SSIZE_T
23
+ AC_CHECK_HEADERS([stdint.h stddef.h sys/mman.h sys/resource.h windows.h byteswap.h sys/byteswap.h sys/endian.h sys/time.h])
24
+
25
+ # Don't use AC_FUNC_MMAP, as it checks for mappings of already-mapped memory,
26
+ # which we don't need (and does not exist on Windows).
27
+ AC_CHECK_FUNC([mmap])
28
+
29
+ GTEST_LIB_CHECK([], [true], [true # Ignore; we can live without it.])
30
+
31
+ AC_ARG_WITH([gflags],
32
+ [AS_HELP_STRING(
33
+ [--with-gflags],
34
+ [use Google Flags package to enhance the unit test @<:@default=check@:>@])],
35
+ [],
36
+ [with_gflags=check])
37
+
38
+ if test "x$with_gflags" != "xno"; then
39
+ PKG_CHECK_MODULES(
40
+ [gflags],
41
+ [libgflags],
42
+ [AC_DEFINE([HAVE_GFLAGS], [1], [Use the gflags package for command-line parsing.])],
43
+ [if test "x$with_gflags" != "xcheck"; then
44
+ AC_MSG_FAILURE([--with-gflags was given, but test for gflags failed])
45
+ fi])
46
+ fi
47
+
48
+ # See if we have __builtin_expect.
49
+ # TODO: Use AC_CACHE.
50
+ AC_MSG_CHECKING([if the compiler supports __builtin_expect])
51
+
52
+ AC_TRY_COMPILE(, [
53
+ return __builtin_expect(1, 1) ? 1 : 0
54
+ ], [
55
+ snappy_have_builtin_expect=yes
56
+ AC_MSG_RESULT([yes])
57
+ ], [
58
+ snappy_have_builtin_expect=no
59
+ AC_MSG_RESULT([no])
60
+ ])
61
+ if test x$snappy_have_builtin_expect = xyes ; then
62
+ AC_DEFINE([HAVE_BUILTIN_EXPECT], [1], [Define to 1 if the compiler supports __builtin_expect.])
63
+ fi
64
+
65
+ # See if we have working count-trailing-zeros intrinsics.
66
+ # TODO: Use AC_CACHE.
67
+ AC_MSG_CHECKING([if the compiler supports __builtin_ctzll])
68
+
69
+ AC_TRY_COMPILE(, [
70
+ return (__builtin_ctzll(0x100000000LL) == 32) ? 1 : 0
71
+ ], [
72
+ snappy_have_builtin_ctz=yes
73
+ AC_MSG_RESULT([yes])
74
+ ], [
75
+ snappy_have_builtin_ctz=no
76
+ AC_MSG_RESULT([no])
77
+ ])
78
+ if test x$snappy_have_builtin_ctz = xyes ; then
79
+ AC_DEFINE([HAVE_BUILTIN_CTZ], [1], [Define to 1 if the compiler supports __builtin_ctz and friends.])
80
+ fi
81
+
82
+ # Other compression libraries; the unit test can use these for comparison
83
+ # if they are available. If they are not found, just ignore.
84
+ UNITTEST_LIBS=""
85
+ AC_DEFUN([CHECK_EXT_COMPRESSION_LIB], [
86
+ AH_CHECK_LIB([$1])
87
+ AC_CHECK_LIB(
88
+ [$1],
89
+ [$2],
90
+ [
91
+ AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_LIB$1))
92
+ UNITTEST_LIBS="-l$1 $UNITTEST_LIBS"
93
+ ],
94
+ [true]
95
+ )
96
+ ])
97
+ CHECK_EXT_COMPRESSION_LIB([z], [zlibVersion])
98
+ CHECK_EXT_COMPRESSION_LIB([lzo2], [lzo1x_1_15_compress])
99
+ CHECK_EXT_COMPRESSION_LIB([lzf], [lzf_compress])
100
+ CHECK_EXT_COMPRESSION_LIB([fastlz], [fastlz_compress])
101
+ CHECK_EXT_COMPRESSION_LIB([quicklz], [qlz_compress])
102
+ AC_SUBST([UNITTEST_LIBS])
103
+
104
+ # These are used by snappy-stubs-public.h.in.
105
+ if test "$ac_cv_header_stdint_h" = "yes"; then
106
+ AC_SUBST([ac_cv_have_stdint_h], [1])
107
+ else
108
+ AC_SUBST([ac_cv_have_stdint_h], [0])
109
+ fi
110
+ if test "$ac_cv_header_stddef_h" = "yes"; then
111
+ AC_SUBST([ac_cv_have_stddef_h], [1])
112
+ else
113
+ AC_SUBST([ac_cv_have_stddef_h], [0])
114
+ fi
115
+ if test "$ac_cv_header_sys_uio_h" = "yes"; then
116
+ AC_SUBST([ac_cv_have_sys_uio_h], [1])
117
+ else
118
+ AC_SUBST([ac_cv_have_sys_uio_h], [0])
119
+ fi
120
+
121
+ # Export the version to snappy-stubs-public.h.
122
+ SNAPPY_MAJOR="snappy_major"
123
+ SNAPPY_MINOR="snappy_minor"
124
+ SNAPPY_PATCHLEVEL="snappy_patchlevel"
125
+
126
+ AC_SUBST([SNAPPY_MAJOR])
127
+ AC_SUBST([SNAPPY_MINOR])
128
+ AC_SUBST([SNAPPY_PATCHLEVEL])
129
+ AC_SUBST([SNAPPY_LTVERSION], snappy_ltversion)
130
+
131
+ AC_CONFIG_HEADERS([config.h])
132
+ AC_CONFIG_FILES([Makefile snappy-stubs-public.h])
133
+ AC_OUTPUT
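The libtool comment in configure.ac above states that a version-info triple A:B:C maps to foo.so.(A-C).C.B. The arithmetic can be sketched as below. The function name `libtool_so_suffix` is hypothetical, and the mapping is the Linux-style naming quoted in that comment; other platforms may name shared libraries differently.

```python
def libtool_so_suffix(version_info: str) -> str:
    """Map a libtool 'current:revision:age' triple to the numeric suffix
    described in the configure.ac comment: (current - age).age.revision."""
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        # libtool requires age <= current; reject anything else.
        raise ValueError("age must not exceed current")
    return f"{current - age}.{age}.{revision}"


# The value defined in this configure.ac is 3:1:2.
suffix = libtool_so_suffix("3:1:2")
print("libsnappy.so." + suffix)
```

For the `3:1:2` value defined in this file, the rule gives the suffix `1.2.1`.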
data/vendor/snappy/format_description.txt
@@ -0,0 +1,110 @@
1
+ Snappy compressed format description
2
+ Last revised: 2011-10-05
3
+
4
+
5
+ This is not a formal specification, but should suffice to explain most
6
+ relevant parts of how the Snappy format works. It is originally based on
7
+ text by Zeev Tarantov.
8
+
9
+ Snappy is a LZ77-type compressor with a fixed, byte-oriented encoding.
10
+ There is no entropy encoder backend nor framing layer -- the latter is
11
+ assumed to be handled by other parts of the system.
12
+
13
+ This document only describes the format, not how the Snappy compressor or
14
+ decompressor actually works. The correctness of the decompressor should not
15
+ depend on implementation details of the compressor, and vice versa.
16
+
17
+
18
+ 1. Preamble
19
+
20
+ The stream starts with the uncompressed length (up to a maximum of 2^32 - 1),
21
+ stored as a little-endian varint. Varints consist of a series of bytes,
22
+ where the lower 7 bits are data and the upper bit is set iff there are
23
+ more bytes to be read. In other words, an uncompressed length of 64 would
24
+ be stored as 0x40, and an uncompressed length of 2097150 (0x1FFFFE)
25
+ would be stored as 0xFE 0xFF 0x7F.
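The varint layout described in the preamble can be illustrated with a minimal sketch. The function names `encode_varint` and `decode_varint` are hypothetical and are not part of Snappy's API; the code only follows the textual description (7 data bits per byte, upper bit set when more bytes follow).

```python
def encode_varint(value: int) -> bytes:
    """Encode a length as a little-endian varint (7 data bits per byte)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("length must fit in 32 bits")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # upper bit set: more bytes follow
        else:
            out.append(low7)         # upper bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next_pos).

    Raises IndexError if the buffer ends before the last varint byte.
    """
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# The two examples from the text: 64 and 2097150 (0x1FFFFE).
print(encode_varint(64).hex(), encode_varint(2097150).hex())
```

A production decoder would also cap the number of bytes read and reject values above 2^32 - 1; this sketch omits that for brevity.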
26
+
27
+
28
+ 2. The compressed stream itself
29
+
30
+ There are two types of elements in a Snappy stream: Literals and
31
+ copies (backreferences). There is no restriction on the order of elements,
32
+ except that the stream naturally cannot start with a copy. (Having
33
+ two literals in a row is never optimal from a compression point of
34
+ view, but nevertheless fully permitted.) Each element starts with a tag byte,
35
+ and the lower two bits of this tag byte signal what type of element will
36
+ follow:
37
+
38
+ 00: Literal
39
+ 01: Copy with 1-byte offset
40
+ 10: Copy with 2-byte offset
41
+ 11: Copy with 4-byte offset
42
+
43
+ The interpretation of the upper six bits is element-dependent.
44
+
45
+
46
+ 2.1. Literals (00)
47
+
48
+ Literals are uncompressed data stored directly in the byte stream.
49
+ The literal length is stored differently depending on the length
50
+ of the literal:
51
+
52
+ - For literals up to and including 60 bytes in length, the upper
53
+ six bits of the tag byte contain (len-1). The literal follows
54
+ immediately thereafter in the bytestream.
55
+ - For longer literals, the (len-1) value is stored after the tag byte,
56
+ little-endian. The upper six bits of the tag byte describe how
57
+ many bytes are used for the length; 60, 61, 62 or 63 for
58
+ 1-4 bytes, respectively. The literal itself follows after the
59
+ length.
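The two literal encodings in section 2.1 could be sketched as follows. `encode_literal` is a hypothetical helper written only from the description above; it is not how the Snappy compressor itself is implemented.

```python
def encode_literal(data: bytes) -> bytes:
    """Emit one literal element: tag byte, optional length bytes, payload."""
    n = len(data) - 1  # the format stores (len - 1)
    if n < 0:
        raise ValueError("a literal must contain at least one byte")
    if n < 60:
        # Short form: (len - 1) lives in the upper six bits; low bits 00.
        return bytes([n << 2]) + data
    # Long form: (len - 1) follows the tag as 1-4 little-endian bytes,
    # and the upper six bits hold 60, 61, 62 or 63 respectively.
    extra = (n.bit_length() + 7) // 8
    if extra > 4:
        raise ValueError("literal too long for a 4-byte length")
    tag = (59 + extra) << 2
    return bytes([tag]) + n.to_bytes(extra, "little") + data


short = encode_literal(b"xab")      # 3 bytes: short form
long_ = encode_literal(bytes(100))  # 100 bytes: one extra length byte
print(short.hex(), long_[:2].hex())
```

Here a 3-byte literal gets the tag 0x08 (2 shifted left by two bits), while a 100-byte literal gets the tag 0xF0 (60 in the upper six bits) followed by the single length byte 0x63 (99).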
60
+
61
+
62
+ 2.2. Copies
63
+
64
+ Copies are references back into previous decompressed data, telling
65
+ the decompressor to reuse data it has previously decoded.
66
+ They encode two values: The _offset_, saying how many bytes back
67
+ from the current position to read, and the _length_, how many bytes
68
+ to copy. Offsets of zero can be encoded, but are not legal;
69
+ similarly, it is possible to encode backreferences that would
70
+ go past the end of the block (offset > current decompressed position),
71
+ which is also nonsensical and thus not allowed.
72
+
73
+ As in most LZ77-based compressors, the length can be larger than the offset,
74
+ yielding a form of run-length encoding (RLE). For instance,
75
+ "xababab" could be encoded as
76
+
77
+ <literal: "xab"> <copy: offset=2 length=4>
78
+
79
+ Note that since the current Snappy compressor works in 32 kB
80
+ blocks and does not do matching across blocks, it will never produce
81
+ a bitstream with offsets larger than about 32768. However, the
82
+ decompressor should not rely on this, as it may change in the future.
83
+
84
+ There are several different kinds of copy elements, depending on
85
+ the number of bytes to be copied (length), and how far back the
86
+ data to be copied is (offset).
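The "xababab" example from section 2.2 shows why a copy whose length exceeds its offset must be expanded byte by byte. A minimal sketch, with `apply_copy` as a hypothetical name and the two illegal cases from the text rejected explicitly:

```python
def apply_copy(out: bytearray, offset: int, length: int) -> None:
    """Append `length` bytes read from `offset` bytes back in `out`."""
    if offset <= 0 or offset > len(out):
        # Offset zero and offsets reaching before the start are not legal.
        raise ValueError("invalid copy offset")
    start = len(out) - offset
    for i in range(length):
        # Reading one byte at a time lets the copy consume bytes that it
        # appended itself, which yields the run-length-style repetition.
        out.append(out[start + i])


out = bytearray(b"xab")              # <literal: "xab">
apply_copy(out, offset=2, length=4)  # <copy: offset=2 length=4>
print(out.decode())
```

A slice copy such as `out += out[start:start + length]` would not work here, because the source range extends past the current end of the output when the length is larger than the offset.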
87
+
88
+
89
+ 2.2.1. Copy with 1-byte offset (01)
90
+
91
+ These elements can encode lengths between [4..11] bytes and offsets
92
+ between [0..2047] bytes. (len-4) occupies three bits and is stored
93
+ in bits [2..4] of the tag byte. The offset occupies 11 bits, of which the
94
+ upper three are stored in the upper three bits ([5..7]) of the tag byte,
95
+ and the lower eight are stored in a byte following the tag byte.
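The bit layout of the 1-byte-offset copy can be unpacked as in the sketch below. `decode_copy1` is a hypothetical name; the bit positions are taken directly from the description in section 2.2.1.

```python
def decode_copy1(tag: int, next_byte: int) -> tuple[int, int]:
    """Decode a copy with 1-byte offset; return (offset, length)."""
    if tag & 0b11 != 0b01:
        raise ValueError("not a copy with 1-byte offset")
    length = ((tag >> 2) & 0b111) + 4       # bits [2..4] hold (len - 4)
    offset = ((tag >> 5) << 8) | next_byte  # bits [5..7] are the top 3 of 11
    return offset, length


# Tag 0x2D = 0b001_011_01: offset high bits 001, (len - 4) = 3, type 01.
print(decode_copy1(0x2D, 0x2C))
```

With the tag 0x2D and the following byte 0x2C, the offset is 0x12C (300) and the length is 7. A decoder would still need to reject an offset of zero, as noted in section 2.2.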
96
+
97
+
98
+ 2.2.2. Copy with 2-byte offset (10)
99
+
100
+ These elements can encode lengths between [1..64] and offsets from
101
+ [0..65535]. (len-1) occupies six bits and is stored in the upper
102
+ six bits ([2..7]) of the tag byte. The offset is stored as a
103
+ little-endian 16-bit integer in the two bytes following the tag byte.
104
+
105
+
106
+ 2.2.3. Copy with 4-byte offset (11)
107
+
108
+ These are like the copies with 2-byte offsets (see previous subsection),
109
+ except that the offset is stored as a 32-bit integer instead of a
110
+ 16-bit integer (and thus will occupy four bytes).
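The 2-byte and 4-byte offset copies differ only in the width of the offset field, so one routine can decode both. The sketch below uses the hypothetical name `decode_wide_copy` and assumes the 32-bit offset is little-endian like the 16-bit one, which the text implies but does not state outright.

```python
def decode_wide_copy(buf: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Decode a copy with 2- or 4-byte offset at buf[pos].

    Returns (offset, length, next_pos).
    """
    tag = buf[pos]
    kind = tag & 0b11
    if kind == 0b10:
        width = 2
    elif kind == 0b11:
        width = 4
    else:
        raise ValueError("not a copy with 2- or 4-byte offset")
    raw = buf[pos + 1 : pos + 1 + width]
    if len(raw) != width:
        raise ValueError("truncated copy element")
    length = (tag >> 2) + 1                 # upper six bits hold (len - 1)
    offset = int.from_bytes(raw, "little")  # assumed little-endian for both
    return offset, length, pos + 1 + width


two_byte = bytes([0x26, 0xE8, 0x03])               # len 10, offset 1000
four_byte = bytes([0xFF, 0x70, 0x11, 0x01, 0x00])  # len 64, offset 70000
print(decode_wide_copy(two_byte), decode_wide_copy(four_byte))
```

As with the other copy types, the caller must still reject a zero offset or an offset that reaches before the start of the decompressed data.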