RubyGems - snappy - Versions diffs - 0.0.13 → 0.1.0 - Mend

snappy 0.0.13 → 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43)

checksums.yaml +5 -5
data/.travis.yml +28 -1
data/Gemfile +6 -1
data/README.md +28 -4
data/Rakefile +1 -0
data/ext/extconf.rb +21 -24
data/lib/snappy.rb +3 -1
data/lib/snappy/hadoop.rb +22 -0
data/lib/snappy/hadoop/reader.rb +58 -0
data/lib/snappy/hadoop/writer.rb +51 -0
data/lib/snappy/reader.rb +11 -7
data/lib/snappy/shim.rb +30 -0
data/lib/snappy/version.rb +3 -1
data/lib/snappy/writer.rb +8 -9
data/smoke.sh +8 -0
data/snappy.gemspec +6 -30
data/test/hadoop/test-snappy-hadoop-reader.rb +103 -0
data/test/hadoop/test-snappy-hadoop-writer.rb +48 -0
data/test/test-snappy-hadoop.rb +22 -0
data/vendor/snappy/AUTHORS +1 -0
data/vendor/snappy/CMakeLists.txt +174 -0
data/vendor/snappy/CONTRIBUTING.md +26 -0
data/vendor/snappy/COPYING +54 -0
data/vendor/snappy/NEWS +180 -0
data/vendor/snappy/README.md +149 -0
data/vendor/snappy/cmake/SnappyConfig.cmake +1 -0
data/vendor/snappy/cmake/config.h.in +62 -0
data/vendor/snappy/format_description.txt +110 -0
data/vendor/snappy/framing_format.txt +135 -0
data/vendor/snappy/snappy-c.cc +90 -0
data/vendor/snappy/snappy-c.h +138 -0
data/vendor/snappy/snappy-internal.h +224 -0
data/vendor/snappy/snappy-sinksource.cc +104 -0
data/vendor/snappy/snappy-sinksource.h +182 -0
data/vendor/snappy/snappy-stubs-internal.cc +42 -0
data/vendor/snappy/snappy-stubs-internal.h +561 -0
data/vendor/snappy/snappy-stubs-public.h.in +94 -0
data/vendor/snappy/snappy-test.cc +612 -0
data/vendor/snappy/snappy-test.h +573 -0
data/vendor/snappy/snappy.cc +1515 -0
data/vendor/snappy/snappy.h +203 -0
data/vendor/snappy/snappy_unittest.cc +1410 -0
metadata +38 -46

data/vendor/snappy/NEWS ADDED

@@ -0,0 +1,180 @@
+Snappy v1.1.7, August 24th 2017:
+  * Improved CMake build support for 64-bit Linux distributions.
+  * MSVC builds now use MSVC-specific intrinsics that map to clzll.
+  * ARM64 (AArch64) builds use the code paths optimized for 64-bit processors.
+Snappy v1.1.6, July 12th 2017:
+This is a re-release of v1.1.5 with proper SONAME / SOVERSION values.
+Snappy v1.1.5, June 28th 2017:
+This release has broken SONAME / SOVERSION values. Users of snappy as a shared
+library should avoid 1.1.5 and use 1.1.6 instead. SONAME / SOVERSION errors will
+manifest as the dynamic library loader complaining that it cannot find snappy's
+shared library file (libsnappy.so / libsnappy.dylib), or that the library it
+found does not have the required version. 1.1.6 has the same code as 1.1.5, but
+carries build configuration fixes for the issues above.
+  * Add CMake build support. The autoconf build support is now deprecated, and
+    will be removed in the next release.
+  * Add AppVeyor configuration, for Windows CI coverage.
+  * Small performance improvement on little-endian PowerPC.
+  * Small performance improvement on LLVM with position-independent executables.
+  * Fix a few issues with various build environments.
+Snappy v1.1.4, January 25th 2017:
+  * Fix a 1% performance regression when snappy is used in PIE executables.
+  * Improve compression performance by 5%.
+  * Improve decompression performance by 20%.
+Snappy v1.1.3, July 6th 2015:
+This is the first release to be done from GitHub, which means that
+some minor things like the ChangeLog format has changed (git log
+format instead of svn log).
+  * Add support for Uncompress() from a Source to a Sink.
+  * Various minor changes to improve MSVC support; in particular,
+    the unit tests now compile and run under MSVC.
+Snappy v1.1.2, February 28th 2014:
+This is a maintenance release with no changes to the actual library
+source code.
+  * Stop distributing benchmark data files that have unclear
+    or unsuitable licensing.
+  * Add support for padding chunks in the framing format.
+Snappy v1.1.1, October 15th 2013:
+  * Add support for uncompressing to iovecs (scatter I/O).
+    The bulk of this patch was contributed by Mohit Aron.
+  * Speed up decompression by ~2%; much more so (~13-20%) on
+    a few benchmarks on given compilers and CPUs.
+  * Fix a few issues with MSVC compilation.
+  * Support truncated test data in the benchmark.
+Snappy v1.1.0, January 18th 2013:
+  * Snappy now uses 64 kB block size instead of 32 kB. On average,
+    this means it compresses about 3% denser (more so for some
+    inputs), at the same or better speeds.
+  * libsnappy no longer depends on iostream.
+  * Some small performance improvements in compression on x86
+    (0.5–1%).
+  * Various portability fixes for ARM-based platforms, for MSVC,
+    and for GNU/Hurd.
+Snappy v1.0.5, February 24th 2012:
+  * More speed improvements. Exactly how big will depend on
+    the architecture:
+    - 3–10% faster decompression for the base case (x86-64).
+    - ARMv7 and higher can now use unaligned accesses,
+      and will see about 30% faster decompression and
+      20–40% faster compression.
+    - 32-bit platforms (ARM and 32-bit x86) will see 2–5%
+      faster compression.
+    These are all cumulative (e.g., ARM gets all three speedups).
+  * Fixed an issue where the unit test would crash on system
+    with less than 256 MB address space available,
+    e.g. some embedded platforms.
+  * Added a framing format description, for use over e.g. HTTP,
+    or for a command-line compressor. We do not have any
+    implementations of this at the current point, but there seems
+    to be enough of a general interest in the topic.
+    Also make the format description slightly clearer.
+  * Remove some compile-time warnings in -Wall
+    (mostly signed/unsigned comparisons), for easier embedding
+    into projects that use -Wall -Werror.
+Snappy v1.0.4, September 15th 2011:
+  * Speeded up the decompressor somewhat; typically about 2–8%
+    for Core i7, in 64-bit mode (comparable for Opteron).
+    Somewhat more for some tests, almost no gain for others.
+  * Make Snappy compile on certain platforms it didn't before
+    (Solaris with SunPro C++, HP-UX, AIX).
+  * Correct some minor errors in the format description.
+Snappy v1.0.3, June 2nd 2011:
+  * Speeded up the decompressor somewhat; about 3-6% for Core 2,
+    6-13% for Core i7, and 5-12% for Opteron (all in 64-bit mode).
+  * Added compressed format documentation. This text is new,
+    but an earlier version from Zeev Tarantov was used as reference.
+  * Only link snappy_unittest against -lz and other autodetected
+    libraries, not libsnappy.so (which doesn't need any such dependency).
+  * Fixed some display issues in the microbenchmarks, one of which would
+    frequently make the test crash on GNU/Hurd.
+Snappy v1.0.2, April 29th 2011:
+  * Relicense to a BSD-type license.
+  * Added C bindings, contributed by Martin Gieseking.
+  * More Win32 fixes, in particular for MSVC.
+  * Replace geo.protodata with a newer version.
+  * Fix timing inaccuracies in the unit test when comparing Snappy
+    to other algorithms.
+Snappy v1.0.1, March 25th 2011:
+This is a maintenance release, mostly containing minor fixes.
+There is no new functionality. The most important fixes include:
+  * The COPYING file and all licensing headers now correctly state that
+    Snappy is licensed under the Apache 2.0 license.
+  * snappy_unittest should now compile natively under Windows,
+    as well as on embedded systems with no mmap().
+  * Various autotools nits have been fixed.
+Snappy v1.0, March 17th 2011:
+  * Initial version.

data/vendor/snappy/README.md ADDED

@@ -0,0 +1,149 @@
+Snappy, a fast compressor/decompressor.
+Introduction
+============
+Snappy is a compression/decompression library. It does not aim for maximum
+compression, or compatibility with any other compression library; instead,
+it aims for very high speeds and reasonable compression. For instance,
+compared to the fastest mode of zlib, Snappy is an order of magnitude faster
+for most inputs, but the resulting compressed files are anywhere from 20% to
+100% bigger. (For more information, see "Performance", below.)
+Snappy has the following properties:
+ * Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code.
+   See "Performance" below.
+ * Stable: Over the last few years, Snappy has compressed and decompressed
+   petabytes of data in Google's production environment. The Snappy bitstream
+   format is stable and will not change between versions.
+ * Robust: The Snappy decompressor is designed not to crash in the face of
+   corrupted or malicious input.
+ * Free and open source software: Snappy is licensed under a BSD-type license.
+   For more information, see the included COPYING file.
+Snappy has previously been called "Zippy" in some Google presentations
+and the like.
+Performance
+===========
+Snappy is intended to be fast. On a single core of a Core i7 processor
+in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at
+about 500 MB/sec or more. (These numbers are for the slowest inputs in our
+benchmark suite; others are much faster.) In our tests, Snappy usually
+is faster than algorithms in the same class (e.g. LZO, LZF, QuickLZ,
+etc.) while achieving comparable compression ratios.
+Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x
+for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and
+other already-compressed data. Similar numbers for zlib in its fastest mode
+are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are
+capable of achieving yet higher compression rates, although usually at the
+expense of speed. Of course, compression ratio will vary significantly with
+the input.
+Although Snappy should be fairly portable, it is primarily optimized
+for 64-bit x86-compatible processors, and may run slower in other environments.
+In particular:
+ - Snappy uses 64-bit operations in several places to process more data at
+   once than would otherwise be possible.
+ - Snappy assumes unaligned 32- and 64-bit loads and stores are cheap.
+   On some platforms, these must be emulated with single-byte loads
+   and stores, which is much slower.
+ - Snappy assumes little-endian throughout, and needs to byte-swap data in
+   several places if running on a big-endian platform.
+Experience has shown that even heavily tuned code can be improved.
+Performance optimizations, whether for 64-bit x86 or other platforms,
+are of course most welcome; see "Contact", below.
+Building
+========
+CMake is supported and autotools will soon be deprecated.
+You need CMake 3.4 or above to build:
+  mkdir build
+  cd build && cmake ../ && make
+Usage
+=====
+Note that Snappy, both the implementation and the main interface,
+is written in C++. However, several third-party bindings to other languages
+are available; see the home page at http://google.github.io/snappy/
+for more information. Also, if you want to use Snappy from C code, you can
+use the included C bindings in snappy-c.h.
+To use Snappy from your own C++ program, include the file "snappy.h" from
+your calling file, and link against the compiled library.
+There are many ways to call Snappy, but the simplest possible is
+  snappy::Compress(input.data(), input.size(), &output);
+and similarly
+  snappy::Uncompress(input.data(), input.size(), &output);
+where "input" and "output" are both instances of std::string.
+There are other interfaces that are more flexible in various ways, including
+support for custom (non-array) input sources. See the header file for more
+information.
+Tests and benchmarks
+====================
+When you compile Snappy, snappy_unittest is compiled in addition to the
+library itself. You do not need it to use the compressor from your own library,
+but it contains several useful components for Snappy development.
+First of all, it contains unit tests, verifying correctness on your machine in
+various scenarios. If you want to change or optimize Snappy, please run the
+tests to verify you have not broken anything. Note that if you have the
+Google Test library installed, unit test behavior (especially failures) will be
+significantly more user-friendly. You can find Google Test at
+  http://github.com/google/googletest
+You probably also want the gflags library for handling of command-line flags;
+you can find it at
+  http://gflags.github.io/gflags/
+In addition to the unit tests, snappy contains microbenchmarks used to
+tune compression and decompression performance. These are automatically run
+before the unit tests, but you can disable them using the flag
+--run_microbenchmarks=false if you have gflags installed (otherwise you will
+need to edit the source).
+Finally, snappy can benchmark Snappy against a few other compression libraries
+(zlib, LZO, LZF, and QuickLZ), if they were detected at configure time.
+To benchmark using a given file, give the compression algorithm you want to test
+Snappy against (e.g. --zlib) and then a list of one or more file names on the
+command line. The testdata/ directory contains the files used by the
+microbenchmark, which should provide a reasonably balanced starting point for
+benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they
+are used to verify correctness in the presence of corrupted data in the unit
+test.)
+Contact
+=======
+Snappy is distributed through GitHub. For the latest version, a bug tracker,
+and other information, see
+  http://google.github.io/snappy/
+or the repository at
+  https://github.com/google/snappy

data/vendor/snappy/cmake/SnappyConfig.cmake ADDED

@@ -0,0 +1 @@
+include("${CMAKE_CURRENT_LIST_DIR}/SnappyTargets.cmake")

data/vendor/snappy/cmake/config.h.in ADDED

@@ -0,0 +1,62 @@
+#ifndef THIRD_PARTY_SNAPPY_OPENSOURCE_CMAKE_CONFIG_H_
+#define THIRD_PARTY_SNAPPY_OPENSOURCE_CMAKE_CONFIG_H_
+/* Define to 1 if the compiler supports __builtin_ctz and friends. */
+#cmakedefine HAVE_BUILTIN_CTZ 1
+/* Define to 1 if the compiler supports __builtin_expect. */
+#cmakedefine HAVE_BUILTIN_EXPECT 1
+/* Define to 1 if you have the <byteswap.h> header file. */
+#cmakedefine HAVE_BYTESWAP_H 1
+/* Define to 1 if you have a definition for mmap() in <sys/mman.h>. */
+#cmakedefine HAVE_FUNC_MMAP 1
+/* Define to 1 if you have a definition for sysconf() in <unistd.h>. */
+#cmakedefine HAVE_FUNC_SYSCONF 1
+/* Define to 1 to use the gflags package for command-line parsing. */
+#cmakedefine HAVE_GFLAGS 1
+/* Define to 1 if you have Google Test. */
+#cmakedefine HAVE_GTEST 1
+/* Define to 1 if you have the `lzo2' library (-llzo2). */
+#cmakedefine HAVE_LIBLZO2 1
+/* Define to 1 if you have the `z' library (-lz). */
+#cmakedefine HAVE_LIBZ 1
+/* Define to 1 if you have the <stddef.h> header file. */
+#cmakedefine HAVE_STDDEF_H 1
+/* Define to 1 if you have the <stdint.h> header file. */
+#cmakedefine HAVE_STDINT_H 1
+/* Define to 1 if you have the <sys/endian.h> header file. */
+#cmakedefine HAVE_SYS_ENDIAN_H 1
+/* Define to 1 if you have the <sys/mman.h> header file. */
+#cmakedefine HAVE_SYS_MMAN_H 1
+/* Define to 1 if you have the <sys/resource.h> header file. */
+#cmakedefine HAVE_SYS_RESOURCE_H 1
+/* Define to 1 if you have the <sys/time.h> header file. */
+#cmakedefine HAVE_SYS_TIME_H 1
+/* Define to 1 if you have the <sys/uio.h> header file. */
+#cmakedefine HAVE_SYS_UIO_H 1
+/* Define to 1 if you have the <unistd.h> header file. */
+#cmakedefine HAVE_UNISTD_H 1
+/* Define to 1 if you have the <windows.h> header file. */
+#cmakedefine HAVE_WINDOWS_H 1
+/* Define to 1 if your processor stores words with the most significant byte
+   first (like Motorola and SPARC, unlike Intel and VAX). */
+#cmakedefine SNAPPY_IS_BIG_ENDIAN 1
+#endif  // THIRD_PARTY_SNAPPY_OPENSOURCE_CMAKE_CONFIG_H_

data/vendor/snappy/format_description.txt ADDED

@@ -0,0 +1,110 @@
+Snappy compressed format description
+Last revised: 2011-10-05
+This is not a formal specification, but should suffice to explain most
+relevant parts of how the Snappy format works. It is originally based on
+text by Zeev Tarantov.
+Snappy is a LZ77-type compressor with a fixed, byte-oriented encoding.
+There is no entropy encoder backend nor framing layer -- the latter is
+assumed to be handled by other parts of the system.
+This document only describes the format, not how the Snappy compressor nor
+decompressor actually works. The correctness of the decompressor should not
+depend on implementation details of the compressor, and vice versa.
+1. Preamble
+The stream starts with the uncompressed length (up to a maximum of 2^32 - 1),
+stored as a little-endian varint. Varints consist of a series of bytes,
+where the lower 7 bits are data and the upper bit is set iff there are
+more bytes to be read. In other words, an uncompressed length of 64 would
+be stored as 0x40, and an uncompressed length of 2097150 (0x1FFFFE)
+would be stored as 0xFE 0xFF 0x7F.
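The preamble encoding described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the vendored library; the function names `encode_varint` and `decode_varint` are hypothetical, and the 32-bit range check simply mirrors the stated maximum of 2^32 - 1.

```python
def encode_varint(n: int) -> bytes:
    """Encode n as a little-endian base-128 varint (hypothetical helper)."""
    if not 0 <= n < 2**32:
        raise ValueError("the preamble length must fit in 32 bits")
    out = bytearray()
    while True:
        low7 = n & 0x7F          # lower 7 bits carry data
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # upper bit set: more bytes follow
        else:
            out.append(low7)         # upper bit clear: this is the last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at buf[pos]."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


# The two examples from the text above.
print(encode_varint(64).hex())        # 40
print(encode_varint(0x1FFFFE).hex())  # feff7f
```

Returning the next position from `decode_varint` lets a caller continue parsing the compressed stream directly after the preamble.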
+2. The compressed stream itself
+There are two types of elements in a Snappy stream: Literals and
+copies (backreferences). There is no restriction on the order of elements,
+except that the stream naturally cannot start with a copy. (Having
+two literals in a row is never optimal from a compression point of
+view, but nevertheless fully permitted.) Each element starts with a tag byte,
+and the lower two bits of this tag byte signal what type of element will
+follow:
+  00: Literal
+  01: Copy with 1-byte offset
+  10: Copy with 2-byte offset
+  11: Copy with 4-byte offset
+The interpretation of the upper six bits are element-dependent.
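As a small illustration of the tag byte layout listed above, the following Python sketch splits a tag into its element type (lower two bits) and the remaining upper six bits. The names `ELEMENT_TYPES` and `split_tag` are hypothetical and chosen for this example only.

```python
# Element types keyed by the lower two bits of the tag byte.
ELEMENT_TYPES = {
    0b00: "literal",
    0b01: "copy, 1-byte offset",
    0b10: "copy, 2-byte offset",
    0b11: "copy, 4-byte offset",
}


def split_tag(tag: int) -> tuple[str, int]:
    """Return (element type, upper six bits) for one tag byte."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("a tag is a single byte")
    return ELEMENT_TYPES[tag & 0b11], tag >> 2


print(split_tag(0x08))  # ('literal', 2)
```

How the upper six bits are interpreted depends on the element type, so this helper deliberately returns them uninterpreted.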
+2.1. Literals (00)
+Literals are uncompressed data stored directly in the byte stream.
+The literal length is stored differently depending on the length
+of the literal:
+ - For literals up to and including 60 bytes in length, the upper
+   six bits of the tag byte contain (len-1). The literal follows
+   immediately thereafter in the bytestream.
+ - For longer literals, the (len-1) value is stored after the tag byte,
+   little-endian. The upper six bits of the tag byte describe how
+   many bytes are used for the length; 60, 61, 62 or 63 for
+   1-4 bytes, respectively. The literal itself follows after the
+   length.
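The two literal length encodings above can be sketched as follows. This is a minimal illustration under the rules just stated, not the library's implementation; `encode_literal` and `decode_literal` are hypothetical names, and the sketch assumes the shortest length field that fits is used.

```python
def encode_literal(data: bytes) -> bytes:
    """Encode one literal element: tag byte, optional length bytes, data."""
    if not data:
        raise ValueError("a literal must contain at least one byte")
    n = len(data) - 1                      # the format stores (len-1)
    if n < 60:
        return bytes([n << 2]) + data      # length fits in the tag's upper six bits
    nbytes = (n.bit_length() + 7) // 8     # assumption: use the shortest field
    if nbytes > 4:
        raise ValueError("literal too long for a 4-byte length field")
    tag = (59 + nbytes) << 2               # 60, 61, 62 or 63 for 1-4 bytes
    return bytes([tag]) + n.to_bytes(nbytes, "little") + data


def decode_literal(buf: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Return (literal bytes, next_position) for the literal at buf[pos]."""
    tag = buf[pos]
    if tag & 0b11 != 0b00:
        raise ValueError("not a literal tag")
    upper = tag >> 2
    pos += 1
    if upper < 60:
        length = upper + 1
    else:
        nbytes = upper - 59
        length = int.from_bytes(buf[pos:pos + nbytes], "little") + 1
        pos += nbytes
    return buf[pos:pos + length], pos + length


print(encode_literal(b"xab").hex())  # 08786162
```

A 61-byte literal is the first to need a separate length byte: its tag is 0xF0 (upper six bits 60) followed by 0x3C, which is (61-1).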
+2.2. Copies
+Copies are references back into previous decompressed data, telling
+the decompressor to reuse data it has previously decoded.
+They encode two values: The _offset_, saying how many bytes back
+from the current position to read, and the _length_, how many bytes
+to copy. Offsets of zero can be encoded, but are not legal;
+similarly, it is possible to encode backreferences that would
+go past the end of the block (offset > current decompressed position),
+which is also nonsensical and thus not allowed.
+As in most LZ77-based compressors, the length can be larger than the offset,
+yielding a form of run-length encoding (RLE). For instance,
+"xababab" could be encoded as
+  <literal: "xab"> <copy: offset=2 length=4>
+Note that since the current Snappy compressor works in 32 kB
+blocks and does not do matching across blocks, it will never produce
+a bitstream with offsets larger than about 32768. However, the
+decompressor should not rely on this, as it may change in the future.
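The "xababab" example above, where the copy length exceeds the offset, works because the copy is applied one byte at a time, so bytes written earlier in the same copy become readable later in it. The Python sketch below illustrates this; `apply_copy` is a hypothetical helper, and the byte-by-byte loop is chosen for clarity rather than speed.

```python
def apply_copy(out: bytearray, offset: int, length: int) -> None:
    """Append `length` bytes to `out`, reading `offset` bytes back from the end."""
    if offset <= 0 or offset > len(out):
        # Offsets of zero, or reaching before the start of the output, are not legal.
        raise ValueError("invalid copy offset")
    for _ in range(length):
        # out grows on every iteration, so out[-offset] can refer to a byte
        # that this same copy appended a moment ago (the RLE effect).
        out.append(out[-offset])


out = bytearray(b"xab")               # <literal: "xab">
apply_copy(out, offset=2, length=4)   # <copy: offset=2 length=4>
print(out.decode())                   # xababab
```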
+There are several different kinds of copy elements, depending on
+the amount of bytes to be copied (length), and how far back the
+data to be copied is (offset).
+2.2.1. Copy with 1-byte offset (01)
+These elements can encode lengths between [4..11] bytes and offsets
+between [0..2047] bytes. (len-4) occupies three bits and is stored
+in bits [2..4] of the tag byte. The offset occupies 11 bits, of which the
+upper three are stored in the upper three bits ([5..7]) of the tag byte,
+and the lower eight are stored in a byte following the tag byte.
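The bit layout of the 1-byte-offset copy can be sketched as below. This is an illustration of the layout just described, with hypothetical function names `encode_copy1` and `decode_copy1`. The encoder rejects offset 0 because the format states that it can be encoded but is not legal.

```python
def encode_copy1(offset: int, length: int) -> bytes:
    """Encode a copy element with a 1-byte offset (tag type 01)."""
    if not 4 <= length <= 11:
        raise ValueError("length must be in [4..11]")
    if not 1 <= offset <= 2047:
        raise ValueError("offset must be in [1..2047]")
    tag = 0b01                      # bits [0..1]: element type
    tag |= (length - 4) << 2        # bits [2..4]: (len-4)
    tag |= (offset >> 8) << 5       # bits [5..7]: upper three offset bits
    return bytes([tag, offset & 0xFF])


def decode_copy1(buf: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Return (offset, length, next_position) for the element at buf[pos]."""
    tag = buf[pos]
    if tag & 0b11 != 0b01:
        raise ValueError("not a 1-byte-offset copy tag")
    length = ((tag >> 2) & 0b111) + 4
    offset = ((tag >> 5) << 8) | buf[pos + 1]
    return offset, length, pos + 2


print(encode_copy1(offset=2, length=4).hex())  # 0102
```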
+2.2.2. Copy with 2-byte offset (10)
+These elements can encode lengths between [1..64] and offsets from
+[0..65535]. (len-1) occupies six bits and is stored in the upper
+six bits ([2..7]) of the tag byte. The offset is stored as a
+little-endian 16-bit integer in the two bytes following the tag byte.
+2.2.3. Copy with 4-byte offset (11)
+These are like the copies with 2-byte offsets (see previous subsection),
+except that the offset is stored as a 32-bit integer instead of a
+16-bit integer (and thus will occupy four bytes).
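Decoding the two wider copy forms differs only in the width of the offset field, so one routine can cover both. The Python sketch below is illustrative; `decode_wide_copy` is a hypothetical name, and it assumes the 32-bit offset is little-endian like the 16-bit one, which the text above implies but does not state explicitly.

```python
def decode_wide_copy(buf: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Decode a copy with a 2-byte (10) or 4-byte (11) offset.

    Returns (offset, length, next_position).
    """
    tag = buf[pos]
    kind = tag & 0b11
    if kind == 0b10:
        width = 2
    elif kind == 0b11:
        width = 4
    else:
        raise ValueError("not a 2-byte or 4-byte offset copy tag")
    end = pos + 1 + width
    if end > len(buf):
        raise ValueError("truncated copy element")
    length = (tag >> 2) + 1                           # upper six bits hold (len-1)
    offset = int.from_bytes(buf[pos + 1:end], "little")
    return offset, length, end


# Tag 0xFE: type 10, (len-1) = 63; the offset bytes 34 12 read as 0x1234.
print(decode_wide_copy(bytes([0xFE, 0x34, 0x12])))  # (4660, 64, 3)
```

A full decoder would additionally reject an offset of zero or one that exceeds the amount of data decoded so far, as section 2.2 requires.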