snappy 0.0.10-java → 0.0.11-java

Files changed (47)
  1. checksums.yaml +4 -4
  2. data/.gitmodules +3 -0
  3. data/Rakefile +12 -13
  4. data/ext/extconf.rb +22 -31
  5. data/lib/snappy/reader.rb +10 -7
  6. data/lib/snappy/version.rb +1 -1
  7. data/snappy.gemspec +24 -0
  8. data/test/test-snappy-reader.rb +16 -0
  9. data/vendor/snappy/AUTHORS +1 -0
  10. data/vendor/snappy/COPYING +54 -0
  11. data/vendor/snappy/ChangeLog +1916 -0
  12. data/vendor/snappy/Makefile.am +23 -0
  13. data/vendor/snappy/NEWS +128 -0
  14. data/vendor/snappy/README +135 -0
  15. data/vendor/snappy/autogen.sh +7 -0
  16. data/vendor/snappy/configure.ac +133 -0
  17. data/vendor/snappy/format_description.txt +110 -0
  18. data/vendor/snappy/framing_format.txt +135 -0
  19. data/vendor/snappy/m4/gtest.m4 +74 -0
  20. data/vendor/snappy/snappy-c.cc +90 -0
  21. data/vendor/snappy/snappy-c.h +138 -0
  22. data/vendor/snappy/snappy-internal.h +150 -0
  23. data/vendor/snappy/snappy-sinksource.cc +71 -0
  24. data/vendor/snappy/snappy-sinksource.h +137 -0
  25. data/vendor/snappy/snappy-stubs-internal.cc +42 -0
  26. data/vendor/snappy/snappy-stubs-internal.h +491 -0
  27. data/vendor/snappy/snappy-stubs-public.h.in +98 -0
  28. data/vendor/snappy/snappy-test.cc +606 -0
  29. data/vendor/snappy/snappy-test.h +582 -0
  30. data/vendor/snappy/snappy.cc +1306 -0
  31. data/vendor/snappy/snappy.h +184 -0
  32. data/vendor/snappy/snappy_unittest.cc +1355 -0
  33. data/vendor/snappy/testdata/alice29.txt +3609 -0
  34. data/vendor/snappy/testdata/asyoulik.txt +4122 -0
  35. data/vendor/snappy/testdata/baddata1.snappy +0 -0
  36. data/vendor/snappy/testdata/baddata2.snappy +0 -0
  37. data/vendor/snappy/testdata/baddata3.snappy +0 -0
  38. data/vendor/snappy/testdata/fireworks.jpeg +0 -0
  39. data/vendor/snappy/testdata/geo.protodata +0 -0
  40. data/vendor/snappy/testdata/html +1 -0
  41. data/vendor/snappy/testdata/html_x_4 +1 -0
  42. data/vendor/snappy/testdata/kppkn.gtb +0 -0
  43. data/vendor/snappy/testdata/lcet10.txt +7519 -0
  44. data/vendor/snappy/testdata/paper-100k.pdf +600 -2
  45. data/vendor/snappy/testdata/plrabn12.txt +10699 -0
  46. data/vendor/snappy/testdata/urls.10K +10000 -0
  47. metadata +57 -18
@@ -0,0 +1,135 @@
1
+ Snappy framing format description
2
+ Last revised: 2013-10-25
3
+
4
+ This document describes a framing format for Snappy, allowing compressing to
5
+ files or streams that can then more easily be decompressed without having
6
+ to hold the entire stream in memory. It also provides data checksums to
7
+ help verify integrity. It does not provide metadata checksums, so it does
8
+ not protect against e.g. all forms of truncations.
9
+
10
+ Implementation of the framing format is optional for Snappy compressors and
11
+ decompressors; it is not part of the Snappy core specification.
12
+
13
+
14
+ 1. General structure
15
+
16
+ The file consists solely of chunks, lying back-to-back with no padding
17
+ in between. Each chunk consists first of a single byte of chunk identifier,
18
+ then a three-byte little-endian length of the chunk in bytes (from 0 to
19
+ 16777215, inclusive), and then the data if any. The four bytes of chunk
20
+ header are not counted in the data length.
21
+
22
+ The different chunk types are listed below. The first chunk must always
23
+ be the stream identifier chunk (see section 4.1, below). The stream
24
+ ends when the file ends -- there is no explicit end-of-file marker.
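The header layout in section 1 can be sketched in C as follows. This is a minimal illustration, not code from the Snappy sources: the `chunk_header` type and the `parse_chunk_header` function are hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type; not part of the Snappy sources. */
typedef struct {
    uint8_t  type;    /* chunk identifier byte */
    uint32_t length;  /* data length in bytes, 0..16777215, header excluded */
} chunk_header;

/* Decodes the four-byte chunk header at buf.
 * Returns 0 on success, or -1 if fewer than four bytes are available. */
int parse_chunk_header(const uint8_t *buf, size_t avail, chunk_header *out) {
    if (avail < 4) {
        return -1;
    }
    out->type = buf[0];
    /* Three-byte little-endian length: least significant byte first. */
    out->length = (uint32_t)buf[1]
                | ((uint32_t)buf[2] << 8)
                | ((uint32_t)buf[3] << 16);
    return 0;
}
```

A reader would call this at each chunk boundary and then consume `length` further bytes as the chunk data. Because there is no end-of-file marker, running out of input exactly at a chunk boundary is the normal way for the stream to end.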
25
+
26
+
27
+ 2. File type identification
28
+
29
+ The following identifiers for this format are recommended where appropriate.
30
+ However, note that none have been registered officially, so this is only to
31
+ be taken as a guideline. We use "Snappy framed" to distinguish between this
32
+ format and raw Snappy data.
33
+
34
+ File extension: .sz
35
+ MIME type: application/x-snappy-framed
36
+ HTTP Content-Encoding: x-snappy-framed
37
+
38
+
39
+ 3. Checksum format
40
+
41
+ Some chunks have data protected by a checksum (the ones that do will say so
42
+ explicitly). The checksums are always masked CRC-32Cs.
43
+
44
+ A description of CRC-32C can be found in RFC 3720, section 12.1, with
45
+ examples in section B.4.
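For orientation, CRC-32C can be computed with a simple bit-at-a-time loop. The sketch below assumes the usual reflected form of the Castagnoli polynomial (0x82f63b78) with an all-ones initial value and final inversion. The function name `crc32c` is chosen here for illustration, and real implementations typically use lookup tables or hardware instructions instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected form. Slow but easy to check. */
uint32_t crc32c(const uint8_t *data, size_t len) {
    uint32_t crc = 0xffffffffu;              /* initial value: all ones */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0x82f63b78u;  /* reflected polynomial */
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xffffffffu;                /* final inversion */
}
```

The commonly quoted check value for the nine ASCII bytes "123456789" is 0xe3069283, which makes a convenient self-test before wiring the function into a framing reader or writer.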
46
+
47
+ Checksums are not stored directly, but masked, as checksumming data and
48
+ then its own checksum can be problematic. The masking is the same as used
49
+ in Apache Hadoop: Rotate the checksum by 15 bits, then add the constant
50
+ 0xa282ead8 (using wraparound as normal for unsigned integers). This is
51
+ equivalent to the following C code:
52
+
53
+   uint32_t mask_checksum(uint32_t x) {
54
+     return ((x >> 15) | (x << 17)) + 0xa282ead8;
55
+   }
56
+
57
+ Note that the masking is reversible.
58
+
59
+ The checksum is always stored as a four-byte integer, in little-endian order.
60
+
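To illustrate that the masking in section 3 is reversible, here is a small sketch that pairs `mask_checksum` from the text with an inverse. The names `unmask_checksum` and `store_le32` are assumptions made for this example and do not come from the specification.

```c
#include <stdint.h>

/* Masking as given in the specification: rotate right by 15, add constant. */
uint32_t mask_checksum(uint32_t x) {
    return ((x >> 15) | (x << 17)) + 0xa282ead8u;
}

/* Hypothetical inverse: subtract the constant, then rotate left by 15. */
uint32_t unmask_checksum(uint32_t masked) {
    uint32_t rot = masked - 0xa282ead8u;   /* wraps around like the addition */
    return (rot << 15) | (rot >> 17);
}

/* Writes v as four bytes, least significant byte first. */
void store_le32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v & 0xffu);
    out[1] = (uint8_t)((v >> 8) & 0xffu);
    out[2] = (uint8_t)((v >> 16) & 0xffu);
    out[3] = (uint8_t)((v >> 24) & 0xffu);
}
```

A writer would compute the CRC-32C of the uncompressed data, pass it through `mask_checksum`, and emit the result with `store_le32`. A reader can either unmask the stored value or mask its own computed CRC before comparing; both approaches are equivalent.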
61
+
62
+ 4. Chunk types
63
+
64
+ The currently supported chunk types are described below. The list may
65
+ be extended in the future.
66
+
67
+
68
+ 4.1. Stream identifier (chunk type 0xff)
69
+
70
+ The stream identifier is always the first element in the stream.
71
+ It is exactly six bytes long and contains "sNaPpY" in ASCII. This means that
72
+ a valid Snappy framed stream always starts with the bytes
73
+
74
+ 0xff 0x06 0x00 0x00 0x73 0x4e 0x61 0x50 0x70 0x59
75
+
76
+ The stream identifier chunk can come multiple times in the stream besides
77
+ the first; if such a chunk shows up, it should simply be ignored, assuming
78
+ it has the right length and contents. This allows for easy concatenation of
79
+ compressed files without the need for re-framing.
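The ten-byte prefix from section 4.1 lends itself to a simple signature check. The following is a sketch only: `kStreamIdentifier` and `starts_with_stream_identifier` are hypothetical names, not identifiers from the Snappy sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Chunk header (type 0xff, length 6) followed by "sNaPpY" in ASCII. */
static const uint8_t kStreamIdentifier[10] = {
    0xff, 0x06, 0x00, 0x00, 0x73, 0x4e, 0x61, 0x50, 0x70, 0x59
};

/* Returns 1 if buf begins with the stream identifier chunk, 0 otherwise. */
int starts_with_stream_identifier(const uint8_t *buf, size_t len) {
    return len >= sizeof(kStreamIdentifier) &&
           memcmp(buf, kStreamIdentifier, sizeof(kStreamIdentifier)) == 0;
}
```

The same comparison can be applied to any later 0xff chunk: if the length and contents match, the chunk is skipped, which is what makes plain concatenation of framed files work.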
80
+
81
+
82
+ 4.2. Compressed data (chunk type 0x00)
83
+
84
+ Compressed data chunks contain a normal Snappy compressed bitstream;
85
+ see the compressed format specification. The compressed data is preceded by
86
+ the CRC-32C (see section 3) of the _uncompressed_ data.
87
+
88
+ Note that the data portion of the chunk, i.e., the compressed contents,
89
+ can be at most 16777211 bytes (2^24 - 1, minus the checksum).
90
+ However, we place an additional restriction that the uncompressed data
91
+ in a chunk must be no longer than 65536 bytes. This allows consumers to
92
+ easily use small fixed-size buffers.
93
+
94
+
95
+ 4.3. Uncompressed data (chunk type 0x01)
96
+
97
+ Uncompressed data chunks allow a compressor to send uncompressed,
98
+ raw data; this is useful if, for instance, incompressible or
99
+ near-incompressible data is detected, and faster decompression is desired.
100
+
101
+ As in the compressed chunks, the data is preceded by its own masked
102
+ CRC-32C (see section 3).
103
+
104
+ An uncompressed data chunk, like compressed data chunks, should contain
105
+ no more than 65536 data bytes, so the maximum legal chunk length with the
106
+ checksum is 65540.
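The size limit for uncompressed data chunks in section 4.3 can be checked before any data is read. This is a hedged sketch: the function name and the choice to reject chunks shorter than the four-byte checksum are assumptions of this example, not requirements quoted from the text.

```c
#include <stdint.h>

/* Largest uncompressed payload allowed in one chunk. */
#define MAX_UNCOMPRESSED_PAYLOAD 65536u
/* Size of the masked CRC-32C that precedes the payload. */
#define CHECKSUM_SIZE 4u

/* Returns 1 if chunk_length is plausible for a type 0x01 chunk:
 * it must hold the checksum plus at most 65536 data bytes. */
int uncompressed_chunk_length_ok(uint32_t chunk_length) {
    if (chunk_length < CHECKSUM_SIZE) {
        return 0;  /* assumption: no room for the checksum means invalid */
    }
    return (chunk_length - CHECKSUM_SIZE) <= MAX_UNCOMPRESSED_PAYLOAD;
}
```

With this check in place a reader can keep a fixed 65540-byte buffer for type 0x01 chunks and reject anything larger up front.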
107
+
108
+
109
+ 4.4. Padding (chunk type 0xfe)
110
+
111
+ Padding chunks allow a compressor to increase the size of the data stream
112
+ so that it complies with external demands, e.g. that the total number of
113
+ bytes is a multiple of some value.
114
+
115
+ All bytes of the padding chunk, except the chunk byte itself and the length,
116
+ should be zero, but decompressors must not try to interpret or verify the
117
+ padding data in any way.
118
+
119
+
120
+ 4.5. Reserved unskippable chunks (chunk types 0x02-0x7f)
121
+
122
+ These are reserved for future expansion. A decoder that sees such a chunk
123
+ should immediately return an error, as it must assume it cannot decode the
124
+ stream correctly.
125
+
126
+ Future versions of this specification may define meanings for these chunks.
127
+
128
+
129
+ 4.6. Reserved skippable chunks (chunk types 0x80-0xfd)
130
+
131
+ These are also reserved for future expansion, but unlike the chunks
132
+ described in 4.5, a decoder seeing these must skip them and continue
133
+ decoding.
134
+
135
+ Future versions of this specification may define meanings for these chunks.
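The chunk type ranges from sections 4.1 through 4.6 can be summarized as a small classifier. The enum and function names below are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Hypothetical classification of chunk identifier bytes. */
typedef enum {
    CHUNK_COMPRESSED,            /* 0x00 */
    CHUNK_UNCOMPRESSED,          /* 0x01 */
    CHUNK_RESERVED_UNSKIPPABLE,  /* 0x02-0x7f: decoder must report an error */
    CHUNK_RESERVED_SKIPPABLE,    /* 0x80-0xfd: decoder must skip the chunk */
    CHUNK_PADDING,               /* 0xfe */
    CHUNK_STREAM_IDENTIFIER      /* 0xff */
} chunk_kind;

chunk_kind classify_chunk(uint8_t type) {
    if (type == 0x00) return CHUNK_COMPRESSED;
    if (type == 0x01) return CHUNK_UNCOMPRESSED;
    if (type <= 0x7f) return CHUNK_RESERVED_UNSKIPPABLE;
    if (type <= 0xfd) return CHUNK_RESERVED_SKIPPABLE;
    if (type == 0xfe) return CHUNK_PADDING;
    return CHUNK_STREAM_IDENTIFIER;
}
```

A decoder loop would typically switch on the result: decode the two data kinds, verify and skip a repeated stream identifier, skip padding and skippable chunks without inspecting them, and fail on an unskippable reserved type.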
@@ -0,0 +1,74 @@
1
+ dnl GTEST_LIB_CHECK([minimum version [,
2
+ dnl action if found [,action if not found]]])
3
+ dnl
4
+ dnl Check for the presence of the Google Test library, optionally at a minimum
5
+ dnl version, and indicate a viable version with the HAVE_GTEST flag. It defines
6
+ dnl standard variables for substitution including GTEST_CPPFLAGS,
7
+ dnl GTEST_CXXFLAGS, GTEST_LDFLAGS, and GTEST_LIBS. It also defines
8
+ dnl GTEST_VERSION as the version of Google Test found. Finally, it provides
9
+ dnl optional custom action slots in the event GTEST is found or not.
10
+ AC_DEFUN([GTEST_LIB_CHECK],
11
+ [
12
+ dnl Provide a flag to enable or disable Google Test usage.
13
+ AC_ARG_ENABLE([gtest],
14
+ [AS_HELP_STRING([--enable-gtest],
15
+ [Enable tests using the Google C++ Testing Framework.
16
+ (Default is enabled.)])],
17
+ [],
18
+ [enable_gtest=])
19
+ AC_ARG_VAR([GTEST_CONFIG],
20
+ [The exact path of Google Test's 'gtest-config' script.])
21
+ AC_ARG_VAR([GTEST_CPPFLAGS],
22
+ [C-like preprocessor flags for Google Test.])
23
+ AC_ARG_VAR([GTEST_CXXFLAGS],
24
+ [C++ compile flags for Google Test.])
25
+ AC_ARG_VAR([GTEST_LDFLAGS],
26
+ [Linker path and option flags for Google Test.])
27
+ AC_ARG_VAR([GTEST_LIBS],
28
+ [Library linking flags for Google Test.])
29
+ AC_ARG_VAR([GTEST_VERSION],
30
+ [The version of Google Test available.])
31
+ HAVE_GTEST="no"
32
+ AS_IF([test "x${enable_gtest}" != "xno"],
33
+ [AC_MSG_CHECKING([for 'gtest-config'])
34
+ AS_IF([test "x${enable_gtest}" = "xyes"],
35
+ [AS_IF([test -x "${enable_gtest}/scripts/gtest-config"],
36
+ [GTEST_CONFIG="${enable_gtest}/scripts/gtest-config"],
37
+ [GTEST_CONFIG="${enable_gtest}/bin/gtest-config"])
38
+ AS_IF([test -x "${GTEST_CONFIG}"], [],
39
+ [AC_MSG_RESULT([no])
40
+ AC_MSG_ERROR([dnl
41
+ Unable to locate either a built or installed Google Test.
42
+ The specific location '${enable_gtest}' was provided for a built or installed
43
+ Google Test, but no 'gtest-config' script could be found at this location.])
44
+ ])],
45
+ [AC_PATH_PROG([GTEST_CONFIG], [gtest-config])])
46
+ AS_IF([test -x "${GTEST_CONFIG}"],
47
+ [AC_MSG_RESULT([${GTEST_CONFIG}])
48
+ m4_ifval([$1],
49
+ [_gtest_min_version="--min-version=$1"
50
+ AC_MSG_CHECKING([for Google Test at least version >= $1])],
51
+ [_gtest_min_version="--min-version=0"
52
+ AC_MSG_CHECKING([for Google Test])])
53
+ AS_IF([${GTEST_CONFIG} ${_gtest_min_version}],
54
+ [AC_MSG_RESULT([yes])
55
+ HAVE_GTEST='yes'],
56
+ [AC_MSG_RESULT([no])])],
57
+ [AC_MSG_RESULT([no])])
58
+ AS_IF([test "x${HAVE_GTEST}" = "xyes"],
59
+ [GTEST_CPPFLAGS=`${GTEST_CONFIG} --cppflags`
60
+ GTEST_CXXFLAGS=`${GTEST_CONFIG} --cxxflags`
61
+ GTEST_LDFLAGS=`${GTEST_CONFIG} --ldflags`
62
+ GTEST_LIBS=`${GTEST_CONFIG} --libs`
63
+ GTEST_VERSION=`${GTEST_CONFIG} --version`
64
+ AC_DEFINE([HAVE_GTEST],[1],[Defined when Google Test is available.])],
65
+ [AS_IF([test "x${enable_gtest}" = "xyes"],
66
+ [AC_MSG_ERROR([dnl
67
+ Google Test was enabled, but no viable version could be found.])
68
+ ])])])
69
+ AC_SUBST([HAVE_GTEST])
70
+ AM_CONDITIONAL([HAVE_GTEST],[test "x$HAVE_GTEST" = "xyes"])
71
+ AS_IF([test "x$HAVE_GTEST" = "xyes"],
72
+ [m4_ifval([$2], [$2])],
73
+ [m4_ifval([$3], [$3])])
74
+ ])
@@ -0,0 +1,90 @@
1
+ // Copyright 2011 Martin Gieseking <martin.gieseking@uos.de>.
2
+ //
3
+ // Redistribution and use in source and binary forms, with or without
4
+ // modification, are permitted provided that the following conditions are
5
+ // met:
6
+ //
7
+ // * Redistributions of source code must retain the above copyright
8
+ // notice, this list of conditions and the following disclaimer.
9
+ // * Redistributions in binary form must reproduce the above
10
+ // copyright notice, this list of conditions and the following disclaimer
11
+ // in the documentation and/or other materials provided with the
12
+ // distribution.
13
+ // * Neither the name of Google Inc. nor the names of its
14
+ // contributors may be used to endorse or promote products derived from
15
+ // this software without specific prior written permission.
16
+ //
17
+ // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
18
+ // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
19
+ // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
20
+ // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
21
+ // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
22
+ // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
23
+ // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
24
+ // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
25
+ // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
26
+ // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
27
+ // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
28
+
29
+ #include "snappy.h"
30
+ #include "snappy-c.h"
31
+
32
+ extern "C" {
33
+
34
+ snappy_status snappy_compress(const char* input,
35
+ size_t input_length,
36
+ char* compressed,
37
+ size_t *compressed_length) {
38
+ if (*compressed_length < snappy_max_compressed_length(input_length)) {
39
+ return SNAPPY_BUFFER_TOO_SMALL;
40
+ }
41
+ snappy::RawCompress(input, input_length, compressed, compressed_length);
42
+ return SNAPPY_OK;
43
+ }
44
+
45
+ snappy_status snappy_uncompress(const char* compressed,
46
+ size_t compressed_length,
47
+ char* uncompressed,
48
+ size_t* uncompressed_length) {
49
+ size_t real_uncompressed_length;
50
+ if (!snappy::GetUncompressedLength(compressed,
51
+ compressed_length,
52
+ &real_uncompressed_length)) {
53
+ return SNAPPY_INVALID_INPUT;
54
+ }
55
+ if (*uncompressed_length < real_uncompressed_length) {
56
+ return SNAPPY_BUFFER_TOO_SMALL;
57
+ }
58
+ if (!snappy::RawUncompress(compressed, compressed_length, uncompressed)) {
59
+ return SNAPPY_INVALID_INPUT;
60
+ }
61
+ *uncompressed_length = real_uncompressed_length;
62
+ return SNAPPY_OK;
63
+ }
64
+
65
+ size_t snappy_max_compressed_length(size_t source_length) {
66
+ return snappy::MaxCompressedLength(source_length);
67
+ }
68
+
69
+ snappy_status snappy_uncompressed_length(const char *compressed,
70
+ size_t compressed_length,
71
+ size_t *result) {
72
+ if (snappy::GetUncompressedLength(compressed,
73
+ compressed_length,
74
+ result)) {
75
+ return SNAPPY_OK;
76
+ } else {
77
+ return SNAPPY_INVALID_INPUT;
78
+ }
79
+ }
80
+
81
+ snappy_status snappy_validate_compressed_buffer(const char *compressed,
82
+ size_t compressed_length) {
83
+ if (snappy::IsValidCompressedBuffer(compressed, compressed_length)) {
84
+ return SNAPPY_OK;
85
+ } else {
86
+ return SNAPPY_INVALID_INPUT;
87
+ }
88
+ }
89
+
90
+ } // extern "C"
@@ -0,0 +1,138 @@
1
+ /*
2
+ * Copyright 2011 Martin Gieseking <martin.gieseking@uos.de>.
3
+ *
4
+ * Redistribution and use in source and binary forms, with or without
5
+ * modification, are permitted provided that the following conditions are
6
+ * met:
7
+ *
8
+ * * Redistributions of source code must retain the above copyright
9
+ * notice, this list of conditions and the following disclaimer.
10
+ * * Redistributions in binary form must reproduce the above
11
+ * copyright notice, this list of conditions and the following disclaimer
12
+ * in the documentation and/or other materials provided with the
13
+ * distribution.
14
+ * * Neither the name of Google Inc. nor the names of its
15
+ * contributors may be used to endorse or promote products derived from
16
+ * this software without specific prior written permission.
17
+ *
18
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
22
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
23
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
24
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
25
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
26
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
+ *
30
+ * Plain C interface (a wrapper around the C++ implementation).
31
+ */
32
+
33
+ #ifndef UTIL_SNAPPY_OPENSOURCE_SNAPPY_C_H_
34
+ #define UTIL_SNAPPY_OPENSOURCE_SNAPPY_C_H_
35
+
36
+ #ifdef __cplusplus
37
+ extern "C" {
38
+ #endif
39
+
40
+ #include <stddef.h>
41
+
42
+ /*
43
+ * Return values; see the documentation for each function to know
44
+ * what each can return.
45
+ */
46
+ typedef enum {
47
+ SNAPPY_OK = 0,
48
+ SNAPPY_INVALID_INPUT = 1,
49
+ SNAPPY_BUFFER_TOO_SMALL = 2
50
+ } snappy_status;
51
+
52
+ /*
53
+ * Takes the data stored in "input[0..input_length-1]" and stores
54
+ * it in the array pointed to by "compressed".
55
+ *
56
+ * <compressed_length> signals the space available in "compressed".
57
+ * If it is not at least equal to "snappy_max_compressed_length(input_length)",
58
+ * SNAPPY_BUFFER_TOO_SMALL is returned. After successful compression,
59
+ * <compressed_length> contains the true length of the compressed output,
60
+ * and SNAPPY_OK is returned.
61
+ *
62
+ * Example:
63
+ * size_t output_length = snappy_max_compressed_length(input_length);
64
+ * char* output = (char*)malloc(output_length);
65
+ * if (snappy_compress(input, input_length, output, &output_length)
66
+ * == SNAPPY_OK) {
67
+ * ... Process(output, output_length) ...
68
+ * }
69
+ * free(output);
70
+ */
71
+ snappy_status snappy_compress(const char* input,
72
+ size_t input_length,
73
+ char* compressed,
74
+ size_t* compressed_length);
75
+
76
+ /*
77
+ * Given data in "compressed[0..compressed_length-1]" generated by
78
+ * calling the snappy_compress routine, this routine stores
79
+ * the uncompressed data to
80
+ * uncompressed[0..uncompressed_length-1].
81
+ * Returns failure (a value not equal to SNAPPY_OK) if the message
82
+ * is corrupted and could not be decompressed.
83
+ *
84
+ * <uncompressed_length> signals the space available in "uncompressed".
85
+ * If it is not at least equal to the value returned by
86
+ * snappy_uncompressed_length for this stream, SNAPPY_BUFFER_TOO_SMALL
87
+ * is returned. After successful decompression, <uncompressed_length>
88
+ * contains the true length of the decompressed output.
89
+ *
90
+ * Example:
91
+ * size_t output_length;
92
+ * if (snappy_uncompressed_length(input, input_length, &output_length)
93
+ * != SNAPPY_OK) {
94
+ * ... fail ...
95
+ * }
96
+ * char* output = (char*)malloc(output_length);
97
+ * if (snappy_uncompress(input, input_length, output, &output_length)
98
+ * == SNAPPY_OK) {
99
+ * ... Process(output, output_length) ...
100
+ * }
101
+ * free(output);
102
+ */
103
+ snappy_status snappy_uncompress(const char* compressed,
104
+ size_t compressed_length,
105
+ char* uncompressed,
106
+ size_t* uncompressed_length);
107
+
108
+ /*
109
+ * Returns the maximal size of the compressed representation of
110
+ * input data that is "source_length" bytes in length.
111
+ */
112
+ size_t snappy_max_compressed_length(size_t source_length);
113
+
114
+ /*
115
+ * REQUIRES: "compressed[]" was produced by snappy_compress()
116
+ * Returns SNAPPY_OK and stores the length of the uncompressed data in
117
+ * *result normally. Returns SNAPPY_INVALID_INPUT on parsing error.
118
+ * This operation takes O(1) time.
119
+ */
120
+ snappy_status snappy_uncompressed_length(const char* compressed,
121
+ size_t compressed_length,
122
+ size_t* result);
123
+
124
+ /*
125
+ * Check if the contents of "compressed[]" can be uncompressed successfully.
126
+ * Returns SNAPPY_OK if so, or SNAPPY_INVALID_INPUT if not; the uncompressed
127
+ * data itself is not returned.
128
+ * Takes time proportional to compressed_length, but is usually at least a
129
+ * factor of four faster than actual decompression.
130
+ */
131
+ snappy_status snappy_validate_compressed_buffer(const char* compressed,
132
+ size_t compressed_length);
133
+
134
+ #ifdef __cplusplus
135
+ } // extern "C"
136
+ #endif
137
+
138
+ #endif /* UTIL_SNAPPY_OPENSOURCE_SNAPPY_C_H_ */
@@ -0,0 +1,150 @@
1
+ // Copyright 2008 Google Inc. All Rights Reserved.
2
+ //
3
+ // Redistribution and use in source and binary forms, with or without
4
+ // modification, are permitted provided that the following conditions are
5
+ // met:
6
+ //
7
+ // * Redistributions of source code must retain the above copyright
8
+ // notice, this list of conditions and the following disclaimer.
9
+ // * Redistributions in binary form must reproduce the above
10
+ // copyright notice, this list of conditions and the following disclaimer
11
+ // in the documentation and/or other materials provided with the
12
+ // distribution.
13
+ // * Neither the name of Google Inc. nor the names of its
14
+ // contributors may be used to endorse or promote products derived from
15
+ // this software without specific prior written permission.
16
+ //
17
+ // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
18
+ // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
19
+ // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
20
+ // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
21
+ // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
22
+ // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
23
+ // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
24
+ // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
25
+ // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
26
+ // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
27
+ // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
28
+ //
29
+ // Internals shared between the Snappy implementation and its unittest.
30
+
31
+ #ifndef UTIL_SNAPPY_SNAPPY_INTERNAL_H_
32
+ #define UTIL_SNAPPY_SNAPPY_INTERNAL_H_
33
+
34
+ #include "snappy-stubs-internal.h"
35
+
36
+ namespace snappy {
37
+ namespace internal {
38
+
39
+ class WorkingMemory {
40
+ public:
41
+ WorkingMemory() : large_table_(NULL) { }
42
+ ~WorkingMemory() { delete[] large_table_; }
43
+
44
+ // Allocates and clears a hash table using memory in "*this",
45
+ // stores the number of buckets in "*table_size" and returns a pointer to
46
+ // the base of the hash table.
47
+ uint16* GetHashTable(size_t input_size, int* table_size);
48
+
49
+ private:
50
+ uint16 small_table_[1<<10]; // 2KB
51
+ uint16* large_table_; // Allocated only when needed
52
+
53
+ DISALLOW_COPY_AND_ASSIGN(WorkingMemory);
54
+ };
55
+
56
+ // Flat array compression that does not emit the "uncompressed length"
57
+ // prefix. Compresses "input" string to the "*op" buffer.
58
+ //
59
+ // REQUIRES: "input_length <= kBlockSize"
60
+ // REQUIRES: "op" points to an array of memory that is at least
61
+ // "MaxCompressedLength(input_length)" in size.
62
+ // REQUIRES: All elements in "table[0..table_size-1]" are initialized to zero.
63
+ // REQUIRES: "table_size" is a power of two
64
+ //
65
+ // Returns an "end" pointer into "op" buffer.
66
+ // "end - op" is the compressed size of "input".
67
+ char* CompressFragment(const char* input,
68
+ size_t input_length,
69
+ char* op,
70
+ uint16* table,
71
+ const int table_size);
72
+
73
+ // Return the largest n such that
74
+ //
75
+ // s1[0,n-1] == s2[0,n-1]
76
+ // and n <= (s2_limit - s2).
77
+ //
78
+ // Does not read *s2_limit or beyond.
79
+ // Does not read *(s1 + (s2_limit - s2)) or beyond.
80
+ // Requires that s2_limit >= s2.
81
+ //
82
+ // Separate implementation for x86_64, for speed. Uses the fact that
83
+ // x86_64 is little endian.
84
+ #if defined(ARCH_K8)
85
+ static inline int FindMatchLength(const char* s1,
86
+ const char* s2,
87
+ const char* s2_limit) {
88
+ assert(s2_limit >= s2);
89
+ int matched = 0;
90
+
91
+ // Find out how long the match is. We loop over the data 64 bits at a
92
+ // time until we find a 64-bit block that doesn't match; then we find
93
+ // the first non-matching bit and use that to calculate the total
94
+ // length of the match.
95
+ while (PREDICT_TRUE(s2 <= s2_limit - 8)) {
96
+ if (PREDICT_FALSE(UNALIGNED_LOAD64(s2) == UNALIGNED_LOAD64(s1 + matched))) {
97
+ s2 += 8;
98
+ matched += 8;
99
+ } else {
100
+ // On current (mid-2008) Opteron models there is a 3% more
101
+ // efficient code sequence to find the first non-matching byte.
102
+ // However, what follows is ~10% better on Intel Core 2 and newer,
103
+ // and we expect AMD's bsf instruction to improve.
104
+ uint64 x = UNALIGNED_LOAD64(s2) ^ UNALIGNED_LOAD64(s1 + matched);
105
+ int matching_bits = Bits::FindLSBSetNonZero64(x);
106
+ matched += matching_bits >> 3;
107
+ return matched;
108
+ }
109
+ }
110
+ while (PREDICT_TRUE(s2 < s2_limit)) {
111
+ if (PREDICT_TRUE(s1[matched] == *s2)) {
112
+ ++s2;
113
+ ++matched;
114
+ } else {
115
+ return matched;
116
+ }
117
+ }
118
+ return matched;
119
+ }
120
+ #else
121
+ static inline int FindMatchLength(const char* s1,
122
+ const char* s2,
123
+ const char* s2_limit) {
124
+ // Implementation based on the x86-64 version, above.
125
+ assert(s2_limit >= s2);
126
+ int matched = 0;
127
+
128
+ while (s2 <= s2_limit - 4 &&
129
+ UNALIGNED_LOAD32(s2) == UNALIGNED_LOAD32(s1 + matched)) {
130
+ s2 += 4;
131
+ matched += 4;
132
+ }
133
+ if (LittleEndian::IsLittleEndian() && s2 <= s2_limit - 4) {
134
+ uint32 x = UNALIGNED_LOAD32(s2) ^ UNALIGNED_LOAD32(s1 + matched);
135
+ int matching_bits = Bits::FindLSBSetNonZero(x);
136
+ matched += matching_bits >> 3;
137
+ } else {
138
+ while ((s2 < s2_limit) && (s1[matched] == *s2)) {
139
+ ++s2;
140
+ ++matched;
141
+ }
142
+ }
143
+ return matched;
144
+ }
145
+ #endif
146
+
147
+ } // end namespace internal
148
+ } // end namespace snappy
149
+
150
+ #endif // UTIL_SNAPPY_SNAPPY_INTERNAL_H_
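The contract documented for FindMatchLength in snappy-internal.h (the largest n with matching prefixes, never reading at or past s2_limit) can be restated as a plain byte-at-a-time loop. This is a reference sketch in C under that reading of the comment, not the vendored implementation, which compares 4 or 8 bytes at a time. The name `find_match_length` is chosen here for illustration.

```c
#include <stddef.h>

/* Returns the number of leading bytes on which s1 and s2 agree,
 * bounded by (s2_limit - s2). Requires s2_limit >= s2. */
size_t find_match_length(const char *s1, const char *s2,
                         const char *s2_limit) {
    size_t matched = 0;
    /* Stop at the first differing byte or when s2 reaches its limit. */
    while (s2 < s2_limit && s1[matched] == *s2) {
        ++s2;
        ++matched;
    }
    return matched;
}
```

A slow version like this is mainly useful as an oracle when testing a word-at-a-time variant, since both should return the same count for every input.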