extlzham 0.0.1.PROTOTYPE

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (59)
  1. checksums.yaml +7 -0
  2. data/LICENSE.md +27 -0
  3. data/README.md +21 -0
  4. data/Rakefile +143 -0
  5. data/contrib/lzham/LICENSE +22 -0
  6. data/contrib/lzham/README.md +209 -0
  7. data/contrib/lzham/include/lzham.h +781 -0
  8. data/contrib/lzham/lzhamcomp/lzham_comp.h +38 -0
  9. data/contrib/lzham/lzhamcomp/lzham_lzbase.cpp +244 -0
  10. data/contrib/lzham/lzhamcomp/lzham_lzbase.h +45 -0
  11. data/contrib/lzham/lzhamcomp/lzham_lzcomp.cpp +608 -0
  12. data/contrib/lzham/lzhamcomp/lzham_lzcomp_internal.cpp +1966 -0
  13. data/contrib/lzham/lzhamcomp/lzham_lzcomp_internal.h +472 -0
  14. data/contrib/lzham/lzhamcomp/lzham_lzcomp_state.cpp +1413 -0
  15. data/contrib/lzham/lzhamcomp/lzham_match_accel.cpp +562 -0
  16. data/contrib/lzham/lzhamcomp/lzham_match_accel.h +146 -0
  17. data/contrib/lzham/lzhamcomp/lzham_null_threading.h +97 -0
  18. data/contrib/lzham/lzhamcomp/lzham_pthreads_threading.cpp +229 -0
  19. data/contrib/lzham/lzhamcomp/lzham_pthreads_threading.h +520 -0
  20. data/contrib/lzham/lzhamcomp/lzham_threading.h +12 -0
  21. data/contrib/lzham/lzhamcomp/lzham_win32_threading.cpp +220 -0
  22. data/contrib/lzham/lzhamcomp/lzham_win32_threading.h +368 -0
  23. data/contrib/lzham/lzhamdecomp/lzham_assert.cpp +66 -0
  24. data/contrib/lzham/lzhamdecomp/lzham_assert.h +40 -0
  25. data/contrib/lzham/lzhamdecomp/lzham_checksum.cpp +73 -0
  26. data/contrib/lzham/lzhamdecomp/lzham_checksum.h +13 -0
  27. data/contrib/lzham/lzhamdecomp/lzham_config.h +23 -0
  28. data/contrib/lzham/lzhamdecomp/lzham_core.h +264 -0
  29. data/contrib/lzham/lzhamdecomp/lzham_decomp.h +37 -0
  30. data/contrib/lzham/lzhamdecomp/lzham_helpers.h +54 -0
  31. data/contrib/lzham/lzhamdecomp/lzham_huffman_codes.cpp +262 -0
  32. data/contrib/lzham/lzhamdecomp/lzham_huffman_codes.h +14 -0
  33. data/contrib/lzham/lzhamdecomp/lzham_lzdecomp.cpp +1527 -0
  34. data/contrib/lzham/lzhamdecomp/lzham_lzdecompbase.cpp +131 -0
  35. data/contrib/lzham/lzhamdecomp/lzham_lzdecompbase.h +89 -0
  36. data/contrib/lzham/lzhamdecomp/lzham_math.h +142 -0
  37. data/contrib/lzham/lzhamdecomp/lzham_mem.cpp +284 -0
  38. data/contrib/lzham/lzhamdecomp/lzham_mem.h +112 -0
  39. data/contrib/lzham/lzhamdecomp/lzham_platform.cpp +157 -0
  40. data/contrib/lzham/lzhamdecomp/lzham_platform.h +284 -0
  41. data/contrib/lzham/lzhamdecomp/lzham_prefix_coding.cpp +351 -0
  42. data/contrib/lzham/lzhamdecomp/lzham_prefix_coding.h +146 -0
  43. data/contrib/lzham/lzhamdecomp/lzham_symbol_codec.cpp +1484 -0
  44. data/contrib/lzham/lzhamdecomp/lzham_symbol_codec.h +556 -0
  45. data/contrib/lzham/lzhamdecomp/lzham_timer.cpp +147 -0
  46. data/contrib/lzham/lzhamdecomp/lzham_timer.h +99 -0
  47. data/contrib/lzham/lzhamdecomp/lzham_traits.h +141 -0
  48. data/contrib/lzham/lzhamdecomp/lzham_types.h +97 -0
  49. data/contrib/lzham/lzhamdecomp/lzham_utils.h +58 -0
  50. data/contrib/lzham/lzhamdecomp/lzham_vector.cpp +75 -0
  51. data/contrib/lzham/lzhamdecomp/lzham_vector.h +588 -0
  52. data/contrib/lzham/lzhamlib/lzham_lib.cpp +179 -0
  53. data/examples/basic.rb +48 -0
  54. data/ext/extconf.rb +26 -0
  55. data/ext/extlzham.c +741 -0
  56. data/gemstub.rb +22 -0
  57. data/lib/extlzham/version.rb +5 -0
  58. data/lib/extlzham.rb +153 -0
  59. metadata +135 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 252b541fe39b87a810a1c32faf9f7a2f2ad57b11
+   data.tar.gz: f27f740cb6f08a39837c8fea305da68c63f784d7
+ SHA512:
+   metadata.gz: 780c3b5c764257695c861a72601c1d74ea92aa68e79059f4bd0d4a8c51dbe68cd978fe9a8d74e4fed9b28e1132d34a7c8457e553b5dc41aa124ff87050da084f
+   data.tar.gz: 82704ccee75a3a81480e328c29348fa71d27a9a3bc38c38d9c3ebc365064cda6f212dfe258b4731faf422e349324eb8e6c634e4247bb3378e3ea3b330f3e9533
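For reference, the digests above can be re-checked by hand. The sketch below is editorial (not part of the package): it assumes the gem archive has been unpacked with `tar -xf extlzham-0.0.1.PROTOTYPE.gem`, which leaves `metadata.gz`, `data.tar.gz`, and a gzipped `checksums.yaml.gz` in the current directory.

```ruby
# Recompute the SHA1/SHA512 digests recorded in checksums.yaml and compare
# them with the files inside the unpacked .gem archive.
require "digest/sha1"
require "digest/sha2"
require "yaml"
require "zlib"

ALGOS = { "SHA1" => Digest::SHA1, "SHA512" => Digest::SHA512 }

yaml_text = nil
Zlib::GzipReader.open("checksums.yaml.gz") { |gz| yaml_text = gz.read }

YAML.load(yaml_text).each do |algo, files|   # e.g. "SHA1" => { "metadata.gz" => "..." }
  files.each do |name, expected|
    actual = ALGOS.fetch(algo).file(name).hexdigest
    puts "#{algo} #{name}: #{actual == expected ? 'OK' : 'MISMATCH'}"
  end
end
```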
data/LICENSE.md ADDED
@@ -0,0 +1,27 @@
+ extlzham is under The BSD 2-Clause License.
+
+
+ Copyright (c) 2015, dearblue. All rights reserved.
+
+ Redistribution and use in source and binary forms, with or
+ without modification, are permitted provided that the following
+ conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in
+ the documentation and/or other materials provided with the
+ distribution.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md ADDED
@@ -0,0 +1,21 @@
+ # encoding:utf-8 ;
+
+ # extlzham - Ruby binding for LZHAM
+
+ This is a Ruby binding for the compression/decompression library
+ [LZHAM (https://github.com/richgel999/lzham\_codec)](https://github.com/richgel999/lzham_codec).
+
+ * PACKAGE NAME: extlzham
+ * AUTHOR: dearblue <dearblue@users.sourceforge.jp>
+ * VERSION: 0.0.1.PROTOTYPE
+ * LICENSING: 2-clause BSD License
+ * REPORT ISSUE TO: <http://sourceforge.jp/projects/rutsubo/ticket/>
+ * DEPENDENCY RUBY: ruby-2.0+
+ * DEPENDENCY RUBY GEMS: (none)
+ * DEPENDENCY LIBRARY: (none)
+ * BUNDLED EXTERNAL LIBRARIES:
+   * LZHAM: https://github.com/richgel999/lzham\_codec
+
+ ----
+
+ [a stub]
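Since the gem's own README stops at a stub, here is a minimal editorial usage sketch. It is not taken from this package, and the module-level `LZHAM.encode` / `LZHAM.decode` calls are assumptions about the binding's interface (the real API lives in `data/lib/extlzham.rb` and `data/ext/extlzham.c`, which are not reproduced in this excerpt); treat the names as placeholders and consult `data/examples/basic.rb`.

```ruby
# Hypothetical round trip through the extlzham binding.
# LZHAM.encode / LZHAM.decode are assumed names, not confirmed by this diff.
require "extlzham"

source = File.binread("source.bin")

compressed = LZHAM.encode(source)       # one-shot compression (assumed API)
restored   = LZHAM.decode(compressed)   # one-shot decompression (assumed API)

raise "round trip failed" unless restored == source
printf("%d -> %d bytes (%.1f%%)\n",
       source.bytesize, compressed.bytesize,
       100.0 * compressed.bytesize / source.bytesize)
```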
data/Rakefile ADDED
@@ -0,0 +1,143 @@
+
+ require "rake/clean"
+
+ DOC = FileList["{README,LICENSE,CHANGELOG,Changelog,HISTORY}{,.ja}{,.txt,.rd,.rdoc,.md,.markdown}"] +
+       FileList["ext/**/{README,LICENSE,CHANGELOG,Changelog,HISTORY}{,.ja}{,.txt,.rd,.rdoc,.md,.markdown}"]
+ #EXT = FileList["ext/**/*.{h,hh,c,cc,cpp,cxx}"] +
+ #      FileList["ext/externals/**/*"]
+ EXT = FileList["ext/**/*"]
+ BIN = FileList["bin/*"]
+ LIB = FileList["lib/**/*.rb"]
+ SPEC = FileList["spec/**/*"]
+ EXAMPLE = FileList["examples/**/*"]
+ RAKEFILE = [File.basename(__FILE__), "gemstub.rb"]
+ EXTRA = []
+
+ load "gemstub.rb"
+
+ EXTCONF = FileList["ext/extconf.rb"]
+ EXTCONF.reject! { |n| !File.file?(n) }
+ GEMSTUB.extensions += EXTCONF
+ GEMSTUB.executables += FileList["bin/*"].map { |n| File.basename n }
+ GEMSTUB.executables.sort!
+
+ GEMFILE = "#{GEMSTUB.name}-#{GEMSTUB.version}.gem"
+ GEMSPEC = "#{GEMSTUB.name}.gemspec"
+
+ GEMSTUB.files += DOC + EXT + EXTCONF + BIN + LIB + SPEC + EXAMPLE + RAKEFILE + EXTRA
+ GEMSTUB.files.sort!
+ GEMSTUB.rdoc_options ||= %w(--charset UTF-8)
+ GEMSTUB.extra_rdoc_files += DOC + LIB + EXT.reject { |n| n.include?("/externals/") || !%w(.h .hh .c .cc .cpp .cxx).include?(File.extname(n)) }
+ GEMSTUB.extra_rdoc_files.sort!
+
+ CLEAN << GEMSPEC
+ CLOBBER << GEMFILE
+
+ task :default => :all
+
+
+ unless EXTCONF.empty?
+   RUBYSET ||= (ENV["RUBYSET"] || "").split(",")
+
+   if RUBYSET.nil? || RUBYSET.empty?
+     $stderr.puts <<-EOS
+ #{__FILE__}:
+ |
+ | If you want a binary gem package, launch rake with the ``RUBYSET`` environment
+ | variable set to a comma-separated list of ruby interpreters.
+ |
+ | e.g.) $ rake RUBYSET=ruby
+ | or) $ rake RUBYSET=ruby20,ruby21,ruby22
+ |
+     EOS
+   else
+     platforms = RUBYSET.map { |ruby| `#{ruby} --disable gems -rrbconfig -e "puts RbConfig::CONFIG['arch']"`.chomp }
+     platforms1 = platforms.uniq
+     unless platforms1.size == 1 && !platforms1[0].empty?
+       raise "different platforms (#{Hash[*RUBYSET.zip(platforms).flatten].inspect})"
+     end
+     PLATFORM = platforms1[0]
+
+     RUBY_VERSIONS = RUBYSET.map do |ruby|
+       ver = `#{ruby} --disable gem -rrbconfig -e "puts RbConfig::CONFIG['ruby_version']"`.chomp
+       raise "failed ruby checking - ``#{ruby}''" unless $?.success?
+       [ver, ruby]
+     end
+     SOFILES_SET = RUBY_VERSIONS.map { |(ver, ruby)| ["lib/#{ver}/#{GEMSTUB.name}.so", ruby] }
+     SOFILES = SOFILES_SET.map { |(lib, ruby)| lib }
+
+     GEMSTUB_NATIVE = GEMSTUB.dup
+     GEMSTUB_NATIVE.files += SOFILES
+     GEMSTUB_NATIVE.platform = Gem::Platform.new(PLATFORM).to_s
+     GEMSTUB_NATIVE.extensions.clear
+     GEMFILE_NATIVE = "#{GEMSTUB_NATIVE.name}-#{GEMSTUB_NATIVE.version}-#{GEMSTUB_NATIVE.platform}.gem"
+     GEMSPEC_NATIVE = "#{GEMSTUB_NATIVE.name}-#{GEMSTUB_NATIVE.platform}.gemspec"
+
+     task :all => ["native-gem", GEMFILE]
+
+     desc "build binary gem package"
+     task "native-gem" => GEMFILE_NATIVE
+
+     desc "generate binary gemspec"
+     task "native-gemspec" => GEMSPEC_NATIVE
+
+     file GEMFILE_NATIVE => DOC + EXT + EXTCONF + BIN + LIB + SPEC + EXAMPLE + SOFILES + RAKEFILE + [GEMSPEC_NATIVE] do
+       sh "gem build #{GEMSPEC_NATIVE}"
+     end
+
+     file GEMSPEC_NATIVE => __FILE__ do
+       File.write(GEMSPEC_NATIVE, GEMSTUB_NATIVE.to_ruby, mode: "wb")
+     end
+
+     SOFILES_SET.each do |(soname, ruby)|
+       sodir = File.dirname(soname)
+       makefile = File.join(sodir, "Makefile")
+
+       CLEAN << GEMSPEC_NATIVE << sodir
+       CLOBBER << GEMFILE_NATIVE
+
+       directory sodir
+
+       desc "generate Makefile for binary extension library"
+       file makefile => [sodir] + EXTCONF do
+         cd sodir do
+           sh "#{ruby} ../../#{EXTCONF[0]} \"--ruby=#{ruby}\""
+         end
+       end
+
+       desc "build binary extension library"
+       file soname => [makefile] + EXT do
+         cd sodir do
+           sh "make"
+         end
+       end
+     end
+   end
+ end
+
+
+ task :all => GEMFILE
+
+ desc "generate local rdoc"
+ task :rdoc => DOC + EXT + LIB do
+   sh *(%w(rdoc) + GEMSTUB.rdoc_options + DOC + EXT + LIB)
+ end
+
+ desc "launch rspec"
+ task rspec: :all do
+   sh "rspec"
+ end
+
+ desc "build gem package"
+ task gem: GEMFILE
+
+ desc "generate gemspec"
+ task gemspec: GEMSPEC
+
+ file GEMFILE => DOC + EXT + EXTCONF + BIN + LIB + SPEC + EXAMPLE + RAKEFILE + [GEMSPEC] do
+   sh "gem build #{GEMSPEC}"
+ end
+
+ file GEMSPEC => RAKEFILE do
+   File.write(GEMSPEC, GEMSTUB.to_ruby, mode: "wb")
+ end
data/contrib/lzham/LICENSE ADDED
@@ -0,0 +1,22 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2009-2015 Richard Geldreich, Jr. <richgel99@gmail.com>
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
data/contrib/lzham/README.md ADDED
@@ -0,0 +1,209 @@
+ LZHAM - Lossless Data Compression Codec
+ =============
+
+ <p>Copyright (c) 2009-2015 Richard Geldreich, Jr. - richgel99@gmail.com - MIT License</p>
+
+ <p>LZHAM is a lossless data compression codec written in C/C++ (specifically C++03), with a compression ratio similar to LZMA but with 1.5x-8x faster decompression speed. It officially supports Linux x86/x64, Windows x86/x64,
+ OSX, and iOS, with Android support on the way.</p>
+
+ <p>Some slightly out of date API documentation is here (I'll be migrating this to github): https://code.google.com/p/lzham/wiki/API_Docs</p>
+
+ <h3>Introduction</h3>
+
+ <p>LZHAM is a lossless (LZ based) data compression codec optimized for particularly fast decompression at very high compression ratios with a zlib compatible API.
+ It's been developed over a period of 3 years and alpha versions have already shipped in many products. (The alpha is here: https://code.google.com/p/lzham/)
+ LZHAM's decompressor is slower than zlib's, but generally much faster than LZMA's, with a compression ratio that is typically within a few percent of LZMA's and sometimes better.</p>
+
+ <p>LZHAM's compressor is intended for offline use, but it is tested alongside the decompressor on mobile devices and is usable on the faster settings.</p>
+
+ <p>LZHAM's decompressor currently has a higher cost to initialize than LZMA, so the threshold where LZHAM decompression is typically faster than LZMA's is between 1,000 and 13,000
+ *compressed* output bytes, depending on the platform. It is not a good small block compressor: it likes large (10KB-15KB minimum) blocks.</p>
+
+ <p>LZHAM has simple support for patch files (delta compression), but this is a side benefit of its design, not its primary use case. Internally it supports LZ matches up
+ to ~64KB and very large dictionaries (up to .5 GB).</p>
+
+ <p>LZHAM may be valuable to you if you compress data offline and distribute it to many customers, care about read/download times, and decompression speed/low CPU+power use
+ are important to you.</p>
+
+ <p>I've been profiling LZHAM vs. LZMA and publishing the results on my blog: http://richg42.blogspot.com</p>
+
+ <p>Some independent benchmarks of the previous alpha versions: http://heartofcomp.altervista.org/MOC/MOCADE.htm, http://mattmahoney.net/dc/text.html</p>
+
+ <p>LZHAM has been integrated into the 7zip archiver (command line and GUI) as a custom codec plugin: http://richg42.blogspot.com/2015/02/lzham-10-integrated-into-7zip-command.html</p>
+
+ <h3>10GB Benchmark Results</h3>
+
+ Results with [7zip-LZHAM 9.38 32-bit](http://richg42.blogspot.com/2015/02/7zip-938-custom-codec-plugin-for-lzham.html) (64MB dictionary) on [Matt Mahoney's 10GB benchmark](http://mattmahoney.net/dc/10gb.html):
+
+ ```
+ LZHAM (-mx=8): 3,577,047,629 Archive Test Time: 70.652 secs
+ LZHAM (-mx=9): 3,573,782,721 Archive Test Time: 71.292 secs
+ LZMA (-mx=9): 3,560,052,414 Archive Test Time: 223.050 secs
+ 7z .ZIP : 4,681,291,655 Archive Test Time: 73.304 secs (unzip v6 x64 test time: 61.074 secs)
+ ```
+
+ <h3>Most Common Question: So how does it compare to other libs like LZ4?</h3>
+
+ There is no single compression algorithm that perfectly suits all use cases and practical constraints. LZ4 and LZHAM are tools which lie at completely opposite ends of the spectrum:
+
+ * LZ4: A symmetrical codec with very fast compression and decompression but very low ratios. Its compression ratio is typically less than even zlib's (which uses a 21+ year old algorithm).
+ LZ4 does a good job of trading off a large amount of compression ratio for very fast overall throughput.
+ Usage example: Reading LZMA/LZHAM/etc. compressed data from the network and decompressing it, then caching this data locally on disk using LZ4 to reduce disk usage and decrease future loading times.
+
+ * LZHAM: A very asymmetrical codec with slow compression speed, but with a very competitive (LZMA-like) compression ratio and reasonably fast decompression speeds (slower than zlib, but faster than LZMA).
+ LZHAM trades off a lot of compression throughput for very high ratios and higher decompression throughput relative to other codecs in its ratio class (which is LZMA, which runs circles around LZ4's ratio).
+ Usage example: Compress your product's data once on a build server, distribute it to end users over a slow medium like the internet, then decompress it on the end user's device.
+
+ <h3>How Much Memory Does It Need?</h3>
+
+ For decompression it's easy to compute:
+ * Buffered mode: decomp_mem = dict_size + ~34KB for work tables
+ * Unbuffered mode: decomp_mem = ~34KB
+
+ I'll be honest here, the compressor is currently an angry beast when it comes to memory. The amount needed depends mostly on the compression level and dict. size. It's *approximately* (max_probes=128 at level -m4):
+ comp_mem = min(512 * 1024, dict_size / 8) * max_probes * 6 + dict_size * 9 + 22020096
+
+ Compression mem usage examples from Windows lzhamtest_x64 (note the equation is pretty off for small dictionary sizes):
+ * 32KB: 11MB
+ * 128KB: 21MB
+ * 512KB: 63MB
+ * 1MB: 118MB
+ * 8MB: 478MB
+ * 64MB: 982MB
+ * 128MB: 1558MB
+ * 256MB: 2710MB
+ * 512MB: 5014MB
+
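To make the figures above easier to reproduce, here is a small editorial sketch (not part of the LZHAM sources) that evaluates the approximation quoted above for level -m4 (max_probes = 128), plus the simpler decompressor estimates:

```ruby
# comp_mem = min(512 * 1024, dict_size / 8) * max_probes * 6 + dict_size * 9 + 22020096
# (the README notes this is rough for small dictionaries)
def approx_comp_mem(dict_size, max_probes = 128)
  [512 * 1024, dict_size / 8].min * max_probes * 6 + dict_size * 9 + 22_020_096
end

# Decompressor: dict_size + ~34KB of work tables (buffered), or just ~34KB (unbuffered).
def approx_decomp_mem(dict_size, unbuffered: false)
  work_tables = 34 * 1024
  unbuffered ? work_tables : dict_size + work_tables
end

[26, 29].each do |log2|                       # 64MB and 512MB dictionaries
  dict = 1 << log2
  printf("2^%d: comp ~%d MB, buffered decomp ~%d MB\n",
         log2, approx_comp_mem(dict) >> 20, approx_decomp_mem(dict) >> 20)
end
# => 2^26: comp ~981 MB, buffered decomp ~64 MB
# => 2^29: comp ~5013 MB, buffered decomp ~512 MB
```

The results land within a megabyte or two of the 64MB and 512MB rows in the table above.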
+ <h3>Compressed Bitstream Compatibility</h3>
+
+ <p>v1.0's bitstream format is now locked in place, so any future v1.x releases will be backwards/forward compatible with compressed files
+ written with v1.0. The only things that could change this are critical bugfixes.</p>
+
+ <p>Note LZHAM v1.x bitstreams are NOT backwards compatible with any of the previous alpha versions on Google Code.</p>
+
+ <h3>Platforms/Compiler Support</h3>
+
+ <p>LZHAM currently officially supports x86/x64 Linux, iOS, OSX, and Windows x86/x64. At one time the codec compiled and ran fine on Xbox 360 (PPC, big endian). Android support is coming next.
+ It should be easy to retarget by modifying the macros in lzham_core.h.</p>
+
+ <p>LZHAM has optional support for multithreaded compression. It supports gcc built-ins or MSVC intrinsics for atomic ops. For threading, it supports OSX
+ specific Pthreads, generic Pthreads, or Windows APIs.</p>
+
+ <p>For compilers, I've tested with gcc, clang, and MSVC 2008, 2010, and 2013. In previous alphas I also compiled with TDM-GCC x64.</p>
+
+ <h3>API</h3>
+
+ LZHAM supports streaming or memory to memory compression/decompression. See include/lzham.h. LZHAM can be linked statically or dynamically; just study the
+ headers and the lzhamtest project.
+ On Linux/OSX, it's only been tested with static linking so far.
+
+ LZHAM also supports a usable subset of the zlib API with extensions: either include include/zlib.h, or #define LZHAM_DEFINE_ZLIB_API and use include/lzham.h.
+
+ <h3>Usage Tips</h3>
+
+ * Always try to use the smallest dictionary size that makes sense for the file or block you are compressing, i.e. don't use a 128MB dictionary for a 15KB file. The codec
+ doesn't automatically choose for you because in streaming scenarios it has no idea how large the file or block will be.
+ * The larger the dictionary, the more RAM is required during compression and decompression. I would avoid using more than 8-16MB dictionaries on iOS.
+ * For faster decompression, prefer "unbuffered" decompression mode vs. buffered decompression (avoids a dictionary alloc and extra memcpy()'s), and disable adler-32 checking. Also, use the built-in LZHAM APIs, not the
+ zlib-style APIs, for fastest decompression.
+ * Experiment with the "m_table_update_rate" compression/decompression parameter. This setting trades off a small amount of ratio for faster decompression.
+ Note the m_table_update_rate decompression parameter MUST match the setting used during compression (same for the dictionary size). It's up to you to store this info somehow (one way to do so is sketched after this list).
+ * Avoid using LZHAM on small *compressed* blocks, where small is 1KB-10KB compressed bytes depending on the platform. LZHAM's decompressor is only faster than LZMA's beyond the small block threshold.
+ Optimizing LZHAM's decompressor to reduce its startup time relative to LZMA is a high priority.
+ * For best compression (I've seen up to ~4% better), enable the compressor's "extreme" parser, which is much slower but finds cheaper paths through a much denser parse graph.
+ Note the extreme parser can greatly slow down on files containing large amounts of repeated data/strings, but it is guaranteed to finish.
+ * The compressor's m_level parameter can make a big impact on compression speed. Level 0 (LZHAM_COMP_LEVEL_FASTEST) uses a much simpler greedy parser, and the other levels use
+ near-optimal parsing with different heuristic settings.
+ * Check out the compressor/decompressor reinit() APIs, which are useful if you'll be compressing or decompressing many times. Using the reinit() APIs is a lot cheaper than fully
+ initializing/deinitializing the entire codec every time.
+ * LZHAM's compressor is no speed demon. It's usually slower than LZMA's, sometimes by a wide (~2x slower or so) margin. In "extreme" parsing mode, it can be many times slower.
+ This codec was designed with offline compression in mind.
+ * One significant difference between LZMA and LZHAM is how uncompressible files are handled. LZMA usually expands uncompressible files, and its decompressor can bog down and run extremely
+ slowly on uncompressible data. LZHAM internally detects when each 512KB block is uncompressible and stores these blocks as uncompressed bytes instead.
+ LZHAM's literal decoding is significantly faster than LZMA's, so the more plain literals in the output stream, the faster LZHAM's decompressor runs vs. LZMA's.
+ * General advice (applies to LZMA and other codecs too): If you are compressing large amounts of serialized game assets, sort the serialized data by asset type and compress the whole thing as a single large "solid" block of data.
+ Don't compress each individual asset; this will kill your ratio and incur a higher decompression startup cost per asset. If you need random access, consider compressing the assets lumped
+ together into groups of a few hundred kilobytes (or whatever) each.
+ * LZHAM is a raw codec. It doesn't include any sort of preprocessing: EXE rel-to-abs jump transformation, audio predictors, etc. That's up to you
+ to do, before compression.
+
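On the tip above about m_table_update_rate and the dictionary size: since the decompressor must be handed the exact values used at compression time and the README says it is up to the caller to store them, one simple convention (an editorial illustration, not something LZHAM or this gem prescribes) is to prepend them as a tiny header:

```ruby
# Prepend the settings the decompressor must be given (dict_size_log2 and
# m_table_update_rate) to the compressed payload, and peel them off again
# before decompressing.
HEADER_FORMAT = "CC"   # two unsigned 8-bit values

def wrap(compressed, dict_size_log2, table_update_rate)
  [dict_size_log2, table_update_rate].pack(HEADER_FORMAT) + compressed
end

def unwrap(blob)
  dict_size_log2, table_update_rate = blob.unpack(HEADER_FORMAT)
  [blob.byteslice(2, blob.bytesize - 2), dict_size_log2, table_update_rate]
end
```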
+ <h3>Codec Test App</h3>
+
+ lzhamtest_x86/x64 is a simple command line test program that uses the LZHAM codec to compress/decompress single files.
+ lzhamtest is not intended as a file archiver or end user tool; it's just a simple testbed.
+
+ -- Usage examples:
+
+ - Compress single file "source_filename" to "compressed_filename":
+     lzhamtest_x64 c source_filename compressed_filename
+
+ - Decompress single file "compressed_filename" to "decompressed_filename":
+     lzhamtest_x64 d compressed_filename decompressed_filename
+
+ - Compress single file "source_filename" to "compressed_filename", then verify the compressed file decompresses properly to the source file:
+     lzhamtest_x64 -v c source_filename compressed_filename
+
+ - Recursively compress all files under the specified directory and verify that each file decompresses properly:
+     lzhamtest_x64 -v a c:\source_path
+
+ -- Options
+
+ - Set dictionary size used during compression to 1MB (2^20):
+     lzhamtest_x64 -d20 c source_filename compressed_filename
+
+ Valid dictionary sizes are [15,26] for x86, and [15,29] for x64. (See LZHAM_MIN_DICT_SIZE_LOG2, etc. defines in include/lzham.h.)
+ The x86 version defaults to 64MB (26), and the x64 version defaults to 256MB (28). I wouldn't recommend setting the dictionary size to
+ 512MB unless your machine has more than 4GB of physical memory. (A helper for picking -d from the input size is sketched after this section.)
+
+ - Set compression level to fastest:
+     lzhamtest_x64 -m0 c source_filename compressed_filename
+
+ - Set compression level to uber (the default):
+     lzhamtest_x64 -m4 c source_filename compressed_filename
+
+ - For best possible compression, use -d29 to enable the largest dictionary size (512MB) and the -x option, which enables more rigorous (but ~4X slower!) parsing:
+     lzhamtest_x64 -d29 -x -m4 c source_filename compressed_filename
+
+ See lzhamtest_x86/x64.exe's help text for more command line parameters.
+
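Given the valid range quoted above ([15,26] on x86, [15,29] on x64) and the earlier tip to use the smallest dictionary that covers the input, choosing -d can be automated. An editorial sketch (the bounds mirror the LZHAM_MIN/MAX_DICT_SIZE_LOG2 limits mentioned above):

```ruby
# Smallest power-of-two dictionary that covers the input, clamped to the
# valid range: [15, 26] on x86 builds, [15, 29] on x64 builds.
def dict_size_log2_for(input_bytes, x64 = true)
  min_log2 = 15
  max_log2 = x64 ? 29 : 26
  needed = Math.log2([input_bytes, 1].max).ceil
  [[needed, min_log2].max, max_log2].min
end

dict_size_log2_for(15 * 1024)           # => 15 (a 15KB file needs only the 32KB minimum)
dict_size_log2_for(700 * 1024 * 1024)   # => 29 (clamped to the 512MB maximum on x64)
```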
+ <h3>Compiling LZHAM</h3>
+
+ - Linux: Use "cmake ." then "make". The cmake script only supports Linux at the moment. (Sorry, working on build systems is a drag.)
+ - OSX/iOS: Use the included XCode project. (NOTE: I haven't merged this over yet. It's coming!)
+ - Windows: Use the included VS 2010 project.
+
+ IMPORTANT: With clang or gcc, compile LZHAM with "No strict aliasing" ENABLED: -fno-strict-aliasing
+
+ I DO NOT test or develop the codec with strict aliasing:
+ * https://lkml.org/lkml/2003/2/26/158
+ * http://stackoverflow.com/questions/2958633/gcc-strict-aliasing-and-horror-stories
+
+ It might work fine; I don't know yet. This is usually not a problem with MSVC, which defaults to strict aliasing being off.
+
+ <h3>ANSI C/C++</h3>
+
+ LZHAM supports compiling as plain vanilla ANSI C/C++. To see how the codec configures itself, check out lzham_core.h and search for "LZHAM_ANSI_CPLUSPLUS".
+ All platform specific stuff (unaligned loads, threading, atomic ops, etc.) should be disabled when this macro is defined. Note, the compressor doesn't use threads
+ or atomic operations when built this way, so it's going to be pretty slow. (The compressor was built from the ground up to be threaded.)
+
+ <h3>Known Problems</h3>
+
+ <p>LZHAM's decompressor is like a drag racer that needs time to get up to speed. LZHAM is not intended or optimized to be used on "small" blocks of data (less
+ than ~10,000 bytes of *compressed* data on desktops, or around 1,000-5,000 on iOS). If your use case involves calling the codec over and over with tiny blocks,
+ then LZMA, LZ4, Deflate, etc. are probably better choices.</p>
+
+ <p>The decompressor still takes too long to init vs. LZMA. On iOS the cost is not that bad, but on desktop the cost is high. I have reduced the startup cost vs. the
+ alpha but there's still work to do.</p>
+
+ <p>The compressor is slower than I would like, and doesn't scale as well as it could. I added a reinit() method to make it initialize faster, but it's not a speed demon.
+ My focus has been on ratio and decompression speed.</p>
+
+ <p>I use tabs=3 spaces, but I think some actual tabs got in the code. I need to run the sources through ClangFormat or whatever.</p>
+
+ <h3>Special Thanks</h3>
+
+ <p>Thanks to everyone at the http://encode.ru forums. I read these forums as a lurker before working on LZHAM, and I studied every LZ related
+ post I could get my hands on. Especially anything related to LZ optimal parsing, which still seems like a black art. LZHAM was my way of
+ learning how to implement optimal parsing (and you can see this if you study the progress I made in the early alphas on Google Code).</p>
+
+ <p>Also, thanks to Igor Pavlov, the original creator of LZMA and 7zip, for advancing the state of the art in LZ compression.</p>