extlzham 0.0.1.PROTOTYPE
- checksums.yaml +7 -0
- data/LICENSE.md +27 -0
- data/README.md +21 -0
- data/Rakefile +143 -0
- data/contrib/lzham/LICENSE +22 -0
- data/contrib/lzham/README.md +209 -0
- data/contrib/lzham/include/lzham.h +781 -0
- data/contrib/lzham/lzhamcomp/lzham_comp.h +38 -0
- data/contrib/lzham/lzhamcomp/lzham_lzbase.cpp +244 -0
- data/contrib/lzham/lzhamcomp/lzham_lzbase.h +45 -0
- data/contrib/lzham/lzhamcomp/lzham_lzcomp.cpp +608 -0
- data/contrib/lzham/lzhamcomp/lzham_lzcomp_internal.cpp +1966 -0
- data/contrib/lzham/lzhamcomp/lzham_lzcomp_internal.h +472 -0
- data/contrib/lzham/lzhamcomp/lzham_lzcomp_state.cpp +1413 -0
- data/contrib/lzham/lzhamcomp/lzham_match_accel.cpp +562 -0
- data/contrib/lzham/lzhamcomp/lzham_match_accel.h +146 -0
- data/contrib/lzham/lzhamcomp/lzham_null_threading.h +97 -0
- data/contrib/lzham/lzhamcomp/lzham_pthreads_threading.cpp +229 -0
- data/contrib/lzham/lzhamcomp/lzham_pthreads_threading.h +520 -0
- data/contrib/lzham/lzhamcomp/lzham_threading.h +12 -0
- data/contrib/lzham/lzhamcomp/lzham_win32_threading.cpp +220 -0
- data/contrib/lzham/lzhamcomp/lzham_win32_threading.h +368 -0
- data/contrib/lzham/lzhamdecomp/lzham_assert.cpp +66 -0
- data/contrib/lzham/lzhamdecomp/lzham_assert.h +40 -0
- data/contrib/lzham/lzhamdecomp/lzham_checksum.cpp +73 -0
- data/contrib/lzham/lzhamdecomp/lzham_checksum.h +13 -0
- data/contrib/lzham/lzhamdecomp/lzham_config.h +23 -0
- data/contrib/lzham/lzhamdecomp/lzham_core.h +264 -0
- data/contrib/lzham/lzhamdecomp/lzham_decomp.h +37 -0
- data/contrib/lzham/lzhamdecomp/lzham_helpers.h +54 -0
- data/contrib/lzham/lzhamdecomp/lzham_huffman_codes.cpp +262 -0
- data/contrib/lzham/lzhamdecomp/lzham_huffman_codes.h +14 -0
- data/contrib/lzham/lzhamdecomp/lzham_lzdecomp.cpp +1527 -0
- data/contrib/lzham/lzhamdecomp/lzham_lzdecompbase.cpp +131 -0
- data/contrib/lzham/lzhamdecomp/lzham_lzdecompbase.h +89 -0
- data/contrib/lzham/lzhamdecomp/lzham_math.h +142 -0
- data/contrib/lzham/lzhamdecomp/lzham_mem.cpp +284 -0
- data/contrib/lzham/lzhamdecomp/lzham_mem.h +112 -0
- data/contrib/lzham/lzhamdecomp/lzham_platform.cpp +157 -0
- data/contrib/lzham/lzhamdecomp/lzham_platform.h +284 -0
- data/contrib/lzham/lzhamdecomp/lzham_prefix_coding.cpp +351 -0
- data/contrib/lzham/lzhamdecomp/lzham_prefix_coding.h +146 -0
- data/contrib/lzham/lzhamdecomp/lzham_symbol_codec.cpp +1484 -0
- data/contrib/lzham/lzhamdecomp/lzham_symbol_codec.h +556 -0
- data/contrib/lzham/lzhamdecomp/lzham_timer.cpp +147 -0
- data/contrib/lzham/lzhamdecomp/lzham_timer.h +99 -0
- data/contrib/lzham/lzhamdecomp/lzham_traits.h +141 -0
- data/contrib/lzham/lzhamdecomp/lzham_types.h +97 -0
- data/contrib/lzham/lzhamdecomp/lzham_utils.h +58 -0
- data/contrib/lzham/lzhamdecomp/lzham_vector.cpp +75 -0
- data/contrib/lzham/lzhamdecomp/lzham_vector.h +588 -0
- data/contrib/lzham/lzhamlib/lzham_lib.cpp +179 -0
- data/examples/basic.rb +48 -0
- data/ext/extconf.rb +26 -0
- data/ext/extlzham.c +741 -0
- data/gemstub.rb +22 -0
- data/lib/extlzham/version.rb +5 -0
- data/lib/extlzham.rb +153 -0
- metadata +135 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 252b541fe39b87a810a1c32faf9f7a2f2ad57b11
+  data.tar.gz: f27f740cb6f08a39837c8fea305da68c63f784d7
+SHA512:
+  metadata.gz: 780c3b5c764257695c861a72601c1d74ea92aa68e79059f4bd0d4a8c51dbe68cd978fe9a8d74e4fed9b28e1132d34a7c8457e553b5dc41aa124ff87050da084f
+  data.tar.gz: 82704ccee75a3a81480e328c29348fa71d27a9a3bc38c38d9c3ebc365064cda6f212dfe258b4731faf422e349324eb8e6c634e4247bb3378e3ea3b330f3e9533
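The digests in checksums.yaml can be checked against the package members they name. The following is a minimal sketch, assuming the `.gem` archive has already been unpacked so that `metadata.gz` and `data.tar.gz` are available as plain files; the helper name `checksum_matches?` is hypothetical and is not part of RubyGems or extlzham.

```ruby
require "digest/sha1"
require "digest/sha2"

# Hypothetical helper: compare raw bytes against a hex digest as listed
# in checksums.yaml. Only the two algorithms that appear there are handled.
def checksum_matches?(bytes, expected_hex, algorithm)
  digest_class =
    case algorithm
    when "SHA1"   then Digest::SHA1
    when "SHA512" then Digest::SHA512
    else raise ArgumentError, "unsupported algorithm: #{algorithm}"
    end
  digest_class.hexdigest(bytes) == expected_hex.downcase
end
```

A call such as `checksum_matches?(File.binread("data.tar.gz"), "f27f740cb6f08a39837c8fea305da68c63f784d7", "SHA1")` would then report whether the unpacked file matches the SHA1 entry above.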
data/LICENSE.md
ADDED
@@ -0,0 +1,27 @@
+extlzham is under The BSD 2-Clause License.
+
+
+Copyright (c) 2015, dearblue. All rights reserved.
+
+Redistribution and use in source and binary forms, with or
+without modification, are permitted provided that the following
+conditions are met:
+
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in
+   the documentation and/or other materials provided with the
+   distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md
ADDED
@@ -0,0 +1,21 @@
+# encoding:utf-8 ;
+
+# extlzham - ruby binding for LZHAM
+
+This is a Ruby binding for the compression/decompression library
+[LZHAM (https://github.com/richgel999/lzham\_codec)](https://github.com/richgel999/lzham_codec).
+
+* PACKAGE NAME: extlzham
+* AUTHOR: dearblue <dearblue@users.sourceforge.jp>
+* VERSION: 0.0.1.PROTOTYPE
+* LICENSING: 2-clause BSD License
+* REPORT ISSUE TO: <http://sourceforge.jp/projects/rutsubo/ticket/>
+* DEPENDENCY RUBY: ruby-2.0+
+* DEPENDENCY RUBY GEMS: (none)
+* DEPENDENCY LIBRARY: (none)
+* BUNDLED EXTERNAL LIBRARIES:
+    * LZHAM: https://github.com/richgel999/lzham\_codec
+
+----
+
+[a stub]
data/Rakefile
ADDED
@@ -0,0 +1,143 @@
+
+require "rake/clean"
+
+DOC = FileList["{README,LICENSE,CHANGELOG,Changelog,HISTORY}{,.ja}{,.txt,.rd,.rdoc,.md,.markdown}"] +
+      FileList["ext/**/{README,LICENSE,CHANGELOG,Changelog,HISTORY}{,.ja}{,.txt,.rd,.rdoc,.md,.markdown}"]
+#EXT = FileList["ext/**/*.{h,hh,c,cc,cpp,cxx}"] +
+#      FileList["ext/externals/**/*"]
+EXT = FileList["ext/**/*"]
+BIN = FileList["bin/*"]
+LIB = FileList["lib/**/*.rb"]
+SPEC = FileList["spec/**/*"]
+EXAMPLE = FileList["examples/**/*"]
+RAKEFILE = [File.basename(__FILE__), "gemstub.rb"]
+EXTRA = []
+
+load "gemstub.rb"
+
+EXTCONF = FileList["ext/extconf.rb"]
+EXTCONF.reject! { |n| !File.file?(n) }
+GEMSTUB.extensions += EXTCONF
+GEMSTUB.executables += FileList["bin/*"].map { |n| File.basename n }
+GEMSTUB.executables.sort!
+
+GEMFILE = "#{GEMSTUB.name}-#{GEMSTUB.version}.gem"
+GEMSPEC = "#{GEMSTUB.name}.gemspec"
+
+GEMSTUB.files += DOC + EXT + EXTCONF + BIN + LIB + SPEC + EXAMPLE + RAKEFILE + EXTRA
+GEMSTUB.files.sort!
+GEMSTUB.rdoc_options ||= %w(--charset UTF-8)
+GEMSTUB.extra_rdoc_files += DOC + LIB + EXT.reject { |n| n.include?("/externals/") || !%w(.h .hh .c .cc .cpp .cxx).include?(File.extname(n)) }
+GEMSTUB.extra_rdoc_files.sort!
+
+CLEAN << GEMSPEC
+CLOBBER << GEMFILE
+
+task :default => :all
+
+
+unless EXTCONF.empty?
+  RUBYSET ||= (ENV["RUBYSET"] || "").split(",")
+
+  if RUBYSET.nil? || RUBYSET.empty?
+    $stderr.puts <<-EOS
+#{__FILE__}:
+|
+| If you want a binary gem package, launch rake with the ``RUBYSET`` environment
+| variable set to a comma-separated list of ruby interpreters.
+|
+| e.g.) $ rake RUBYSET=ruby
+| or)   $ rake RUBYSET=ruby20,ruby21,ruby22
+|
+    EOS
+  else
+    platforms = RUBYSET.map { |ruby| `#{ruby} --disable gems -rrbconfig -e "puts RbConfig::CONFIG['arch']"`.chomp }
+    platforms1 = platforms.uniq
+    unless platforms1.size == 1 && !platforms1[0].empty?
+      raise "different platforms (#{Hash[*RUBYSET.zip(platforms).flatten].inspect})"
+    end
+    PLATFORM = platforms1[0]
+
+    RUBY_VERSIONS = RUBYSET.map do |ruby|
+      ver = `#{ruby} --disable gem -rrbconfig -e "puts RbConfig::CONFIG['ruby_version']"`.chomp
+      raise "failed ruby checking - ``#{ruby}''" unless $?.success?
+      [ver, ruby]
+    end
+    SOFILES_SET = RUBY_VERSIONS.map { |(ver, ruby)| ["lib/#{ver}/#{GEMSTUB.name}.so", ruby] }
+    SOFILES = SOFILES_SET.map { |(lib, ruby)| lib }
+
+    GEMSTUB_NATIVE = GEMSTUB.dup
+    GEMSTUB_NATIVE.files += SOFILES
+    GEMSTUB_NATIVE.platform = Gem::Platform.new(PLATFORM).to_s
+    GEMSTUB_NATIVE.extensions.clear
+    GEMFILE_NATIVE = "#{GEMSTUB_NATIVE.name}-#{GEMSTUB_NATIVE.version}-#{GEMSTUB_NATIVE.platform}.gem"
+    GEMSPEC_NATIVE = "#{GEMSTUB_NATIVE.name}-#{GEMSTUB_NATIVE.platform}.gemspec"
+
+    task :all => ["native-gem", GEMFILE]
+
+    desc "build binary gem package"
+    task "native-gem" => GEMFILE_NATIVE
+
+    desc "generate binary gemspec"
+    task "native-gemspec" => GEMSPEC_NATIVE
+
+    file GEMFILE_NATIVE => DOC + EXT + EXTCONF + BIN + LIB + SPEC + EXAMPLE + SOFILES + RAKEFILE + [GEMSPEC_NATIVE] do
+      sh "gem build #{GEMSPEC_NATIVE}"
+    end
+
+    file GEMSPEC_NATIVE => __FILE__ do
+      File.write(GEMSPEC_NATIVE, GEMSTUB_NATIVE.to_ruby, mode: "wb")
+    end
+
+    SOFILES_SET.each do |(soname, ruby)|
+      sodir = File.dirname(soname)
+      makefile = File.join(sodir, "Makefile")
+
+      CLEAN << GEMSPEC_NATIVE << sodir
+      CLOBBER << GEMFILE_NATIVE
+
+      directory sodir
+
+      desc "generate Makefile for binary extension library"
+      file makefile => [sodir] + EXTCONF do
+        cd sodir do
+          sh "#{ruby} ../../#{EXTCONF[0]} \"--ruby=#{ruby}\""
+        end
+      end
+
+      desc "build binary extension library"
+      file soname => [makefile] + EXT do
+        cd sodir do
+          sh "make"
+        end
+      end
+    end
+  end
+end
+
+
+task :all => GEMFILE
+
+desc "generate local rdoc"
+task :rdoc => DOC + EXT + LIB do
+  sh *(%w(rdoc) + GEMSTUB.rdoc_options + DOC + EXT + LIB)
+end
+
+desc "launch rspec"
+task rspec: :all do
+  sh "rspec"
+end
+
+desc "build gem package"
+task gem: GEMFILE
+
+desc "generate gemspec"
+task gemspec: GEMSPEC
+
+file GEMFILE => DOC + EXT + EXTCONF + BIN + LIB + SPEC + EXAMPLE + RAKEFILE + [GEMSPEC] do
+  sh "gem build #{GEMSPEC}"
+end
+
+file GEMSPEC => RAKEFILE do
+  File.write(GEMSPEC, GEMSTUB.to_ruby, mode: "wb")
+end
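The Rakefile above turns the `RUBYSET` environment variable into one extension library path per interpreter. As a rough sketch of just that string handling, with no interpreters invoked: the helper names `parse_rubyset` and `sofile_for` are hypothetical and do not appear in the gem.

```ruby
# Hypothetical helper mirroring `(ENV["RUBYSET"] || "").split(",")`:
# an unset or empty variable yields an empty list, which the Rakefile
# treats as "source gem only".
def parse_rubyset(value)
  (value || "").split(",")
end

# Hypothetical helper mirroring the "lib/#{ver}/#{GEMSTUB.name}.so" pattern,
# where `ruby_version` stands in for RbConfig::CONFIG['ruby_version'].
def sofile_for(ruby_version, gem_name)
  "lib/#{ruby_version}/#{gem_name}.so"
end
```

For example, `parse_rubyset("ruby20,ruby21,ruby22")` gives three interpreter names, and each would map to its own `lib/<ruby_version>/extlzham.so` target in the native gem.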
data/contrib/lzham/LICENSE
ADDED
@@ -0,0 +1,22 @@
+The MIT License (MIT)
+
+Copyright (c) 2009-2015 Richard Geldreich, Jr. <richgel99@gmail.com>
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
data/contrib/lzham/README.md
ADDED
@@ -0,0 +1,209 @@
+LZHAM - Lossless Data Compression Codec
+=============
+
+<p>Copyright (c) 2009-2015 Richard Geldreich, Jr. - richgel99@gmail.com - MIT License</p>
+
+<p>LZHAM is a lossless data compression codec written in C/C++ (specifically C++03), with a compression ratio similar to LZMA but with 1.5x-8x faster decompression speed. It officially supports Linux x86/x64, Windows x86/x64,
+OSX, and iOS, with Android support on the way.</p>
+
+<p>Some slightly out of date API documentation is here (I'll be migrating this to github): https://code.google.com/p/lzham/wiki/API_Docs</p>
+
+<h3>Introduction</h3>
+
+<p>LZHAM is a lossless (LZ based) data compression codec optimized for particularly fast decompression at very high compression ratios with a zlib compatible API.
+It's been developed over a period of 3 years and alpha versions have already shipped in many products. (The alpha is here: https://code.google.com/p/lzham/)
+LZHAM's decompressor is slower than zlib's, but generally much faster than LZMA's, with a compression ratio that is typically within a few percent of LZMA's and sometimes better.</p>
+
+<p>LZHAM's compressor is intended for offline use, but it is tested alongside the decompressor on mobile devices and is usable on the faster settings.</p>
+
+<p>LZHAM's decompressor currently has a higher cost to initialize than LZMA, so the threshold where LZHAM is typically faster vs. LZMA decompression is between 1000-13,000 of
+*compressed* output bytes, depending on the platform. It is not a good small block compressor: it likes large (10KB-15KB minimum) blocks.</p>
+
+<p>LZHAM has simple support for patch files (delta compression), but this is a side benefit of its design, not its primary use case. Internally it supports LZ matches up
+to ~64KB and very large dictionaries (up to .5 GB).</p>
+
+<p>LZHAM may be valuable to you if you compress data offline and distribute it to many customers, care about read/download times, and decompression speed/low CPU+power use
+are important to you.</p>
+
+<p>I've been profiling LZHAM vs. LZMA and publishing the results on my blog: http://richg42.blogspot.com</p>
+
+<p>Some independent benchmarks of the previous alpha versions: http://heartofcomp.altervista.org/MOC/MOCADE.htm, http://mattmahoney.net/dc/text.html</p>
+
+<p>LZHAM has been integrated into the 7zip archiver (command line and GUI) as a custom codec plugin: http://richg42.blogspot.com/2015/02/lzham-10-integrated-into-7zip-command.html</p>
+
+<h3>10GB Benchmark Results</h3>
+
+Results with [7zip-LZHAM 9.38 32-bit](http://richg42.blogspot.com/2015/02/7zip-938-custom-codec-plugin-for-lzham.html) (64MB dictionary) on [Matt Mahoney's 10GB benchmark](http://mattmahoney.net/dc/10gb.html):
+
+```
+LZHAM (-mx=8): 3,577,047,629 Archive Test Time: 70.652 secs
+LZHAM (-mx=9): 3,573,782,721 Archive Test Time: 71.292 secs
+LZMA (-mx=9):  3,560,052,414 Archive Test Time: 223.050 secs
+7z .ZIP :      4,681,291,655 Archive Test Time: 73.304 secs (unzip v6 x64 test time: 61.074 secs)
+```
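To put the benchmark figures above in relative terms, the arithmetic can be sketched as follows. The numbers are copied from the table; the constant and method names (`RESULTS`, `speedup`, `size_overhead_percent`) are illustrative only and come from neither LZHAM nor extlzham.

```ruby
# Archive size in bytes and "Archive Test Time" in seconds, as listed above.
RESULTS = {
  "LZHAM (-mx=8)" => { bytes: 3_577_047_629, secs: 70.652 },
  "LZHAM (-mx=9)" => { bytes: 3_573_782_721, secs: 71.292 },
  "LZMA (-mx=9)"  => { bytes: 3_560_052_414, secs: 223.050 },
  "7z .ZIP"       => { bytes: 4_681_291_655, secs: 73.304 },
}

# How many times faster `candidate` tests (decompresses) than `baseline`.
def speedup(results, baseline, candidate)
  results[baseline][:secs] / results[candidate][:secs]
end

# How much larger (in percent) the `candidate` archive is than `baseline`.
def size_overhead_percent(results, baseline, candidate)
  (results[candidate][:bytes].to_f / results[baseline][:bytes] - 1.0) * 100.0
end
```

On these figures, `speedup(RESULTS, "LZMA (-mx=9)", "LZHAM (-mx=8)")` comes out a little above 3x, while `size_overhead_percent(RESULTS, "LZMA (-mx=9)", "LZHAM (-mx=9)")` is well under 1%.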
+
+<h3>Most Common Question: So how does it compare to other libs like LZ4?</h3>
+
+There is no single compression algorithm that perfectly suits all use cases and practical constraints. LZ4 and LZHAM are tools which lie at completely opposite ends of the spectrum:
+
+* LZ4: A symmetrical codec with very fast compression and decompression but very low ratios. Its compression ratio is typically less than even zlib's (which uses a 21+ year old algorithm).
+LZ4 does a good job of trading off a large amount of compression ratio for very fast overall throughput.
+Usage example: Reading LZMA/LZHAM/etc. compressed data from the network and decompressing it, then caching this data locally on disk using LZ4 to reduce disk usage and decrease future loading times.
+
+* LZHAM: A very asymmetrical codec with slow compression speed, but with a very competitive (LZMA-like) compression ratio and reasonably fast decompression speeds (slower than zlib, but faster than LZMA).
+LZHAM trades off a lot of compression throughput for very high ratios and higher decompression throughput relative to other codecs in its ratio class (which is LZMA, which runs circles around LZ4's ratio).
+Usage example: Compress your product's data once on a build server, distribute it to end users over a slow medium like the internet, then decompress it on the end user's device.
+
+<h3>How Much Memory Does It Need?</h3>
+
+For decompression it's easy to compute:
+* Buffered mode: decomp_mem = dict_size + ~34KB for work tables
+* Unbuffered mode: decomp_mem = ~34KB
+
+I'll be honest here, the compressor is currently an angry beast when it comes to memory. The amount needed depends mostly on the compression level and dict. size. It's *approximately* (max_probes=128 at level -m4):
+comp_mem = min(512 * 1024, dict_size / 8) * max_probes * 6 + dict_size * 9 + 22020096
+
+Compression mem usage examples from Windows lzhamtest_x64 (note the equation is pretty off for small dictionary sizes):
+* 32KB: 11MB
+* 128KB: 21MB
+* 512KB: 63MB
+* 1MB: 118MB
+* 8MB: 478MB
+* 64MB: 982MB
+* 128MB: 1558MB
+* 256MB: 2710MB
+* 512MB: 5014MB
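The memory formulas above translate directly into code. This is a sketch under the README's own approximations: `~34KB` is taken as exactly 34 * 1024 bytes, and the method and constant names are hypothetical rather than part of LZHAM or extlzham.

```ruby
# "~34KB for work tables" from the text, taken here as exactly 34 KiB.
LZHAM_WORK_TABLE_BYTES = 34 * 1024

# Approximate decompressor memory in bytes for a given dictionary size.
def lzham_decomp_mem(dict_size, buffered: true)
  buffered ? dict_size + LZHAM_WORK_TABLE_BYTES : LZHAM_WORK_TABLE_BYTES
end

# Approximate compressor memory in bytes, following the comp_mem formula
# (max_probes=128 corresponds to level -m4 according to the text).
def lzham_comp_mem(dict_size, max_probes: 128)
  [512 * 1024, dict_size / 8].min * max_probes * 6 + dict_size * 9 + 22_020_096
end
```

For a 64MB dictionary the formula gives 981 MiB, close to the measured 982MB in the list above; for small dictionaries it overestimates, as the README itself notes.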
+
+<h3>Compressed Bitstream Compatibility</h3>
+
+<p>v1.0's bitstream format is now locked in place, so any future v1.x releases will be backwards/forward compatible with compressed files
+written with v1.0. The only thing that could change this is a critical bugfix.</p>
+
+<p>Note LZHAM v1.x bitstreams are NOT backwards compatible with any of the previous alpha versions on Google Code.</p>
+
+<h3>Platforms/Compiler Support</h3>
+
+<p>LZHAM currently officially supports x86/x64 Linux, iOS, OSX, and Windows x86/x64. At one time the codec compiled and ran fine on Xbox 360 (PPC, big endian). Android support is coming next.
+It should be easy to retarget by modifying the macros in lzham_core.h.</p>
+
+<p>LZHAM has optional support for multithreaded compression. It supports gcc built-ins or MSVC intrinsics for atomic ops. For threading, it supports OSX
+specific Pthreads, generic Pthreads, or Windows API's.</p>
+
+<p>For compilers, I've tested with gcc, clang, and MSVC 2008, 2010, and 2013. In previous alphas I also compiled with TDM-GCC x64.</p>
+
+<h3>API</h3>
+
+LZHAM supports streaming or memory to memory compression/decompression. See include/lzham.h. LZHAM can be linked statically or dynamically, just study the
+headers and the lzhamtest project.
+On Linux/OSX, it's only been tested with static linking so far.
+
+LZHAM also supports a usable subset of the zlib API with extensions, either include/zlib.h or #define LZHAM_DEFINE_ZLIB_API and use include/lzham.h.
+
+<h3>Usage Tips</h3>
+
+* Always try to use the smallest dictionary size that makes sense for the file or block you are compressing, i.e. don't use a 128MB dictionary for a 15KB file. The codec
+doesn't automatically choose for you because in streaming scenarios it has no idea how large the file or block will be.
+* The larger the dictionary, the more RAM is required during compression and decompression. I would avoid using more than 8-16MB dictionaries on iOS.
+* For faster decompression, prefer "unbuffered" decompression mode vs. buffered decompression (avoids a dictionary alloc and extra memcpy()'s), and disable adler-32 checking. Also, use the built-in LZHAM API's, not the
+zlib-style API's for fastest decompression.
+* Experiment with the "m_table_update_rate" compression/decompression parameter. This setting trades off a small amount of ratio for faster decompression.
+Note the m_table_update_rate decompression parameter MUST match the setting used during compression (same for the dictionary size). It's up to you to store this info somehow.
+* Avoid using LZHAM on small *compressed* blocks, where small is 1KB-10KB compressed bytes depending on the platform. LZHAM's decompressor is only faster than LZMA's beyond the small block threshold.
+Optimizing LZHAM's decompressor to reduce its startup time relative to LZMA is a high priority.
+* For best compression (I've seen up to ~4% better), enable the compressor's "extreme" parser, which is much slower but finds cheaper paths through a much denser parse graph.
+Note the extreme parser can greatly slow down on files containing large amounts of repeated data/strings, but it is guaranteed to finish.
+* The compressor's m_level parameter can make a big impact on compression speed. Level 0 (LZHAM_COMP_LEVEL_FASTEST) uses a much simpler greedy parser, and the other levels use
+near-optimal parsing with different heuristic settings.
+* Check out the compressor/decompressor reinit() API's, which are useful if you'll be compressing or decompressing many times. Using the reinit() API's is a lot cheaper than fully
+initializing/deinitializing the entire codec every time.
+* LZHAM's compressor is no speed demon. It's usually slower than LZMA's, sometimes by a wide (~2x slower or so) margin. In "extreme" parsing mode, it can be many times slower.
+This codec was designed with offline compression in mind.
+* One significant difference between LZMA and LZHAM is how uncompressible files are handled. LZMA usually expands uncompressible files, and its decompressor can bog down and run extremely
+slowly on uncompressible data. LZHAM internally detects when each 512KB block is uncompressible and stores these blocks as uncompressed bytes instead.
+LZHAM's literal decoding is significantly faster than LZMA's, so the more plain literals in the output stream, the faster LZHAM's decompressor runs vs. LZMA's.
+* General advice (applies to LZMA and other codecs too): If you are compressing large amounts of serialized game assets, sort the serialized data by asset type and compress the whole thing as a single large "solid" block of data.
+Don't compress each individual asset, this will kill your ratio and have a higher decompression startup cost. If you need random access, consider compressing the assets lumped
+together into groups of a few hundred kilobytes (or whatever) each.
+* LZHAM is a raw codec. It doesn't include any sort of preprocessing: EXE rel to abs jump transformation, audio predictors, etc. That's up to you
+to do, before compression.
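The tips above say the dictionary size and `m_table_update_rate` used for compression must be supplied again at decompression time, and that storing them is the caller's job. One way to do that is a tiny header written in front of the compressed bytes. The sketch below assumes both values fit in one unsigned byte each (the dictionary size is kept as its log2, which the README bounds at 29; the range of the table update rate is an assumption). The 2-byte layout and the helper names are hypothetical and are not an LZHAM or extlzham format.

```ruby
# Hypothetical 2-byte header: [dict_size_log2, table_update_rate],
# each packed as an unsigned 8-bit integer.
def pack_lzham_params(dict_size_log2, table_update_rate)
  [dict_size_log2, table_update_rate].pack("CC")
end

# Reads the same two bytes back into a hash of decompression settings.
def unpack_lzham_params(header)
  dict_size_log2, table_update_rate = header.unpack("CC")
  { dict_size_log2: dict_size_log2, table_update_rate: table_update_rate }
end
```

A writer would prepend `pack_lzham_params(20, 8)` to its output, and a reader would call `unpack_lzham_params` on the first two bytes before configuring the decompressor.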
+
+<h3>Codec Test App</h3>
+
+lzhamtest_x86/x64 is a simple command line test program that uses the LZHAM codec to compress/decompress single files.
+lzhamtest is not intended as a file archiver or end user tool, it's just a simple testbed.
+
+-- Usage examples:
+
+- Compress single file "source_filename" to "compressed_filename":
+lzhamtest_x64 c source_filename compressed_filename
+
+- Decompress single file "compressed_filename" to "decompressed_filename":
+lzhamtest_x64 d compressed_filename decompressed_filename
+
+- Compress single file "source_filename" to "compressed_filename", then verify the compressed file decompresses properly to the source file:
+lzhamtest_x64 -v c source_filename compressed_filename
+
+- Recursively compress all files under specified directory and verify that each file decompresses properly:
+lzhamtest_x64 -v a c:\source_path
+
+-- Options
+
+- Set dictionary size used during compression to 1MB (2^20):
+lzhamtest_x64 -d20 c source_filename compressed_filename
+
+Valid dictionary sizes are [15,26] for x86, and [15,29] for x64. (See LZHAM_MIN_DICT_SIZE_LOG2, etc. defines in include/lzham.h.)
+The x86 version defaults to 64MB (26), and the x64 version defaults to 256MB (28). I wouldn't recommend setting the dictionary size to
+512MB unless your machine has more than 4GB of physical memory.
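The `-d` option takes the base-2 logarithm of the dictionary size, so `-d20` means 2^20 bytes. Combined with the earlier advice to use the smallest dictionary that makes sense, a value can be picked from the input size as sketched below. The limits are the ones quoted above; the constant and method names are hypothetical and only loosely modelled on the `LZHAM_MIN_DICT_SIZE_LOG2` style defines the README mentions.

```ruby
# Limits quoted in the text: [15,26] for x86 and [15,29] for x64.
LZHAM_MIN_DICT_LOG2 = 15
LZHAM_MAX_DICT_LOG2 = { x86: 26, x64: 29 }

# Smallest log2 dictionary size that covers `input_bytes`,
# clamped to the valid range for the given architecture.
def smallest_dict_log2(input_bytes, arch: :x64)
  max = LZHAM_MAX_DICT_LOG2.fetch(arch)
  log2 = LZHAM_MIN_DICT_LOG2
  log2 += 1 while (1 << log2) < input_bytes && log2 < max
  log2
end
```

A 15KB file therefore stays at the minimum of 15 (32KB), while an input larger than the architecture's maximum is simply clamped to 26 or 29.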
+
+- Set compression level to fastest:
+lzhamtest_x64 -m0 c source_filename compressed_filename
+
+- Set compression level to uber (the default):
+lzhamtest_x64 -m4 c source_filename compressed_filename
+
+- For best possible compression, use -d29 to enable the largest dictionary size (512MB) and the -x option which enables more rigorous (but ~4X slower!) parsing:
+lzhamtest_x64 -d29 -x -m4 c source_filename compressed_filename
+
+See lzhamtest_x86/x64.exe's help text for more command line parameters.
+
+<h3>Compiling LZHAM</h3>
+
+- Linux: Use "cmake ." then "make". The cmake script only supports Linux at the moment. (Sorry, working on build systems is a drag.)
+- OSX/iOS: Use the included XCode project. (NOTE: I haven't merged this over yet. It's coming!)
+- Windows: Use the included VS 2010 project
+
+IMPORTANT: With clang or gcc compile LZHAM with "No strict aliasing" ENABLED: -fno-strict-aliasing
+
+I DO NOT test or develop the codec with strict aliasing:
+* https://lkml.org/lkml/2003/2/26/158
+* http://stackoverflow.com/questions/2958633/gcc-strict-aliasing-and-horror-stories
+
+It might work fine, I don't know yet. This is usually not a problem with MSVC, which defaults to strict aliasing being off.
+
+<h3>ANSI C/C++</h3>
+
+LZHAM supports compiling as plain vanilla ANSI C/C++. To see how the codec configures itself check out lzham_core.h and search for "LZHAM_ANSI_CPLUSPLUS".
+All platform specific stuff (unaligned loads, threading, atomic ops, etc.) should be disabled when this macro is defined. Note, the compressor doesn't use threads
+or atomic operations when built this way so it's going to be pretty slow. (The compressor was built from the ground up to be threaded.)
+
+<h3>Known Problems</h3>
+
+<p>LZHAM's decompressor is like a drag racer that needs time to get up to speed. LZHAM is not intended or optimized to be used on "small" blocks of data (less
+than ~10,000 bytes of *compressed* data on desktops, or around 1,000-5,000 on iOS). If your usage case involves calling the codec over and over with tiny blocks
+then LZMA, LZ4, Deflate, etc. are probably better choices.</p>
+
+<p>The decompressor still takes too long to init vs. LZMA. On iOS the cost is not that bad, but on desktop the cost is high. I have reduced the startup cost vs. the
+alpha but there's still work to do.</p>
+
+<p>The compressor is slower than I would like, and doesn't scale as well as it could. I added a reinit() method to make it initialize faster, but it's not a speed demon.
+My focus has been on ratio and decompression speed.</p>
+
+<p>I use tabs=3 spaces, but I think some actual tabs got in the code. I need to run the sources through ClangFormat or whatever.</p>
+
+<h3>Special Thanks</h3>
+
+<p>Thanks to everyone at the http://encode.ru forums. I read these forums as a lurker before working on LZHAM, and I studied every LZ related
+post I could get my hands on. Especially anything related to LZ optimal parsing, which still seems like a black art. LZHAM was my way of
+learning how to implement optimal parsing (and you can see this if you study the progress I made in the early alphas on Google Code).</p>
+
+<p>Also, thanks to Igor Pavlov, the original creator of LZMA and 7zip, for advancing the state of the art in LZ compression.</p>