bio-velvet_underground 0.0.1
- checksums.yaml +7 -0
- data/.document +5 -0
- data/.gitmodules +3 -0
- data/.travis.yml +13 -0
- data/Gemfile +19 -0
- data/LICENSE.txt +20 -0
- data/README.md +53 -0
- data/Rakefile +51 -0
- data/VERSION +1 -0
- data/ext/bioruby.patch +60 -0
- data/ext/mkrf_conf.rb +50 -0
- data/ext/src/Makefile +125 -0
- data/ext/src/src/allocArray.c +305 -0
- data/ext/src/src/allocArray.h +86 -0
- data/ext/src/src/autoOpen.c +107 -0
- data/ext/src/src/autoOpen.h +18 -0
- data/ext/src/src/binarySequences.c +813 -0
- data/ext/src/src/binarySequences.h +125 -0
- data/ext/src/src/concatenatedGraph.c +233 -0
- data/ext/src/src/concatenatedGraph.h +30 -0
- data/ext/src/src/concatenatedPreGraph.c +262 -0
- data/ext/src/src/concatenatedPreGraph.h +29 -0
- data/ext/src/src/correctedGraph.c +2642 -0
- data/ext/src/src/correctedGraph.h +32 -0
- data/ext/src/src/dfib.c +509 -0
- data/ext/src/src/dfib.h +69 -0
- data/ext/src/src/dfibHeap.c +89 -0
- data/ext/src/src/dfibHeap.h +39 -0
- data/ext/src/src/dfibpriv.h +105 -0
- data/ext/src/src/fib.c +628 -0
- data/ext/src/src/fib.h +78 -0
- data/ext/src/src/fibHeap.c +79 -0
- data/ext/src/src/fibHeap.h +41 -0
- data/ext/src/src/fibpriv.h +110 -0
- data/ext/src/src/globals.h +153 -0
- data/ext/src/src/graph.c +3983 -0
- data/ext/src/src/graph.h +233 -0
- data/ext/src/src/graphReConstruction.c +1472 -0
- data/ext/src/src/graphReConstruction.h +30 -0
- data/ext/src/src/graphStats.c +2167 -0
- data/ext/src/src/graphStats.h +72 -0
- data/ext/src/src/kmer.c +652 -0
- data/ext/src/src/kmer.h +73 -0
- data/ext/src/src/kmerOccurenceTable.c +236 -0
- data/ext/src/src/kmerOccurenceTable.h +44 -0
- data/ext/src/src/kseq.h +223 -0
- data/ext/src/src/locallyCorrectedGraph.c +557 -0
- data/ext/src/src/locallyCorrectedGraph.h +40 -0
- data/ext/src/src/passageMarker.c +677 -0
- data/ext/src/src/passageMarker.h +137 -0
- data/ext/src/src/preGraph.c +1717 -0
- data/ext/src/src/preGraph.h +106 -0
- data/ext/src/src/preGraphConstruction.c +990 -0
- data/ext/src/src/preGraphConstruction.h +26 -0
- data/ext/src/src/readCoherentGraph.c +557 -0
- data/ext/src/src/readCoherentGraph.h +30 -0
- data/ext/src/src/readSet.c +1734 -0
- data/ext/src/src/readSet.h +67 -0
- data/ext/src/src/recycleBin.c +199 -0
- data/ext/src/src/recycleBin.h +58 -0
- data/ext/src/src/roadMap.c +342 -0
- data/ext/src/src/roadMap.h +65 -0
- data/ext/src/src/run.c +318 -0
- data/ext/src/src/run.h +52 -0
- data/ext/src/src/run2.c +712 -0
- data/ext/src/src/scaffold.c +1876 -0
- data/ext/src/src/scaffold.h +64 -0
- data/ext/src/src/shortReadPairs.c +1243 -0
- data/ext/src/src/shortReadPairs.h +32 -0
- data/ext/src/src/splay.c +259 -0
- data/ext/src/src/splay.h +43 -0
- data/ext/src/src/splayTable.c +1315 -0
- data/ext/src/src/splayTable.h +31 -0
- data/ext/src/src/tightString.c +362 -0
- data/ext/src/src/tightString.h +82 -0
- data/ext/src/src/utility.c +199 -0
- data/ext/src/src/utility.h +98 -0
- data/ext/src/third-party/zlib-1.2.3/ChangeLog +855 -0
- data/ext/src/third-party/zlib-1.2.3/FAQ +339 -0
- data/ext/src/third-party/zlib-1.2.3/INDEX +51 -0
- data/ext/src/third-party/zlib-1.2.3/Makefile +154 -0
- data/ext/src/third-party/zlib-1.2.3/Makefile.in +154 -0
- data/ext/src/third-party/zlib-1.2.3/README +125 -0
- data/ext/src/third-party/zlib-1.2.3/adler32.c +149 -0
- data/ext/src/third-party/zlib-1.2.3/algorithm.txt +209 -0
- data/ext/src/third-party/zlib-1.2.3/amiga/Makefile.pup +66 -0
- data/ext/src/third-party/zlib-1.2.3/amiga/Makefile.sas +65 -0
- data/ext/src/third-party/zlib-1.2.3/as400/bndsrc +132 -0
- data/ext/src/third-party/zlib-1.2.3/as400/compile.clp +123 -0
- data/ext/src/third-party/zlib-1.2.3/as400/readme.txt +111 -0
- data/ext/src/third-party/zlib-1.2.3/as400/zlib.inc +331 -0
- data/ext/src/third-party/zlib-1.2.3/compress.c +79 -0
- data/ext/src/third-party/zlib-1.2.3/configure +459 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/README.contrib +71 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/buffer_demo.adb +106 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/mtest.adb +156 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/read.adb +156 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/readme.txt +65 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/test.adb +463 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib-streams.adb +225 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib-streams.ads +114 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib-thin.adb +141 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib-thin.ads +450 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib.adb +701 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib.ads +328 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/ada/zlib.gpr +20 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/asm586/README.586 +43 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/asm586/match.S +364 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/asm686/README.686 +34 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/asm686/match.S +329 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/blast/Makefile +8 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/blast/README +4 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/blast/blast.c +444 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/blast/blast.h +71 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/blast/test.pk +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/blast/test.txt +1 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/delphi/ZLib.pas +557 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/delphi/ZLibConst.pas +11 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/delphi/readme.txt +76 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/delphi/zlibd32.mak +93 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib.build +33 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib.chm +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib.sln +21 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/AssemblyInfo.cs +58 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/ChecksumImpl.cs +202 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/CircularBuffer.cs +83 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/CodecBase.cs +198 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/Deflater.cs +106 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/DotZLib.cs +288 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/DotZLib.csproj +141 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/GZipStream.cs +301 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/Inflater.cs +105 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/DotZLib/UnitTests.cs +274 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/LICENSE_1_0.txt +23 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/dotzlib/readme.txt +58 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/README +1 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/infback9.c +608 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/infback9.h +37 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/inffix9.h +107 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/inflate9.h +47 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/inftree9.c +323 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/infback9/inftree9.h +55 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/inflate86/inffas86.c +1157 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/inflate86/inffast.S +1368 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream/test.cpp +24 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream/zfstream.cpp +329 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream/zfstream.h +128 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream2/zstream.h +307 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream2/zstream_test.cpp +25 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream3/README +35 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream3/TODO +17 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream3/test.cc +50 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream3/zfstream.cc +479 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/iostream3/zfstream.h +466 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masm686/match.asm +413 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/bld_ml64.bat +2 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/gvmat64.asm +513 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/gvmat64.obj +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/inffas8664.c +186 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/inffasx64.asm +392 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/inffasx64.obj +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx64/readme.txt +28 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/bld_ml32.bat +2 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/gvmat32.asm +972 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/gvmat32.obj +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/gvmat32c.c +62 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/inffas32.asm +1083 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/inffas32.obj +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/mkasm.bat +3 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/masmx86/readme.txt +21 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/ChangeLogUnzip +67 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/Makefile +25 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/crypt.h +132 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/ioapi.c +177 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/ioapi.h +75 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/iowin32.c +270 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/iowin32.h +21 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/miniunz.c +585 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/minizip.c +420 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/mztools.c +281 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/mztools.h +31 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/unzip.c +1598 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/unzip.h +354 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/zip.c +1219 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/minizip/zip.h +235 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/pascal/example.pas +599 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/pascal/readme.txt +76 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/pascal/zlibd32.mak +93 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/pascal/zlibpas.pas +236 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/puff/Makefile +8 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/puff/README +63 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/puff/puff.c +837 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/puff/puff.h +31 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/puff/zeros.raw +0 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/testzlib/testzlib.c +275 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/testzlib/testzlib.txt +10 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/untgz/Makefile +14 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/untgz/Makefile.msc +17 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/untgz/untgz.c +674 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/readme.txt +73 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/miniunz.vcproj +126 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/minizip.vcproj +126 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/testzlib.vcproj +126 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/zlib.rc +32 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/zlibstat.vcproj +246 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/zlibvc.def +92 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/zlibvc.sln +78 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc7/zlibvc.vcproj +445 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/miniunz.vcproj +566 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/minizip.vcproj +563 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/testzlib.vcproj +948 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/testzlibdll.vcproj +567 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/zlib.rc +32 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/zlibstat.vcproj +870 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/zlibvc.def +92 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/zlibvc.sln +144 -0
- data/ext/src/third-party/zlib-1.2.3/contrib/vstudio/vc8/zlibvc.vcproj +1219 -0
- data/ext/src/third-party/zlib-1.2.3/crc32.c +423 -0
- data/ext/src/third-party/zlib-1.2.3/crc32.h +441 -0
- data/ext/src/third-party/zlib-1.2.3/deflate.c +1736 -0
- data/ext/src/third-party/zlib-1.2.3/deflate.h +331 -0
- data/ext/src/third-party/zlib-1.2.3/example.c +565 -0
- data/ext/src/third-party/zlib-1.2.3/examples/README.examples +42 -0
- data/ext/src/third-party/zlib-1.2.3/examples/fitblk.c +233 -0
- data/ext/src/third-party/zlib-1.2.3/examples/gun.c +693 -0
- data/ext/src/third-party/zlib-1.2.3/examples/gzappend.c +500 -0
- data/ext/src/third-party/zlib-1.2.3/examples/gzjoin.c +448 -0
- data/ext/src/third-party/zlib-1.2.3/examples/gzlog.c +413 -0
- data/ext/src/third-party/zlib-1.2.3/examples/gzlog.h +58 -0
- data/ext/src/third-party/zlib-1.2.3/examples/zlib_how.html +523 -0
- data/ext/src/third-party/zlib-1.2.3/examples/zpipe.c +191 -0
- data/ext/src/third-party/zlib-1.2.3/examples/zran.c +404 -0
- data/ext/src/third-party/zlib-1.2.3/gzio.c +1026 -0
- data/ext/src/third-party/zlib-1.2.3/infback.c +623 -0
- data/ext/src/third-party/zlib-1.2.3/inffast.c +318 -0
- data/ext/src/third-party/zlib-1.2.3/inffast.h +11 -0
- data/ext/src/third-party/zlib-1.2.3/inffixed.h +94 -0
- data/ext/src/third-party/zlib-1.2.3/inflate.c +1368 -0
- data/ext/src/third-party/zlib-1.2.3/inflate.h +115 -0
- data/ext/src/third-party/zlib-1.2.3/inftrees.c +329 -0
- data/ext/src/third-party/zlib-1.2.3/inftrees.h +55 -0
- data/ext/src/third-party/zlib-1.2.3/make_vms.com +461 -0
- data/ext/src/third-party/zlib-1.2.3/minigzip.c +322 -0
- data/ext/src/third-party/zlib-1.2.3/msdos/Makefile.bor +109 -0
- data/ext/src/third-party/zlib-1.2.3/msdos/Makefile.dj2 +104 -0
- data/ext/src/third-party/zlib-1.2.3/msdos/Makefile.emx +69 -0
- data/ext/src/third-party/zlib-1.2.3/msdos/Makefile.msc +106 -0
- data/ext/src/third-party/zlib-1.2.3/msdos/Makefile.tc +94 -0
- data/ext/src/third-party/zlib-1.2.3/old/Makefile.riscos +151 -0
- data/ext/src/third-party/zlib-1.2.3/old/README +3 -0
- data/ext/src/third-party/zlib-1.2.3/old/descrip.mms +48 -0
- data/ext/src/third-party/zlib-1.2.3/old/os2/Makefile.os2 +136 -0
- data/ext/src/third-party/zlib-1.2.3/old/os2/zlib.def +51 -0
- data/ext/src/third-party/zlib-1.2.3/old/visual-basic.txt +160 -0
- data/ext/src/third-party/zlib-1.2.3/old/zlib.html +971 -0
- data/ext/src/third-party/zlib-1.2.3/projects/README.projects +41 -0
- data/ext/src/third-party/zlib-1.2.3/projects/visualc6/README.txt +73 -0
- data/ext/src/third-party/zlib-1.2.3/projects/visualc6/example.dsp +278 -0
- data/ext/src/third-party/zlib-1.2.3/projects/visualc6/minigzip.dsp +278 -0
- data/ext/src/third-party/zlib-1.2.3/projects/visualc6/zlib.dsp +609 -0
- data/ext/src/third-party/zlib-1.2.3/projects/visualc6/zlib.dsw +59 -0
- data/ext/src/third-party/zlib-1.2.3/qnx/package.qpg +141 -0
- data/ext/src/third-party/zlib-1.2.3/trees.c +1219 -0
- data/ext/src/third-party/zlib-1.2.3/trees.h +128 -0
- data/ext/src/third-party/zlib-1.2.3/uncompr.c +61 -0
- data/ext/src/third-party/zlib-1.2.3/win32/DLL_FAQ.txt +397 -0
- data/ext/src/third-party/zlib-1.2.3/win32/Makefile.bor +107 -0
- data/ext/src/third-party/zlib-1.2.3/win32/Makefile.emx +69 -0
- data/ext/src/third-party/zlib-1.2.3/win32/Makefile.gcc +141 -0
- data/ext/src/third-party/zlib-1.2.3/win32/Makefile.msc +126 -0
- data/ext/src/third-party/zlib-1.2.3/win32/VisualC.txt +3 -0
- data/ext/src/third-party/zlib-1.2.3/win32/zlib.def +60 -0
- data/ext/src/third-party/zlib-1.2.3/win32/zlib1.rc +39 -0
- data/ext/src/third-party/zlib-1.2.3/zconf.h +332 -0
- data/ext/src/third-party/zlib-1.2.3/zconf.in.h +332 -0
- data/ext/src/third-party/zlib-1.2.3/zlib.3 +159 -0
- data/ext/src/third-party/zlib-1.2.3/zlib.h +1357 -0
- data/ext/src/third-party/zlib-1.2.3/zutil.c +318 -0
- data/ext/src/third-party/zlib-1.2.3/zutil.h +269 -0
- data/lib/bio-velvet_underground.rb +12 -0
- data/lib/bio-velvet_underground/external/VERSION +1 -0
- data/lib/bio-velvet_underground/velvet_underground.rb +72 -0
- data/spec/binary_sequence_store_spec.rb +27 -0
- data/spec/data/1/CnyUnifiedSeq +0 -0
- data/spec/spec_helper.rb +31 -0
- metadata +456 -0
@@ -0,0 +1,154 @@
+# Makefile for zlib
+# Copyright (C) 1995-2005 Jean-loup Gailly.
+# For conditions of distribution and use, see copyright notice in zlib.h
+
+# To compile and test, type:
+# ./configure; make test
+# The call of configure is optional if you don't have special requirements
+# If you wish to build zlib as a shared library, use: ./configure -s
+
+# To use the asm code, type:
+# cp contrib/asm?86/match.S ./match.S
+# make LOC=-DASMV OBJA=match.o
+
+# To install /usr/local/lib/libz.* and /usr/local/include/zlib.h, type:
+# make install
+# To install in $HOME instead of /usr/local, use:
+# make install prefix=$HOME
+
+CC=cc
+
+CFLAGS=-O
+#CFLAGS=-O -DMAX_WBITS=14 -DMAX_MEM_LEVEL=7
+#CFLAGS=-g -DDEBUG
+#CFLAGS=-O3 -Wall -Wwrite-strings -Wpointer-arith -Wconversion \
+# -Wstrict-prototypes -Wmissing-prototypes
+
+LDFLAGS=libz.a
+LDSHARED=$(CC)
+CPP=$(CC) -E
+
+LIBS=libz.a
+SHAREDLIB=libz.so
+SHAREDLIBV=libz.so.1.2.3
+SHAREDLIBM=libz.so.1
+
+AR=ar rc
+RANLIB=ranlib
+TAR=tar
+SHELL=/bin/sh
+EXE=
+
+prefix = /usr/local
+exec_prefix = ${prefix}
+libdir = ${exec_prefix}/lib
+includedir = ${prefix}/include
+mandir = ${prefix}/share/man
+man3dir = ${mandir}/man3
+
+OBJS = adler32.o compress.o crc32.o gzio.o uncompr.o deflate.o trees.o \
+       zutil.o inflate.o infback.o inftrees.o inffast.o
+
+OBJA =
+# to use the asm code: make OBJA=match.o
+
+TEST_OBJS = example.o minigzip.o
+
+all: example$(EXE) minigzip$(EXE)
+
+check: test
+test: all
+	@LD_LIBRARY_PATH=.:$(LD_LIBRARY_PATH) ; export LD_LIBRARY_PATH; \
+	echo hello world | ./minigzip | ./minigzip -d || \
+	  echo ' *** minigzip test FAILED ***' ; \
+	if ./example; then \
+	  echo ' *** zlib test OK ***'; \
+	else \
+	  echo ' *** zlib test FAILED ***'; \
+	fi
+
+libz.a: $(OBJS) $(OBJA)
+	$(AR) $@ $(OBJS) $(OBJA)
+	-@ ($(RANLIB) $@ || true) >/dev/null 2>&1
+
+match.o: match.S
+	$(CPP) match.S > _match.s
+	$(CC) -c _match.s
+	mv _match.o match.o
+	rm -f _match.s
+
+$(SHAREDLIBV): $(OBJS)
+	$(LDSHARED) -o $@ $(OBJS)
+	rm -f $(SHAREDLIB) $(SHAREDLIBM)
+	ln -s $@ $(SHAREDLIB)
+	ln -s $@ $(SHAREDLIBM)
+
+example$(EXE): example.o $(LIBS)
+	$(CC) $(CFLAGS) -o $@ example.o $(LDFLAGS)
+
+minigzip$(EXE): minigzip.o $(LIBS)
+	$(CC) $(CFLAGS) -o $@ minigzip.o $(LDFLAGS)
+
+install: $(LIBS)
+	-@if [ ! -d $(exec_prefix) ]; then mkdir -p $(exec_prefix); fi
+	-@if [ ! -d $(includedir) ]; then mkdir -p $(includedir); fi
+	-@if [ ! -d $(libdir) ]; then mkdir -p $(libdir); fi
+	-@if [ ! -d $(man3dir) ]; then mkdir -p $(man3dir); fi
+	cp zlib.h zconf.h $(includedir)
+	chmod 644 $(includedir)/zlib.h $(includedir)/zconf.h
+	cp $(LIBS) $(libdir)
+	cd $(libdir); chmod 755 $(LIBS)
+	-@(cd $(libdir); $(RANLIB) libz.a || true) >/dev/null 2>&1
+	cd $(libdir); if test -f $(SHAREDLIBV); then \
+	  rm -f $(SHAREDLIB) $(SHAREDLIBM); \
+	  ln -s $(SHAREDLIBV) $(SHAREDLIB); \
+	  ln -s $(SHAREDLIBV) $(SHAREDLIBM); \
+	  (ldconfig || true) >/dev/null 2>&1; \
+	fi
+	cp zlib.3 $(man3dir)
+	chmod 644 $(man3dir)/zlib.3
+# The ranlib in install is needed on NeXTSTEP which checks file times
+# ldconfig is for Linux
+
+uninstall:
+	cd $(includedir); \
+	cd $(libdir); rm -f libz.a; \
+	if test -f $(SHAREDLIBV); then \
+	  rm -f $(SHAREDLIBV) $(SHAREDLIB) $(SHAREDLIBM); \
+	fi
+	cd $(man3dir); rm -f zlib.3
+
+mostlyclean: clean
+clean:
+	rm -f *.o *~ example$(EXE) minigzip$(EXE) \
+	   libz.* foo.gz so_locations \
+	   _match.s maketree contrib/infback9/*.o
+
+maintainer-clean: distclean
+distclean: clean
+	cp -p Makefile.in Makefile
+	cp -p zconf.in.h zconf.h
+	rm -f .DS_Store
+
+tags:
+	etags *.[ch]
+
+depend:
+	makedepend -- $(CFLAGS) -- *.[ch]
+
+# DO NOT DELETE THIS LINE -- make depend depends on it.
+
+adler32.o: zlib.h zconf.h
+compress.o: zlib.h zconf.h
+crc32.o: crc32.h zlib.h zconf.h
+deflate.o: deflate.h zutil.h zlib.h zconf.h
+example.o: zlib.h zconf.h
+gzio.o: zutil.h zlib.h zconf.h
+inffast.o: zutil.h zlib.h zconf.h inftrees.h inflate.h inffast.h
+inflate.o: zutil.h zlib.h zconf.h inftrees.h inflate.h inffast.h
+infback.o: zutil.h zlib.h zconf.h inftrees.h inflate.h inffast.h
+inftrees.o: zutil.h zlib.h zconf.h inftrees.h
+minigzip.o: zlib.h zconf.h
+trees.o: deflate.h zutil.h zlib.h zconf.h trees.h
+uncompr.o: zlib.h zconf.h
+zutil.o: zutil.h zlib.h zconf.h
@@ -0,0 +1,125 @@
+ZLIB DATA COMPRESSION LIBRARY
+
+zlib 1.2.3 is a general purpose data compression library. All the code is
+thread safe. The data format used by the zlib library is described by RFCs
+(Request for Comments) 1950 to 1952 in the files
+http://www.ietf.org/rfc/rfc1950.txt (zlib format), rfc1951.txt (deflate format)
+and rfc1952.txt (gzip format). These documents are also available in other
+formats from ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html
+
+All functions of the compression library are documented in the file zlib.h
+(volunteer to write man pages welcome, contact zlib@gzip.org). A usage example
+of the library is given in the file example.c which also tests that the library
+is working correctly. Another example is given in the file minigzip.c. The
+compression library itself is composed of all source files except example.c and
+minigzip.c.
+
+To compile all files and run the test program, follow the instructions given at
+the top of Makefile. In short "make test; make install" should work for most
+machines. For Unix: "./configure; make test; make install". For MSDOS, use one
+of the special makefiles such as Makefile.msc. For VMS, use make_vms.com.
+
+Questions about zlib should be sent to <zlib@gzip.org>, or to Gilles Vollant
+<info@winimage.com> for the Windows DLL version. The zlib home page is
+http://www.zlib.org or http://www.gzip.org/zlib/ Before reporting a problem,
+please check this site to verify that you have the latest version of zlib;
+otherwise get the latest version and check whether the problem still exists or
+not.
+
+PLEASE read the zlib FAQ http://www.gzip.org/zlib/zlib_faq.html before asking
+for help.
+
+Mark Nelson <markn@ieee.org> wrote an article about zlib for the Jan. 1997
+issue of Dr. Dobb's Journal; a copy of the article is available in
+http://dogma.net/markn/articles/zlibtool/zlibtool.htm
+
+The changes made in version 1.2.3 are documented in the file ChangeLog.
+
+Unsupported third party contributions are provided in directory "contrib".
+
+A Java implementation of zlib is available in the Java Development Kit
+http://java.sun.com/j2se/1.4.2/docs/api/java/util/zip/package-summary.html
+See the zlib home page http://www.zlib.org for details.
+
+A Perl interface to zlib written by Paul Marquess <pmqs@cpan.org> is in the
+CPAN (Comprehensive Perl Archive Network) sites
+http://www.cpan.org/modules/by-module/Compress/
+
+A Python interface to zlib written by A.M. Kuchling <amk@amk.ca> is
+available in Python 1.5 and later versions, see
+http://www.python.org/doc/lib/module-zlib.html
+
+A zlib binding for TCL written by Andreas Kupries <a.kupries@westend.com> is
+availlable at http://www.oche.de/~akupries/soft/trf/trf_zip.html
+
+An experimental package to read and write files in .zip format, written on top
+of zlib by Gilles Vollant <info@winimage.com>, is available in the
+contrib/minizip directory of zlib.
+
+
+Notes for some targets:
+
+- For Windows DLL versions, please see win32/DLL_FAQ.txt
+
+- For 64-bit Irix, deflate.c must be compiled without any optimization. With
+  -O, one libpng test fails. The test works in 32 bit mode (with the -n32
+  compiler flag). The compiler bug has been reported to SGI.
+
+- zlib doesn't work with gcc 2.6.3 on a DEC 3000/300LX under OSF/1 2.1 it works
+  when compiled with cc.
+
+- On Digital Unix 4.0D (formely OSF/1) on AlphaServer, the cc option -std1 is
+  necessary to get gzprintf working correctly. This is done by configure.
+
+- zlib doesn't work on HP-UX 9.05 with some versions of /bin/cc. It works with
+  other compilers. Use "make test" to check your compiler.
+
+- gzdopen is not supported on RISCOS, BEOS and by some Mac compilers.
+
+- For PalmOs, see http://palmzlib.sourceforge.net/
+
+- When building a shared, i.e. dynamic library on Mac OS X, the library must be
+  installed before testing (do "make install" before "make test"), since the
+  library location is specified in the library.
+
+
+Acknowledgments:
+
+  The deflate format used by zlib was defined by Phil Katz. The deflate
+  and zlib specifications were written by L. Peter Deutsch. Thanks to all the
+  people who reported problems and suggested various improvements in zlib;
+  they are too numerous to cite here.
+
+Copyright notice:
+
+ (C) 1995-2004 Jean-loup Gailly and Mark Adler
+
+  This software is provided 'as-is', without any express or implied
+  warranty. In no event will the authors be held liable for any damages
+  arising from the use of this software.
+
+  Permission is granted to anyone to use this software for any purpose,
+  including commercial applications, and to alter it and redistribute it
+  freely, subject to the following restrictions:
+
+  1. The origin of this software must not be misrepresented; you must not
+     claim that you wrote the original software. If you use this software
+     in a product, an acknowledgment in the product documentation would be
+     appreciated but is not required.
+  2. Altered source versions must be plainly marked as such, and must not be
+     misrepresented as being the original software.
+  3. This notice may not be removed or altered from any source distribution.
+
+  Jean-loup Gailly        Mark Adler
+  jloup@gzip.org          madler@alumni.caltech.edu
+
+If you use the zlib library in a product, we would appreciate *not*
+receiving lengthy legal documents to sign. The sources are provided
+for free but without warranty of any kind. The library has been
+entirely written by Jean-loup Gailly and Mark Adler; it does not
+include third-party code.
+
+If you redistribute modified sources, we would appreciate that you include
+in the file ChangeLog history information documenting your changes. Please
+read the FAQ for more information on the distribution of modified source
+versions.
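Reviewer note on the Adler-32 checksum: the README above points to RFC 1950 for the zlib format, and that format carries an Adler-32 checksum, which is what adler32.c in this diff computes. The block below is a minimal reference sketch, not the zlib code. It takes a modulo after every byte instead of deferring it, and it brute-forces the bound quoted in the NMAX comment (`255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1`). The names `adler32_reference`, `worst_case_sum`, `largest_safe_block`, and `ADLER_BASE` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same modulus as BASE in adler32.c: the largest prime below 65536. */
#define ADLER_BASE 65521u

/* Illustrative byte-at-a-time Adler-32; reduces after every byte, so it is
 * slow but easy to check against an unrolled, deferred-modulo version. */
uint32_t adler32_reference(const unsigned char *buf, size_t len)
{
    uint32_t a = 1; /* low 16 bits: 1 + sum of bytes */
    uint32_t b = 0; /* high 16 bits: sum of the running values of a */

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_BASE;
        b = (b + a) % ADLER_BASE;
    }
    return (b << 16) | a;
}

/* Largest value the second sum can reach after n bytes of 0xff when both
 * sums start at BASE-1 and no modulo is taken in between. */
uint64_t worst_case_sum(uint64_t n)
{
    return 255u * n * (n + 1) / 2 + (n + 1) * (ADLER_BASE - 1);
}

/* Largest block length whose worst-case sum still fits in 32 bits. */
uint64_t largest_safe_block(void)
{
    uint64_t n = 0;
    while (worst_case_sum(n + 1) <= UINT32_MAX)
        n++;
    return n;
}
```

Under these assumptions, `largest_safe_block()` returns 5552, which agrees with the `NMAX` value in the diff, and `adler32_reference((const unsigned char *)"Wikipedia", 9)` returns `0x11E60398`. A per-byte modulo like this is a convenient cross-check when reviewing a faster variant that reduces only once per block.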
@@ -0,0 +1,149 @@
+/* adler32.c -- compute the Adler-32 checksum of a data stream
+ * Copyright (C) 1995-2004 Mark Adler
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+/* @(#) $Id$ */
+
+#define ZLIB_INTERNAL
+#include "zlib.h"
+
+#define BASE 65521UL /* largest prime smaller than 65536 */
+#define NMAX 5552
+/* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */
+
+#define DO1(buf,i) {adler += (buf)[i]; sum2 += adler;}
+#define DO2(buf,i) DO1(buf,i); DO1(buf,i+1);
+#define DO4(buf,i) DO2(buf,i); DO2(buf,i+2);
+#define DO8(buf,i) DO4(buf,i); DO4(buf,i+4);
+#define DO16(buf) DO8(buf,0); DO8(buf,8);
+
+/* use NO_DIVIDE if your processor does not do division in hardware */
+#ifdef NO_DIVIDE
+# define MOD(a) \
+    do { \
+        if (a >= (BASE << 16)) a -= (BASE << 16); \
+        if (a >= (BASE << 15)) a -= (BASE << 15); \
+        if (a >= (BASE << 14)) a -= (BASE << 14); \
+        if (a >= (BASE << 13)) a -= (BASE << 13); \
+        if (a >= (BASE << 12)) a -= (BASE << 12); \
+        if (a >= (BASE << 11)) a -= (BASE << 11); \
+        if (a >= (BASE << 10)) a -= (BASE << 10); \
+        if (a >= (BASE << 9)) a -= (BASE << 9); \
+        if (a >= (BASE << 8)) a -= (BASE << 8); \
+        if (a >= (BASE << 7)) a -= (BASE << 7); \
+        if (a >= (BASE << 6)) a -= (BASE << 6); \
+        if (a >= (BASE << 5)) a -= (BASE << 5); \
+        if (a >= (BASE << 4)) a -= (BASE << 4); \
+        if (a >= (BASE << 3)) a -= (BASE << 3); \
+        if (a >= (BASE << 2)) a -= (BASE << 2); \
+        if (a >= (BASE << 1)) a -= (BASE << 1); \
+        if (a >= BASE) a -= BASE; \
+    } while (0)
+# define MOD4(a) \
+    do { \
+        if (a >= (BASE << 4)) a -= (BASE << 4); \
+        if (a >= (BASE << 3)) a -= (BASE << 3); \
+        if (a >= (BASE << 2)) a -= (BASE << 2); \
+        if (a >= (BASE << 1)) a -= (BASE << 1); \
+        if (a >= BASE) a -= BASE; \
+    } while (0)
+#else
+# define MOD(a) a %= BASE
+# define MOD4(a) a %= BASE
+#endif
+
+/* ========================================================================= */
+uLong ZEXPORT adler32(adler, buf, len)
+    uLong adler;
+    const Bytef *buf;
+    uInt len;
+{
+    unsigned long sum2;
+    unsigned n;
+
+    /* split Adler-32 into component sums */
+    sum2 = (adler >> 16) & 0xffff;
+    adler &= 0xffff;
+
+    /* in case user likes doing a byte at a time, keep it fast */
+    if (len == 1) {
+        adler += buf[0];
+        if (adler >= BASE)
+            adler -= BASE;
+        sum2 += adler;
+        if (sum2 >= BASE)
+            sum2 -= BASE;
+        return adler | (sum2 << 16);
+    }
+
+    /* initial Adler-32 value (deferred check for len == 1 speed) */
+    if (buf == Z_NULL)
+        return 1L;
+
+    /* in case short lengths are provided, keep it somewhat fast */
+    if (len < 16) {
+        while (len--) {
+            adler += *buf++;
+            sum2 += adler;
+        }
+        if (adler >= BASE)
+            adler -= BASE;
+        MOD4(sum2); /* only added so many BASE's */
+        return adler | (sum2 << 16);
+    }
+
+    /* do length NMAX blocks -- requires just one modulo operation */
+    while (len >= NMAX) {
+        len -= NMAX;
+        n = NMAX / 16; /* NMAX is divisible by 16 */
+        do {
+            DO16(buf); /* 16 sums unrolled */
+            buf += 16;
+        } while (--n);
+        MOD(adler);
+        MOD(sum2);
+    }
+
+    /* do remaining bytes (less than NMAX, still just one modulo) */
+    if (len) { /* avoid modulos if none remaining */
+        while (len >= 16) {
+            len -= 16;
+            DO16(buf);
+            buf += 16;
+        }
+        while (len--) {
+            adler += *buf++;
+            sum2 += adler;
+        }
+        MOD(adler);
+        MOD(sum2);
+    }
+
+    /* return recombined sums */
+    return adler | (sum2 << 16);
+}
+
+/* ========================================================================= */
+uLong ZEXPORT adler32_combine(adler1, adler2, len2)
+    uLong adler1;
+    uLong adler2;
+    z_off_t len2;
+{
+    unsigned long sum1;
+    unsigned long sum2;
+    unsigned rem;
+
+    /* the derivation of this formula is left as an exercise for the reader */
|
138
|
+
rem = (unsigned)(len2 % BASE);
|
139
|
+
sum1 = adler1 & 0xffff;
|
140
|
+
sum2 = rem * sum1;
|
141
|
+
MOD(sum2);
|
142
|
+
sum1 += (adler2 & 0xffff) + BASE - 1;
|
143
|
+
sum2 += ((adler1 >> 16) & 0xffff) + ((adler2 >> 16) & 0xffff) + BASE - rem;
|
144
|
+
if (sum1 >= BASE) sum1 -= BASE;
|
145
|
+
if (sum1 >= BASE) sum1 -= BASE;
|
146
|
+
if (sum2 >= (BASE << 1)) sum2 -= (BASE << 1);
|
147
|
+
if (sum2 >= BASE) sum2 -= BASE;
|
148
|
+
return sum1 | (sum2 << 16);
|
149
|
+
}
|
@@ -0,0 +1,209 @@
|
|
1
|
+
1. Compression algorithm (deflate)
|
2
|
+
|
3
|
+
The deflation algorithm used by gzip (also zip and zlib) is a variation of
|
4
|
+
LZ77 (Lempel-Ziv 1977, see reference below). It finds duplicated strings in
|
5
|
+
the input data. The second occurrence of a string is replaced by a
|
6
|
+
pointer to the previous string, in the form of a pair (distance,
|
7
|
+
length). Distances are limited to 32K bytes, and lengths are limited
|
8
|
+
to 258 bytes. When a string does not occur anywhere in the previous
|
9
|
+
32K bytes, it is emitted as a sequence of literal bytes. (In this
|
10
|
+
description, `string' must be taken as an arbitrary sequence of bytes,
|
11
|
+
and is not restricted to printable characters.)
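As a rough illustration of the (distance, length) idea above, here is a minimal brute-force sketch. It is not how deflate() searches (the hash chains described below do that job); the names lz_token and lz77_tokenize and the minimum match length of 3 are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW    32768  /* maximum distance, as stated in the text */
#define MAX_MATCH 258    /* maximum length, as stated in the text */
#define MIN_MATCH 3      /* assumed: shortest match worth a pointer */

/* Hypothetical token: a literal byte (len == 0) or a (dist, len) pair. */
typedef struct {
    unsigned short dist;
    unsigned short len;
    unsigned char  lit;
} lz_token;

/* Brute-force tokenizer: at each position, try every earlier position
 * inside the window and keep the longest match. Returns the number of
 * tokens written to out (at most cap). */
size_t lz77_tokenize(const unsigned char *in, size_t n,
                     lz_token *out, size_t cap)
{
    size_t pos = 0, count = 0;

    assert(in != NULL || n == 0);
    while (pos < n && count < cap) {
        size_t best_len = 0, best_dist = 0;
        size_t start = pos > WINDOW ? pos - WINDOW : 0;

        for (size_t cand = start; cand < pos; cand++) {
            size_t len = 0;
            /* A match may run past pos (overlap); that is allowed. */
            while (len < MAX_MATCH && pos + len < n &&
                   in[cand + len] == in[pos + len])
                len++;
            /* ">=" prefers the nearer candidate among equal lengths. */
            if (len >= MIN_MATCH && len >= best_len) {
                best_len = len;
                best_dist = pos - cand;
            }
        }
        if (best_len >= MIN_MATCH) {
            out[count].dist = (unsigned short)best_dist;
            out[count].len = (unsigned short)best_len;
            out[count].lit = 0;
            pos += best_len;
        } else {
            out[count].dist = 0;
            out[count].len = 0;
            out[count].lit = in[pos];
            pos++;
        }
        count++;
    }
    return count;
}
```

For the nine bytes "abcabcabc" this sketch emits three literals followed by one pair with distance 3 and length 6.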
|
12
|
+
|
13
|
+
Literals or match lengths are compressed with one Huffman tree, and
|
14
|
+
match distances are compressed with another tree. The trees are stored
|
15
|
+
in a compact form at the start of each block. The blocks can have any
|
16
|
+
size (except that the compressed data for one block must fit in
|
17
|
+
available memory). A block is terminated when deflate() determines that
|
18
|
+
it would be useful to start another block with fresh trees. (This is
|
19
|
+
somewhat similar to the behavior of LZW-based _compress_.)
|
20
|
+
|
21
|
+
Duplicated strings are found using a hash table. All input strings of
|
22
|
+
length 3 are inserted in the hash table. A hash index is computed for
|
23
|
+
the next 3 bytes. If the hash chain for this index is not empty, all
|
24
|
+
strings in the chain are compared with the current input string, and
|
25
|
+
the longest match is selected.
|
26
|
+
|
27
|
+
The hash chains are searched starting with the most recent strings, to
|
28
|
+
favor small distances and thus take advantage of the Huffman encoding.
|
29
|
+
The hash chains are singly linked. There are no deletions from the
|
30
|
+
hash chains; the algorithm simply discards matches that are too old.
|
31
|
+
|
32
|
+
To avoid a worst-case situation, very long hash chains are arbitrarily
|
33
|
+
truncated at a certain length, determined by a runtime option (level
|
34
|
+
parameter of deflateInit). So deflate() does not always find the longest
|
35
|
+
possible match but generally finds a match which is long enough.
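The hash chain search with a chain limit can be sketched as follows. This is a simplified illustration, not the code in deflate.c: the matcher struct, the hash function, the table sizes, and the max_chain parameter are all assumptions for this example, and the input is assumed to fit in one 32K window so no match is ever too old.

```c
#include <assert.h>

#define HASH_BITS 12
#define HASH_SIZE (1u << HASH_BITS)
#define WSIZE     32768u
#define NIL       (-1)

/* Hypothetical matcher state. head[h] is the most recent position whose
 * 3-byte string hashes to h; prev[p] is the previous position with the
 * same hash, so each chain is singly linked, newest first. */
typedef struct {
    const unsigned char *buf;
    int n;
    int head[HASH_SIZE];
    int prev[WSIZE];
} matcher;

/* Assumed toy hash of three bytes; deflate.c uses a different one. */
static unsigned hash3(const unsigned char *p)
{
    return (unsigned)((p[0] << 8) ^ (p[1] << 4) ^ p[2]) & (HASH_SIZE - 1);
}

void matcher_init(matcher *m, const unsigned char *buf, int n)
{
    assert(n >= 0 && (unsigned)n <= WSIZE);
    m->buf = buf;
    m->n = n;
    for (unsigned i = 0; i < HASH_SIZE; i++)
        m->head[i] = NIL;
}

/* Insert the 3-byte string starting at pos at the front of its chain. */
void matcher_insert(matcher *m, int pos)
{
    unsigned h;

    if (pos + 3 > m->n)
        return;
    h = hash3(m->buf + pos);
    m->prev[pos] = m->head[h];
    m->head[h] = pos;
}

/* Compare at most max_chain candidates, most recent first. Returns the
 * best match length (0 if shorter than 3) and stores its distance. */
int matcher_longest(const matcher *m, int pos, int max_chain, int *dist)
{
    int best = 0;

    if (pos + 3 > m->n)
        return 0;
    for (int cand = m->head[hash3(m->buf + pos)];
         cand != NIL && max_chain-- > 0;
         cand = m->prev[cand]) {
        int len = 0;
        while (len < 258 && pos + len < m->n &&
               m->buf[cand + len] == m->buf[pos + len])
            len++;
        if (len > best) {
            best = len;
            *dist = pos - cand;
        }
    }
    return best >= 3 ? best : 0;
}
```

With the input "abcdeXabcdYabcde" and positions 0 through 10 inserted, a search at position 11 limited to one candidate settles for the recent 4-byte match at distance 5, while a longer chain limit also reaches the older 5-byte match at distance 11. That is the speed versus ratio trade the level parameter controls.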
|
36
|
+
|
37
|
+
deflate() also defers the selection of matches with a lazy evaluation
|
38
|
+
mechanism. After a match of length N has been found, deflate() searches for
|
39
|
+
a longer match at the next input byte. If a longer match is found, the
|
40
|
+
previous match is truncated to a length of one (thus producing a single
|
41
|
+
literal byte) and the process of lazy evaluation begins again. Otherwise,
|
42
|
+
the original match is kept, and the next match search is attempted only N
|
43
|
+
steps later.
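A minimal sketch of the lazy evaluation step, under stated assumptions: longest_match here is a brute-force stand-in for the hash chain search, the token type and function names are made up for this example, and the input is assumed to be shorter than the 32K window.

```c
#include <assert.h>

#define MIN_MATCH 3
#define MAX_MATCH 258

/* Hypothetical token: len == 0 means a literal byte. */
typedef struct {
    int dist;
    int len;
    unsigned char lit;
} token;

/* Brute-force stand-in for the hash chain search, nearest first. */
static int longest_match(const unsigned char *in, int n, int pos, int *dist)
{
    int best = 0;

    for (int cand = pos - 1; cand >= 0; cand--) {
        int len = 0;
        while (len < MAX_MATCH && pos + len < n &&
               in[cand + len] == in[pos + len])
            len++;
        if (len > best) {
            best = len;
            *dist = pos - cand;
        }
    }
    return best >= MIN_MATCH ? best : 0;
}

/* Lazy matching: after finding a match of length N at pos, look for a
 * longer one at pos + 1. If there is one, emit a single literal and let
 * the next iteration repeat the process from pos + 1. */
int lazy_tokenize(const unsigned char *in, int n, token *out, int cap)
{
    int pos = 0, count = 0;

    assert(n >= 0);
    while (pos < n && count < cap) {
        int dist = 0, next_dist = 0;
        int len = longest_match(in, n, pos, &dist);

        if (len > 0 && pos + 1 < n &&
            longest_match(in, n, pos + 1, &next_dist) > len)
            len = 0;   /* truncate the current match to one literal */

        if (len > 0) {
            out[count].dist = dist;
            out[count].len = len;
            out[count].lit = 0;
            pos += len;          /* next search only N steps later */
        } else {
            out[count].dist = 0;
            out[count].len = 0;
            out[count].lit = in[pos];
            pos++;
        }
        count++;
    }
    return count;
}
```

On "bcdeXabcYabcde" the 3-byte match "abc" found at offset 9 is dropped in favor of the literal 'a' followed by the 4-byte match "bcde" at distance 10.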
|
44
|
+
|
45
|
+
The lazy match evaluation is also subject to a runtime parameter. If
|
46
|
+
the current match is long enough, deflate() reduces the search for a longer
|
47
|
+
match, thus speeding up the whole process. If compression ratio is more
|
48
|
+
important than speed, deflate() attempts a complete second search even if
|
49
|
+
the first match is already long enough.
|
50
|
+
|
51
|
+
The lazy match evaluation is not performed for the fastest compression
|
52
|
+
modes (level parameter 1 to 3). For these fast modes, new strings
|
53
|
+
are inserted in the hash table only when no match was found, or
|
54
|
+
when the match is not too long. This degrades the compression ratio
|
55
|
+
but saves time since there are both fewer insertions and fewer searches.
|
56
|
+
|
57
|
+
|
58
|
+
2. Decompression algorithm (inflate)
|
59
|
+
|
60
|
+
2.1 Introduction
|
61
|
+
|
62
|
+
The key question is how to represent a Huffman code (or any prefix code) so
|
63
|
+
that you can decode fast. The most important characteristic is that shorter
|
64
|
+
codes are much more common than longer codes, so pay attention to decoding the
|
65
|
+
short codes fast, and let the long codes take longer to decode.
|
66
|
+
|
67
|
+
inflate() sets up a first level table that covers some number of bits of
|
68
|
+
input less than the length of the longest code. It gets that many bits from the
|
69
|
+
stream, and looks it up in the table. The table will tell if the next
|
70
|
+
code is that many bits or less and how many, and if it is, it will tell
|
71
|
+
the value, else it will point to the next level table for which inflate()
|
72
|
+
grabs more bits and tries to decode a longer code.
|
73
|
+
|
74
|
+
How many bits to make the first lookup is a tradeoff between the time it
|
75
|
+
takes to decode and the time it takes to build the table. If building the
|
76
|
+
table took no time (and if you had infinite memory), then there would only
|
77
|
+
be a first level table to cover all the way to the longest code. However,
|
78
|
+
building the table ends up taking a lot longer for more bits since short
|
79
|
+
codes are replicated many times in such a table. What inflate() does is
|
80
|
+
simply to make the number of bits in the first table a variable, and then
|
81
|
+
to set that variable for the maximum speed.
|
82
|
+
|
83
|
+
For inflate, which has 286 possible codes for the literal/length tree, the size
|
84
|
+
of the first table is nine bits. Also the distance trees have 30 possible
|
85
|
+
values, and the size of the first table is six bits. Note that for each of
|
86
|
+
those cases, the table ended up one bit longer than the ``average'' code
|
87
|
+
length, i.e. the code length of an approximately flat code which would be a
|
88
|
+
little more than eight bits for 286 symbols and a little less than five bits
|
89
|
+
for 30 symbols.
|
90
|
+
|
91
|
+
|
92
|
+
2.2 More details on the inflate table lookup
|
93
|
+
|
94
|
+
Ok, you want to know what this cleverly obfuscated inflate tree actually
|
95
|
+
looks like. You are correct that it's not a Huffman tree. It is simply a
|
96
|
+
lookup table for the first, let's say, nine bits of a Huffman symbol. The
|
97
|
+
symbol could be as short as one bit or as long as 15 bits. If a particular
|
98
|
+
symbol is shorter than nine bits, then that symbol's translation is duplicated
|
99
|
+
in all those entries that start with that symbol's bits. For example, if the
|
100
|
+
symbol is four bits, then it's duplicated 32 times in a nine-bit table. If a
|
101
|
+
symbol is nine bits long, it appears in the table once.
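The duplication rule can be sketched like this. The entry layout and the function name fill_code are assumptions for illustration, and the table is indexed with the first code bit as the most significant index bit, following the notation of this text rather than the bit order used inside inftrees.c.

```c
#include <assert.h>

#define ROOT_BITS 9   /* first-table width used in the example above */

/* Hypothetical table entry: decoded symbol and bits to consume. */
typedef struct {
    unsigned char sym;
    unsigned char bits;
} entry;

/* Fill every slot of a 2^ROOT_BITS table whose index starts with the
 * given code. A code of length len occupies 2^(ROOT_BITS - len) slots.
 * Returns the number of slots written. */
unsigned fill_code(entry *table, unsigned code, unsigned len,
                   unsigned char sym)
{
    unsigned copies, first;

    assert(len >= 1 && len <= ROOT_BITS);
    assert(code < (1u << len));
    copies = 1u << (ROOT_BITS - len);
    first = code << (ROOT_BITS - len);
    for (unsigned i = 0; i < copies; i++) {
        table[first + i].sym = sym;
        table[first + i].bits = (unsigned char)len;
    }
    return copies;
}
```

A four-bit code therefore fills 32 of the 512 slots, and a nine-bit code fills exactly one.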
|
102
|
+
|
103
|
+
If the symbol is longer than nine bits, then that entry in the table points
|
104
|
+
to another similar table for the remaining bits. Again, there are duplicated
|
105
|
+
entries as needed. The idea is that most of the time the symbol will be short
|
106
|
+
and there will only be one table lookup. (That's the whole idea behind data
|
107
|
+
compression in the first place.) For the less frequent long symbols, there
|
108
|
+
will be two lookups. If you had a compression method with really long
|
109
|
+
symbols, you could have as many levels of lookups as is efficient. For
|
110
|
+
inflate, two is enough.
|
111
|
+
|
112
|
+
So a table entry either points to another table (in which case nine bits in
|
113
|
+
the above example are gobbled), or it contains the translation for the symbol
|
114
|
+
and the number of bits to gobble. Then you start again with the next
|
115
|
+
ungobbled bit.
|
116
|
+
|
117
|
+
You may wonder: why not just have one lookup table for however many bits the
|
118
|
+
longest symbol is? The reason is that if you do that, you end up spending
|
119
|
+
more time filling in duplicate symbol entries than you do actually decoding.
|
120
|
+
At least for deflate's output, which generates new trees every several tens of
|
121
|
+
kbytes. You can imagine that filling in a 2^15 entry table for a 15-bit code
|
122
|
+
would take too long if you're only decoding several thousand symbols. At the
|
123
|
+
other extreme, you could make a new table for every bit in the code. In fact,
|
124
|
+
that's essentially a Huffman tree. But then you spend too much time
|
125
|
+
traversing the tree while decoding, even for short symbols.
|
126
|
+
|
127
|
+
So the number of bits for the first lookup table is a tradeoff between the time to
|
128
|
+
fill out the table vs. the time spent looking at the second level and above of
|
129
|
+
the table.
|
130
|
+
|
131
|
+
Here is an example, scaled down:
|
132
|
+
|
133
|
+
The code being decoded, with 10 symbols, from 1 to 6 bits long:
|
134
|
+
|
135
|
+
A: 0
|
136
|
+
B: 10
|
137
|
+
C: 1100
|
138
|
+
D: 11010
|
139
|
+
E: 11011
|
140
|
+
F: 11100
|
141
|
+
G: 11101
|
142
|
+
H: 11110
|
143
|
+
I: 111110
|
144
|
+
J: 111111
|
145
|
+
|
146
|
+
Let's make the first table three bits long (eight entries):
|
147
|
+
|
148
|
+
000: A,1
|
149
|
+
001: A,1
|
150
|
+
010: A,1
|
151
|
+
011: A,1
|
152
|
+
100: B,2
|
153
|
+
101: B,2
|
154
|
+
110: -> table X (gobble 3 bits)
|
155
|
+
111: -> table Y (gobble 3 bits)
|
156
|
+
|
157
|
+
Each entry is what the bits decode as and how many bits that is, i.e. how
|
158
|
+
many bits to gobble. Or the entry points to another table, with the number of
|
159
|
+
bits to gobble implicit in the size of the table.
|
160
|
+
|
161
|
+
Table X is two bits long since the longest code starting with 110 is five bits
|
162
|
+
long:
|
163
|
+
|
164
|
+
00: C,1
|
165
|
+
01: C,1
|
166
|
+
10: D,2
|
167
|
+
11: E,2
|
168
|
+
|
169
|
+
Table Y is three bits long since the longest code starting with 111 is six
|
170
|
+
bits long:
|
171
|
+
|
172
|
+
000: F,2
|
173
|
+
001: F,2
|
174
|
+
010: G,2
|
175
|
+
011: G,2
|
176
|
+
100: H,2
|
177
|
+
101: H,2
|
178
|
+
110: I,3
|
179
|
+
111: J,3
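The three tables above can be written out and walked directly. The sketch below is only an illustration of the two-level lookup: the entry struct, the decode function, and the use of a string of '0' and '1' characters as the bit stream are assumptions for this example, and the input is assumed to be a well-formed sequence of codes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical entry: either a symbol with the number of bits to gobble
 * (sub == NULL), or a link to a sub-table indexed by `bits` bits. */
typedef struct entry {
    char sym;
    int bits;
    const struct entry *sub;
} entry;

static const entry table_x[4] = {
    {'C', 1, 0}, {'C', 1, 0}, {'D', 2, 0}, {'E', 2, 0}
};
static const entry table_y[8] = {
    {'F', 2, 0}, {'F', 2, 0}, {'G', 2, 0}, {'G', 2, 0},
    {'H', 2, 0}, {'H', 2, 0}, {'I', 3, 0}, {'J', 3, 0}
};
static const entry root[8] = {
    {'A', 1, 0}, {'A', 1, 0}, {'A', 1, 0}, {'A', 1, 0},
    {'B', 2, 0}, {'B', 2, 0}, {0, 2, table_x}, {0, 3, table_y}
};

/* Peek `count` bits at `pos`, first bit most significant; bits past the
 * end of the input read as zero. */
static unsigned peek(const char *bits, size_t n, size_t pos, int count)
{
    unsigned v = 0;

    for (int i = 0; i < count; i++)
        v = (v << 1) | (unsigned)(pos + i < n && bits[pos + i] == '1');
    return v;
}

/* Decode up to cap symbols into out (not NUL-terminated). Returns the
 * symbol count and stores the number of table lookups in *lookups. */
size_t decode(const char *bits, char *out, size_t cap, int *lookups)
{
    size_t n = strlen(bits), pos = 0, count = 0;

    assert(lookups != NULL);
    *lookups = 0;
    while (pos < n && count < cap) {
        const entry *e = &root[peek(bits, n, pos, 3)];
        (*lookups)++;
        if (e->sub) {
            pos += 3;   /* gobble the three root bits */
            e = &e->sub[peek(bits, n, pos, e->bits)];
            (*lookups)++;
        }
        out[count++] = e->sym;
        pos += e->bits; /* gobble only the bits this symbol used */
    }
    return count;
}
```

Decoding the bits 0 10 1100 11011 111110 (written without spaces) yields A, B, C, E, I using one lookup each for A and B and two lookups for each of the others.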
|
180
|
+
|
181
|
+
So what we have here are three tables with a total of 20 entries that had to
|
182
|
+
be constructed. That's compared to 64 entries for a single table. Or
|
183
|
+
compared to 16 entries for a Huffman tree (six two entry tables and one four
|
184
|
+
entry table). Assuming that the code ideally represents the probability of
|
185
|
+
the symbols, it takes on the average 1.25 lookups per symbol. That's compared
|
186
|
+
to one lookup for the single table, or 1.66 lookups per symbol for the
|
187
|
+
Huffman tree.
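The entry counts and the 1.25 figure can be checked with a few lines of integer arithmetic. This sketch assumes, as the text does, that a symbol with an n-bit code has probability 2^-n; it also assumes that two levels are enough for the chosen root size. The names example_len and lookups_x64 are made up for the example.

```c
#include <assert.h>

/* Code lengths of symbols A..J from the example above. */
static const int example_len[10] = {1, 2, 4, 5, 5, 5, 5, 5, 6, 6};

/* Expected lookups per symbol, scaled by 64 to stay in integers. A
 * symbol costs one lookup if its code fits in the root table and two
 * otherwise (valid only when a second level always suffices). */
int lookups_x64(const int *len, int nsyms, int root_bits)
{
    int total = 0;

    for (int i = 0; i < nsyms; i++) {
        int weight;

        assert(len[i] >= 1 && len[i] <= 6);
        weight = 64 >> len[i];   /* probability 2^-len times 64 */
        total += weight * (len[i] <= root_bits ? 1 : 2);
    }
    return total;
}
```

With a three-bit root table the result is 80, that is 80/64 = 1.25 lookups per symbol; with a six-bit single table it is 64, or exactly one lookup. The table sizes follow the same way: 8 + 4 + 8 = 20 entries for the three tables, against 2^6 = 64 for the single table.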
|
188
|
+
|
189
|
+
There, I think that gives you a picture of what's going on. For inflate, the
|
190
|
+
meaning of a particular symbol is often more than just a letter. It can be a
|
191
|
+
byte (a "literal"), or it can be either a length or a distance which
|
192
|
+
indicates a base value and a number of bits to fetch after the code that is
|
193
|
+
added to the base value. Or it might be the special end-of-block code. The
|
194
|
+
data structures created in inftrees.c try to encode all that information
|
195
|
+
compactly in the tables.
|
196
|
+
|
197
|
+
|
198
|
+
Jean-loup Gailly Mark Adler
|
199
|
+
jloup@gzip.org madler@alumni.caltech.edu
|
200
|
+
|
201
|
+
|
202
|
+
References:
|
203
|
+
|
204
|
+
[LZ77] Ziv J., Lempel A., ``A Universal Algorithm for Sequential Data
|
205
|
+
Compression,'' IEEE Transactions on Information Theory, Vol. 23, No. 3,
|
206
|
+
pp. 337-343.
|
207
|
+
|
208
|
+
``DEFLATE Compressed Data Format Specification'' available in
|
209
|
+
http://www.ietf.org/rfc/rfc1951.txt
|