omnizip 0.3.1
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/.rubocop.yml +32 -0
- data/.rubocop_todo.yml +754 -0
- data/COPYING +502 -0
- data/Gemfile +17 -0
- data/LICENSE +12 -0
- data/README.adoc +1045 -0
- data/Rakefile +12 -0
- data/benchmark/README.md +260 -0
- data/benchmark/benchmark_suite.rb +125 -0
- data/benchmark/compression_bench.rb +181 -0
- data/benchmark/filter_bench.rb +180 -0
- data/benchmark/models/benchmark_result.rb +59 -0
- data/benchmark/models/comparison_result.rb +69 -0
- data/benchmark/profile_suite.rb +167 -0
- data/benchmark/reporter.rb +150 -0
- data/benchmark/run_benchmarks.rb +66 -0
- data/benchmark/test_data.rb +137 -0
- data/config/formats/rar3_spec.yml +91 -0
- data/config/formats/rar5_spec.yml +102 -0
- data/docs/.github/workflows/docs.yml +142 -0
- data/docs/.gitignore +21 -0
- data/docs/.lychee.toml +67 -0
- data/docs/Gemfile +13 -0
- data/docs/RAR_WRITE_SUPPORT.md +26 -0
- data/docs/README.md +101 -0
- data/docs/_config.yml +112 -0
- data/docs/assets/logo.svg +1 -0
- data/docs/assets/omnizip-logo.pdf +1540 -11
- data/docs/comparison/feature-matrix.adoc +694 -0
- data/docs/comparison/index.adoc +113 -0
- data/docs/comparison/vs-7zip.adoc +309 -0
- data/docs/comparison/vs-peazip.adoc +77 -0
- data/docs/comparison/vs-rubyzip.adoc +342 -0
- data/docs/comparison/vs-winrar.adoc +100 -0
- data/docs/compatibility.adoc +579 -0
- data/docs/concepts/index.adoc +129 -0
- data/docs/developer/architecture.adoc +256 -0
- data/docs/developer/contributing.adoc +158 -0
- data/docs/developer/index.adoc +25 -0
- data/docs/developer/testing.adoc +212 -0
- data/docs/getting-started/basic-usage.adoc +271 -0
- data/docs/getting-started/index.adoc +42 -0
- data/docs/getting-started/installation.adoc +138 -0
- data/docs/getting-started/quick-start.adoc +185 -0
- data/docs/getting-started/your-first-archive.adoc +218 -0
- data/docs/guides/advanced-features/encryption.adoc +300 -0
- data/docs/guides/advanced-features/index.adoc +49 -0
- data/docs/guides/advanced-features/parallel-processing.adoc +246 -0
- data/docs/guides/advanced-features/progress-tracking.adoc +320 -0
- data/docs/guides/advanced-features/streaming.adoc +212 -0
- data/docs/guides/archive-formats/gzip-format.adoc +107 -0
- data/docs/guides/archive-formats/index.adoc +130 -0
- data/docs/guides/archive-formats/rar-format.adoc +104 -0
- data/docs/guides/archive-formats/rar5.adoc +521 -0
- data/docs/guides/archive-formats/seven-zip-format.adoc +35 -0
- data/docs/guides/archive-formats/tar-format.adoc +106 -0
- data/docs/guides/archive-formats/xz-format.adoc +118 -0
- data/docs/guides/archive-formats/zip-format.adoc +35 -0
- data/docs/guides/compression-algorithms/bzip2.adoc +113 -0
- data/docs/guides/compression-algorithms/deflate.adoc +319 -0
- data/docs/guides/compression-algorithms/index.adoc +190 -0
- data/docs/guides/compression-algorithms/lzma.adoc +398 -0
- data/docs/guides/compression-algorithms/lzma2.adoc +327 -0
- data/docs/guides/compression-algorithms/ppmd.adoc +316 -0
- data/docs/guides/compression-algorithms/zstandard.adoc +361 -0
- data/docs/guides/creating-archives.adoc +354 -0
- data/docs/guides/extracting-archives.adoc +53 -0
- data/docs/guides/format-conversion.adoc +64 -0
- data/docs/guides/index.adoc +49 -0
- data/docs/guides/migration-rubyzip.adoc +217 -0
- data/docs/guides/parity-archives.adoc +605 -0
- data/docs/guides/performance-tuning.adoc +88 -0
- data/docs/index.adoc +218 -0
- data/docs/lychee.toml +67 -0
- data/docs/reference/api/overview.adoc +188 -0
- data/docs/reference/cli/compress-command.adoc +114 -0
- data/docs/reference/cli/overview.adoc +140 -0
- data/docs/reference/index.adoc +26 -0
- data/docs/resources/faq.adoc +185 -0
- data/docs/resources/quick-reference.adoc +222 -0
- data/docs/troubleshooting/index.adoc +208 -0
- data/examples/api_comparison.rb +205 -0
- data/examples/deflate64_example.rb +96 -0
- data/examples/par2_demo.rb +121 -0
- data/examples/quick_start_native.rb +150 -0
- data/examples/quick_start_rubyzip.rb +115 -0
- data/examples/rubyzip_compatibility_demo.rb +194 -0
- data/exe/omnizip +27 -0
- data/lib/omnizip/algorithm.rb +130 -0
- data/lib/omnizip/algorithm_registry.rb +86 -0
- data/lib/omnizip/algorithms/.keep +0 -0
- data/lib/omnizip/algorithms/bzip2/bwt.rb +225 -0
- data/lib/omnizip/algorithms/bzip2/decoder.rb +193 -0
- data/lib/omnizip/algorithms/bzip2/encoder.rb +237 -0
- data/lib/omnizip/algorithms/bzip2/huffman.rb +206 -0
- data/lib/omnizip/algorithms/bzip2/mtf.rb +101 -0
- data/lib/omnizip/algorithms/bzip2/rle.rb +151 -0
- data/lib/omnizip/algorithms/bzip2.rb +130 -0
- data/lib/omnizip/algorithms/deflate/constants.rb +28 -0
- data/lib/omnizip/algorithms/deflate/decoder.rb +38 -0
- data/lib/omnizip/algorithms/deflate/encoder.rb +46 -0
- data/lib/omnizip/algorithms/deflate.rb +128 -0
- data/lib/omnizip/algorithms/deflate64/constants.rb +45 -0
- data/lib/omnizip/algorithms/deflate64/decoder.rb +153 -0
- data/lib/omnizip/algorithms/deflate64/encoder.rb +98 -0
- data/lib/omnizip/algorithms/deflate64/huffman_coder.rb +354 -0
- data/lib/omnizip/algorithms/deflate64/lz77_encoder.rb +142 -0
- data/lib/omnizip/algorithms/deflate64.rb +109 -0
- data/lib/omnizip/algorithms/lzma/bit_model.rb +120 -0
- data/lib/omnizip/algorithms/lzma/constants.rb +112 -0
- data/lib/omnizip/algorithms/lzma/decoder.rb +148 -0
- data/lib/omnizip/algorithms/lzma/dictionary.rb +69 -0
- data/lib/omnizip/algorithms/lzma/distance_coder.rb +415 -0
- data/lib/omnizip/algorithms/lzma/encoder.rb +142 -0
- data/lib/omnizip/algorithms/lzma/length_coder.rb +260 -0
- data/lib/omnizip/algorithms/lzma/literal_decoder.rb +320 -0
- data/lib/omnizip/algorithms/lzma/literal_encoder.rb +210 -0
- data/lib/omnizip/algorithms/lzma/lzip_decoder.rb +341 -0
- data/lib/omnizip/algorithms/lzma/lzma_alone_decoder.rb +192 -0
- data/lib/omnizip/algorithms/lzma/lzma_state.rb +128 -0
- data/lib/omnizip/algorithms/lzma/match.rb +32 -0
- data/lib/omnizip/algorithms/lzma/match_finder.rb +205 -0
- data/lib/omnizip/algorithms/lzma/match_finder_config.rb +142 -0
- data/lib/omnizip/algorithms/lzma/match_finder_factory.rb +88 -0
- data/lib/omnizip/algorithms/lzma/optimal_encoder.rb +130 -0
- data/lib/omnizip/algorithms/lzma/probability_models.rb +72 -0
- data/lib/omnizip/algorithms/lzma/range_coder.rb +85 -0
- data/lib/omnizip/algorithms/lzma/range_decoder.rb +434 -0
- data/lib/omnizip/algorithms/lzma/range_encoder.rb +194 -0
- data/lib/omnizip/algorithms/lzma/state.rb +127 -0
- data/lib/omnizip/algorithms/lzma/xz_buffered_range_encoder.rb +325 -0
- data/lib/omnizip/algorithms/lzma/xz_encoder.rb +426 -0
- data/lib/omnizip/algorithms/lzma/xz_encoder_fast.rb +645 -0
- data/lib/omnizip/algorithms/lzma/xz_match_finder_adapter.rb +227 -0
- data/lib/omnizip/algorithms/lzma/xz_price_calculator.rb +169 -0
- data/lib/omnizip/algorithms/lzma/xz_probability_models.rb +261 -0
- data/lib/omnizip/algorithms/lzma/xz_range_encoder.rb +223 -0
- data/lib/omnizip/algorithms/lzma/xz_range_encoder_exact.rb +331 -0
- data/lib/omnizip/algorithms/lzma/xz_state.rb +116 -0
- data/lib/omnizip/algorithms/lzma/xz_utils_decoder.rb +2055 -0
- data/lib/omnizip/algorithms/lzma.rb +238 -0
- data/lib/omnizip/algorithms/lzma2/chunk_manager.rb +182 -0
- data/lib/omnizip/algorithms/lzma2/constants.rb +41 -0
- data/lib/omnizip/algorithms/lzma2/encoder.rb +147 -0
- data/lib/omnizip/algorithms/lzma2/lzma2_chunk.rb +161 -0
- data/lib/omnizip/algorithms/lzma2/properties.rb +179 -0
- data/lib/omnizip/algorithms/lzma2/simple_lzma2_encoder.rb +127 -0
- data/lib/omnizip/algorithms/lzma2/xz_encoder_adapter.rb +85 -0
- data/lib/omnizip/algorithms/lzma2.rb +141 -0
- data/lib/omnizip/algorithms/ppmd7/constants.rb +74 -0
- data/lib/omnizip/algorithms/ppmd7/context.rb +154 -0
- data/lib/omnizip/algorithms/ppmd7/decoder.rb +126 -0
- data/lib/omnizip/algorithms/ppmd7/encoder.rb +163 -0
- data/lib/omnizip/algorithms/ppmd7/model.rb +248 -0
- data/lib/omnizip/algorithms/ppmd7/symbol_state.rb +57 -0
- data/lib/omnizip/algorithms/ppmd7.rb +116 -0
- data/lib/omnizip/algorithms/ppmd8/constants.rb +61 -0
- data/lib/omnizip/algorithms/ppmd8/context.rb +34 -0
- data/lib/omnizip/algorithms/ppmd8/decoder.rb +107 -0
- data/lib/omnizip/algorithms/ppmd8/encoder.rb +138 -0
- data/lib/omnizip/algorithms/ppmd8/model.rb +250 -0
- data/lib/omnizip/algorithms/ppmd8/restoration_method.rb +78 -0
- data/lib/omnizip/algorithms/ppmd8.rb +82 -0
- data/lib/omnizip/algorithms/ppmd_base.rb +138 -0
- data/lib/omnizip/algorithms/sevenzip_lzma2.rb +123 -0
- data/lib/omnizip/algorithms/xz_lzma2.rb +118 -0
- data/lib/omnizip/algorithms/zstandard/constants.rb +25 -0
- data/lib/omnizip/algorithms/zstandard/decoder.rb +46 -0
- data/lib/omnizip/algorithms/zstandard/encoder.rb +51 -0
- data/lib/omnizip/algorithms/zstandard.rb +138 -0
- data/lib/omnizip/buffer/memory_archive.rb +251 -0
- data/lib/omnizip/buffer/memory_extractor.rb +224 -0
- data/lib/omnizip/buffer.rb +176 -0
- data/lib/omnizip/checksum_registry.rb +114 -0
- data/lib/omnizip/checksums/crc32.rb +100 -0
- data/lib/omnizip/checksums/crc64.rb +101 -0
- data/lib/omnizip/checksums/crc_base.rb +158 -0
- data/lib/omnizip/checksums/verifier.rb +131 -0
- data/lib/omnizip/chunked/memory_manager.rb +194 -0
- data/lib/omnizip/chunked/reader.rb +78 -0
- data/lib/omnizip/chunked/writer.rb +120 -0
- data/lib/omnizip/chunked.rb +129 -0
- data/lib/omnizip/cli/output_formatter.rb +104 -0
- data/lib/omnizip/cli.rb +572 -0
- data/lib/omnizip/commands/.keep +0 -0
- data/lib/omnizip/commands/archive_create_command.rb +427 -0
- data/lib/omnizip/commands/archive_extract_command.rb +272 -0
- data/lib/omnizip/commands/archive_list_command.rb +218 -0
- data/lib/omnizip/commands/archive_repair_command.rb +131 -0
- data/lib/omnizip/commands/archive_verify_command.rb +117 -0
- data/lib/omnizip/commands/compress_command.rb +117 -0
- data/lib/omnizip/commands/decompress_command.rb +120 -0
- data/lib/omnizip/commands/list_command.rb +53 -0
- data/lib/omnizip/commands/metadata_command.rb +153 -0
- data/lib/omnizip/commands/parity_create_command.rb +122 -0
- data/lib/omnizip/commands/parity_repair_command.rb +122 -0
- data/lib/omnizip/commands/parity_verify_command.rb +124 -0
- data/lib/omnizip/commands/profile_list_command.rb +56 -0
- data/lib/omnizip/commands/profile_show_command.rb +44 -0
- data/lib/omnizip/convenience.rb +359 -0
- data/lib/omnizip/converter/conversion_registry.rb +49 -0
- data/lib/omnizip/converter/conversion_strategy.rb +121 -0
- data/lib/omnizip/converter/seven_zip_to_zip_strategy.rb +97 -0
- data/lib/omnizip/converter/zip_to_seven_zip_strategy.rb +112 -0
- data/lib/omnizip/converter.rb +105 -0
- data/lib/omnizip/crypto/aes256/cipher.rb +100 -0
- data/lib/omnizip/crypto/aes256/constants.rb +28 -0
- data/lib/omnizip/crypto/aes256/key_derivation.rb +101 -0
- data/lib/omnizip/crypto/aes256.rb +102 -0
- data/lib/omnizip/error.rb +106 -0
- data/lib/omnizip/eta/exponential_smoothing_estimator.rb +98 -0
- data/lib/omnizip/eta/moving_average_estimator.rb +99 -0
- data/lib/omnizip/eta/rate_calculator.rb +104 -0
- data/lib/omnizip/eta/sample_history.rb +143 -0
- data/lib/omnizip/eta/time_estimator.rb +106 -0
- data/lib/omnizip/eta.rb +63 -0
- data/lib/omnizip/extraction/filter_chain.rb +177 -0
- data/lib/omnizip/extraction/glob_pattern.rb +140 -0
- data/lib/omnizip/extraction/pattern_matcher.rb +70 -0
- data/lib/omnizip/extraction/predicate_pattern.rb +52 -0
- data/lib/omnizip/extraction/regex_pattern.rb +50 -0
- data/lib/omnizip/extraction/selective_extractor.rb +240 -0
- data/lib/omnizip/extraction.rb +111 -0
- data/lib/omnizip/file_type/mime_classifier.rb +144 -0
- data/lib/omnizip/file_type.rb +113 -0
- data/lib/omnizip/filter.rb +139 -0
- data/lib/omnizip/filter_pipeline.rb +108 -0
- data/lib/omnizip/filter_registry.rb +166 -0
- data/lib/omnizip/filters/bcj.rb +279 -0
- data/lib/omnizip/filters/bcj2/constants.rb +53 -0
- data/lib/omnizip/filters/bcj2/decoder.rb +200 -0
- data/lib/omnizip/filters/bcj2/encoder.rb +61 -0
- data/lib/omnizip/filters/bcj2/stream_data.rb +93 -0
- data/lib/omnizip/filters/bcj2.rb +99 -0
- data/lib/omnizip/filters/bcj_arm.rb +176 -0
- data/lib/omnizip/filters/bcj_arm64.rb +244 -0
- data/lib/omnizip/filters/bcj_ia64.rb +196 -0
- data/lib/omnizip/filters/bcj_ppc.rb +190 -0
- data/lib/omnizip/filters/bcj_sparc.rb +176 -0
- data/lib/omnizip/filters/bcj_x86.rb +193 -0
- data/lib/omnizip/filters/delta.rb +196 -0
- data/lib/omnizip/filters/filter_base.rb +72 -0
- data/lib/omnizip/filters/registry.rb +123 -0
- data/lib/omnizip/filters/xz_delta.rb +258 -0
- data/lib/omnizip/format_detector.rb +162 -0
- data/lib/omnizip/format_registry.rb +59 -0
- data/lib/omnizip/formats/.keep +0 -0
- data/lib/omnizip/formats/bzip2_file.rb +172 -0
- data/lib/omnizip/formats/cpio/constants.rb +55 -0
- data/lib/omnizip/formats/cpio/entry.rb +385 -0
- data/lib/omnizip/formats/cpio/reader.rb +196 -0
- data/lib/omnizip/formats/cpio/writer.rb +234 -0
- data/lib/omnizip/formats/cpio.rb +140 -0
- data/lib/omnizip/formats/format_spec_loader.rb +230 -0
- data/lib/omnizip/formats/gzip.rb +238 -0
- data/lib/omnizip/formats/iso/directory_builder.rb +297 -0
- data/lib/omnizip/formats/iso/directory_record.rb +152 -0
- data/lib/omnizip/formats/iso/joliet.rb +204 -0
- data/lib/omnizip/formats/iso/path_table.rb +125 -0
- data/lib/omnizip/formats/iso/reader.rb +197 -0
- data/lib/omnizip/formats/iso/rock_ridge.rb +349 -0
- data/lib/omnizip/formats/iso/volume_builder.rb +320 -0
- data/lib/omnizip/formats/iso/volume_descriptor.rb +168 -0
- data/lib/omnizip/formats/iso/writer.rb +530 -0
- data/lib/omnizip/formats/iso.rb +140 -0
- data/lib/omnizip/formats/lzip.rb +175 -0
- data/lib/omnizip/formats/lzma_alone.rb +171 -0
- data/lib/omnizip/formats/rar/archive_repairer.rb +243 -0
- data/lib/omnizip/formats/rar/archive_verifier.rb +195 -0
- data/lib/omnizip/formats/rar/block_parser.rb +243 -0
- data/lib/omnizip/formats/rar/compression/bit_stream.rb +180 -0
- data/lib/omnizip/formats/rar/compression/dispatcher.rb +217 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/decoder.rb +216 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/encoder.rb +158 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/huffman_builder.rb +217 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/huffman_coder.rb +189 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/match_finder.rb +135 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/sliding_window.rb +165 -0
- data/lib/omnizip/formats/rar/compression/ppmd/context.rb +105 -0
- data/lib/omnizip/formats/rar/compression/ppmd/decoder.rb +219 -0
- data/lib/omnizip/formats/rar/compression/ppmd/encoder.rb +262 -0
- data/lib/omnizip/formats/rar/compression_method_registry.rb +106 -0
- data/lib/omnizip/formats/rar/constants.rb +82 -0
- data/lib/omnizip/formats/rar/decompressor.rb +238 -0
- data/lib/omnizip/formats/rar/external_writer.rb +312 -0
- data/lib/omnizip/formats/rar/header.rb +192 -0
- data/lib/omnizip/formats/rar/license_validator.rb +109 -0
- data/lib/omnizip/formats/rar/models/rar_archive.rb +77 -0
- data/lib/omnizip/formats/rar/models/rar_entry.rb +65 -0
- data/lib/omnizip/formats/rar/models/rar_volume.rb +56 -0
- data/lib/omnizip/formats/rar/parity_handler.rb +292 -0
- data/lib/omnizip/formats/rar/rar5/compression/lzma.rb +202 -0
- data/lib/omnizip/formats/rar/rar5/compression/lzss.rb +578 -0
- data/lib/omnizip/formats/rar/rar5/compression/store.rb +60 -0
- data/lib/omnizip/formats/rar/rar5/crc32.rb +39 -0
- data/lib/omnizip/formats/rar/rar5/encryption/aes256_cbc.rb +97 -0
- data/lib/omnizip/formats/rar/rar5/encryption/encryption_header.rb +114 -0
- data/lib/omnizip/formats/rar/rar5/encryption/encryption_manager.rb +166 -0
- data/lib/omnizip/formats/rar/rar5/encryption/key_derivation.rb +97 -0
- data/lib/omnizip/formats/rar/rar5/header.rb +187 -0
- data/lib/omnizip/formats/rar/rar5/models/encryption_options.rb +74 -0
- data/lib/omnizip/formats/rar/rar5/models/recovery_options.rb +63 -0
- data/lib/omnizip/formats/rar/rar5/models/solid_options.rb +63 -0
- data/lib/omnizip/formats/rar/rar5/models/volume_options.rb +74 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/ARCHITECTURE.md +290 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/volume_manager.rb +264 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/volume_splitter.rb +155 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/volume_writer.rb +194 -0
- data/lib/omnizip/formats/rar/rar5/solid/solid_encoder.rb +109 -0
- data/lib/omnizip/formats/rar/rar5/solid/solid_manager.rb +142 -0
- data/lib/omnizip/formats/rar/rar5/solid/solid_stream.rb +121 -0
- data/lib/omnizip/formats/rar/rar5/vint.rb +65 -0
- data/lib/omnizip/formats/rar/rar5/writer.rb +466 -0
- data/lib/omnizip/formats/rar/rar_format_base.rb +241 -0
- data/lib/omnizip/formats/rar/reader.rb +366 -0
- data/lib/omnizip/formats/rar/recovery_record.rb +245 -0
- data/lib/omnizip/formats/rar/volume_manager.rb +168 -0
- data/lib/omnizip/formats/rar/writer.rb +431 -0
- data/lib/omnizip/formats/rar.rb +205 -0
- data/lib/omnizip/formats/rar3/compressor.rb +73 -0
- data/lib/omnizip/formats/rar3/decompressor.rb +66 -0
- data/lib/omnizip/formats/rar3/reader.rb +386 -0
- data/lib/omnizip/formats/rar3/writer.rb +219 -0
- data/lib/omnizip/formats/rar5/compressor.rb +73 -0
- data/lib/omnizip/formats/rar5/decompressor.rb +66 -0
- data/lib/omnizip/formats/rar5/reader.rb +342 -0
- data/lib/omnizip/formats/rar5/writer.rb +214 -0
- data/lib/omnizip/formats/seven_zip/coder_chain.rb +150 -0
- data/lib/omnizip/formats/seven_zip/constants.rb +126 -0
- data/lib/omnizip/formats/seven_zip/encoded_header.rb +114 -0
- data/lib/omnizip/formats/seven_zip/encrypted_header.rb +142 -0
- data/lib/omnizip/formats/seven_zip/file_collector.rb +144 -0
- data/lib/omnizip/formats/seven_zip/header.rb +106 -0
- data/lib/omnizip/formats/seven_zip/header_encryptor.rb +134 -0
- data/lib/omnizip/formats/seven_zip/header_writer.rb +466 -0
- data/lib/omnizip/formats/seven_zip/models/coder_info.rb +30 -0
- data/lib/omnizip/formats/seven_zip/models/file_entry.rb +58 -0
- data/lib/omnizip/formats/seven_zip/models/folder.rb +69 -0
- data/lib/omnizip/formats/seven_zip/models/stream_info.rb +42 -0
- data/lib/omnizip/formats/seven_zip/parser.rb +660 -0
- data/lib/omnizip/formats/seven_zip/reader.rb +458 -0
- data/lib/omnizip/formats/seven_zip/split_archive_reader.rb +632 -0
- data/lib/omnizip/formats/seven_zip/split_archive_writer.rb +315 -0
- data/lib/omnizip/formats/seven_zip/stream_compressor.rb +151 -0
- data/lib/omnizip/formats/seven_zip/stream_decompressor.rb +162 -0
- data/lib/omnizip/formats/seven_zip/writer.rb +740 -0
- data/lib/omnizip/formats/seven_zip.rb +93 -0
- data/lib/omnizip/formats/tar/constants.rb +73 -0
- data/lib/omnizip/formats/tar/entry.rb +94 -0
- data/lib/omnizip/formats/tar/header.rb +168 -0
- data/lib/omnizip/formats/tar/reader.rb +121 -0
- data/lib/omnizip/formats/tar/writer.rb +216 -0
- data/lib/omnizip/formats/tar.rb +84 -0
- data/lib/omnizip/formats/xz/reader.rb +116 -0
- data/lib/omnizip/formats/xz.rb +237 -0
- data/lib/omnizip/formats/xz_impl/block_decoder.rb +754 -0
- data/lib/omnizip/formats/xz_impl/block_encoder.rb +306 -0
- data/lib/omnizip/formats/xz_impl/block_header.rb +210 -0
- data/lib/omnizip/formats/xz_impl/block_header_parser.rb +186 -0
- data/lib/omnizip/formats/xz_impl/constants.rb +49 -0
- data/lib/omnizip/formats/xz_impl/index_decoder.rb +174 -0
- data/lib/omnizip/formats/xz_impl/index_encoder.rb +122 -0
- data/lib/omnizip/formats/xz_impl/stream_decoder.rb +468 -0
- data/lib/omnizip/formats/xz_impl/stream_encoder.rb +99 -0
- data/lib/omnizip/formats/xz_impl/stream_footer.rb +81 -0
- data/lib/omnizip/formats/xz_impl/stream_footer_parser.rb +117 -0
- data/lib/omnizip/formats/xz_impl/stream_header.rb +55 -0
- data/lib/omnizip/formats/xz_impl/stream_header_parser.rb +108 -0
- data/lib/omnizip/formats/xz_impl/vli.rb +128 -0
- data/lib/omnizip/formats/xz_impl/writer.rb +421 -0
- data/lib/omnizip/formats/zip/central_directory_header.rb +195 -0
- data/lib/omnizip/formats/zip/constants.rb +69 -0
- data/lib/omnizip/formats/zip/end_of_central_directory.rb +133 -0
- data/lib/omnizip/formats/zip/local_file_header.rb +138 -0
- data/lib/omnizip/formats/zip/reader.rb +250 -0
- data/lib/omnizip/formats/zip/unix_extra_field.rb +153 -0
- data/lib/omnizip/formats/zip/writer.rb +375 -0
- data/lib/omnizip/formats/zip/zip64_end_of_central_directory.rb +104 -0
- data/lib/omnizip/formats/zip/zip64_end_of_central_directory_locator.rb +66 -0
- data/lib/omnizip/formats/zip/zip64_extra_field.rb +114 -0
- data/lib/omnizip/formats/zip.rb +50 -0
- data/lib/omnizip/implementations/base/lzma2_decoder_base.rb +75 -0
- data/lib/omnizip/implementations/base/lzma2_encoder_base.rb +128 -0
- data/lib/omnizip/implementations/base/lzma_decoder_base.rb +83 -0
- data/lib/omnizip/implementations/base/lzma_encoder_base.rb +108 -0
- data/lib/omnizip/implementations/base/state_machine_base.rb +182 -0
- data/lib/omnizip/implementations/seven_zip/lzma/decoder.rb +421 -0
- data/lib/omnizip/implementations/seven_zip/lzma/encoder.rb +465 -0
- data/lib/omnizip/implementations/seven_zip/lzma/match_finder.rb +288 -0
- data/lib/omnizip/implementations/seven_zip/lzma/range_decoder.rb +200 -0
- data/lib/omnizip/implementations/seven_zip/lzma/range_encoder.rb +197 -0
- data/lib/omnizip/implementations/seven_zip/lzma/state_machine.rb +141 -0
- data/lib/omnizip/implementations/seven_zip/lzma2/encoder.rb +519 -0
- data/lib/omnizip/implementations/xz_utils/lzma2/decoder.rb +723 -0
- data/lib/omnizip/implementations/xz_utils/lzma2/encoder.rb +750 -0
- data/lib/omnizip/io/buffered_input.rb +146 -0
- data/lib/omnizip/io/buffered_output.rb +105 -0
- data/lib/omnizip/io/stream_manager.rb +115 -0
- data/lib/omnizip/link_handler/hard_link.rb +79 -0
- data/lib/omnizip/link_handler/symbolic_link.rb +74 -0
- data/lib/omnizip/link_handler.rb +124 -0
- data/lib/omnizip/metadata/archive_metadata.rb +114 -0
- data/lib/omnizip/metadata/entry_metadata.rb +146 -0
- data/lib/omnizip/metadata/metadata_editor.rb +171 -0
- data/lib/omnizip/metadata/metadata_registry.rb +64 -0
- data/lib/omnizip/metadata/metadata_validator.rb +99 -0
- data/lib/omnizip/metadata.rb +57 -0
- data/lib/omnizip/models/.keep +0 -0
- data/lib/omnizip/models/algorithm_metadata.rb +73 -0
- data/lib/omnizip/models/compression_options.rb +71 -0
- data/lib/omnizip/models/conversion_options.rb +87 -0
- data/lib/omnizip/models/conversion_result.rb +135 -0
- data/lib/omnizip/models/eta_result.rb +46 -0
- data/lib/omnizip/models/extraction_rule.rb +115 -0
- data/lib/omnizip/models/filter_chain.rb +144 -0
- data/lib/omnizip/models/filter_config.rb +183 -0
- data/lib/omnizip/models/match_result.rb +124 -0
- data/lib/omnizip/models/optimization_suggestion.rb +91 -0
- data/lib/omnizip/models/parallel_options.rb +104 -0
- data/lib/omnizip/models/performance_result.rb +79 -0
- data/lib/omnizip/models/profile_report.rb +82 -0
- data/lib/omnizip/models/progress_options.rb +38 -0
- data/lib/omnizip/models/split_options.rb +116 -0
- data/lib/omnizip/optimization_registry.rb +81 -0
- data/lib/omnizip/parallel/job_queue.rb +209 -0
- data/lib/omnizip/parallel/job_scheduler.rb +203 -0
- data/lib/omnizip/parallel/parallel_compressor.rb +347 -0
- data/lib/omnizip/parallel/parallel_extractor.rb +329 -0
- data/lib/omnizip/parallel/worker_pool.rb +223 -0
- data/lib/omnizip/parallel.rb +149 -0
- data/lib/omnizip/parity/chunked_block_processor.rb +196 -0
- data/lib/omnizip/parity/galois16.rb +145 -0
- data/lib/omnizip/parity/models/creator_packet.rb +73 -0
- data/lib/omnizip/parity/models/file_description_packet.rb +133 -0
- data/lib/omnizip/parity/models/ifsc_packet.rb +123 -0
- data/lib/omnizip/parity/models/main_packet.rb +128 -0
- data/lib/omnizip/parity/models/packet.rb +156 -0
- data/lib/omnizip/parity/models/packet_registry.rb +109 -0
- data/lib/omnizip/parity/models/recovery_slice_packet.rb +78 -0
- data/lib/omnizip/parity/par2_creator.rb +531 -0
- data/lib/omnizip/parity/par2_repairer.rb +407 -0
- data/lib/omnizip/parity/par2_verifier.rb +364 -0
- data/lib/omnizip/parity/par2cmdline_algorithm.rb +110 -0
- data/lib/omnizip/parity/par2cmdline_coefficients.rb +78 -0
- data/lib/omnizip/parity/reed_solomon_decoder.rb +266 -0
- data/lib/omnizip/parity/reed_solomon_encoder.rb +111 -0
- data/lib/omnizip/parity/reed_solomon_matrix.rb +342 -0
- data/lib/omnizip/parity.rb +186 -0
- data/lib/omnizip/password/encryption_registry.rb +65 -0
- data/lib/omnizip/password/encryption_strategy.rb +96 -0
- data/lib/omnizip/password/password_validator.rb +129 -0
- data/lib/omnizip/password/winzip_aes_strategy.rb +192 -0
- data/lib/omnizip/password/zip_crypto_strategy.rb +141 -0
- data/lib/omnizip/password.rb +87 -0
- data/lib/omnizip/pipe/stream_compressor.rb +124 -0
- data/lib/omnizip/pipe/stream_decompressor.rb +174 -0
- data/lib/omnizip/pipe.rb +121 -0
- data/lib/omnizip/platform/ntfs_streams.rb +201 -0
- data/lib/omnizip/platform.rb +189 -0
- data/lib/omnizip/profile/archive_profile.rb +39 -0
- data/lib/omnizip/profile/balanced_profile.rb +33 -0
- data/lib/omnizip/profile/binary_profile.rb +36 -0
- data/lib/omnizip/profile/compression_profile.rb +158 -0
- data/lib/omnizip/profile/custom_profile.rb +157 -0
- data/lib/omnizip/profile/fast_profile.rb +33 -0
- data/lib/omnizip/profile/maximum_profile.rb +33 -0
- data/lib/omnizip/profile/profile_detector.rb +110 -0
- data/lib/omnizip/profile/profile_registry.rb +161 -0
- data/lib/omnizip/profile/text_profile.rb +36 -0
- data/lib/omnizip/profile.rb +190 -0
- data/lib/omnizip/profiler/memory_profiler.rb +66 -0
- data/lib/omnizip/profiler/method_profiler.rb +49 -0
- data/lib/omnizip/profiler/report_generator.rb +169 -0
- data/lib/omnizip/profiler.rb +204 -0
- data/lib/omnizip/progress/callback_reporter.rb +36 -0
- data/lib/omnizip/progress/console_reporter.rb +62 -0
- data/lib/omnizip/progress/log_reporter.rb +91 -0
- data/lib/omnizip/progress/operation_progress.rb +118 -0
- data/lib/omnizip/progress/progress_bar.rb +156 -0
- data/lib/omnizip/progress/progress_reporter.rb +40 -0
- data/lib/omnizip/progress/progress_tracker.rb +190 -0
- data/lib/omnizip/progress/silent_reporter.rb +24 -0
- data/lib/omnizip/progress.rb +127 -0
- data/lib/omnizip/rubyzip_compat.rb +63 -0
- data/lib/omnizip/temp/safe_extract.rb +168 -0
- data/lib/omnizip/temp/temp_file.rb +124 -0
- data/lib/omnizip/temp/temp_file_pool.rb +109 -0
- data/lib/omnizip/temp.rb +181 -0
- data/lib/omnizip/version.rb +5 -0
- data/lib/omnizip/zip/entry.rb +156 -0
- data/lib/omnizip/zip/file.rb +485 -0
- data/lib/omnizip/zip/input_stream.rb +273 -0
- data/lib/omnizip/zip/output_stream.rb +324 -0
- data/lib/omnizip.rb +156 -0
- data/readme-docs/advanced-features.adoc +515 -0
- data/readme-docs/api-usage.adoc +444 -0
- data/readme-docs/architecture.adoc +449 -0
- data/readme-docs/archive-formats.adoc +479 -0
- data/readme-docs/cli-usage.adoc +222 -0
- data/readme-docs/compression-algorithms.adoc +442 -0
- data/readme-docs/compression-profiles.adoc +247 -0
- data/readme-docs/encryption-checksums.adoc +328 -0
- data/readme-docs/format-converter.adoc +325 -0
- data/readme-docs/installation.adoc +228 -0
- data/readme-docs/par2-archives.adoc +608 -0
- data/readme-docs/performance-profiler.adoc +389 -0
- data/readme-docs/preprocessing-filters.adoc +280 -0
- data/xz-file-format-1.2.1.txt +1174 -0
- metadata +617 -0
@@ -0,0 +1,327 @@
---
title: LZMA2
nav_order: 2
parent: Compression Algorithms
grand_parent: Guides
---

== Purpose

LZMA2 is an enhanced version of LZMA designed for better multi-threading support and improved handling of incompressible data. It's the default compression algorithm for modern 7-Zip archives.

== Key Characteristics

[cols="1,3"]
|===
|Property |Value

|Compression Ratio
|Excellent (same as LZMA)

|Compression Speed
|Slow to medium (with parallel speedup)

|Decompression Speed
|Fast

|Memory Usage
|Medium to high

|Multi-Threading
|Excellent support

|Best For
|Multi-core systems, modern archives, mixed content
|===

== LZMA2 vs LZMA

[cols="2,1,1,3"]
|===
|Feature |LZMA |LZMA2 |Advantage

|Compression Ratio
|Excellent
|Excellent
|Same quality

|Multi-Threading
|Limited
|Excellent
|LZMA2 is parallelizable

|Incompressible Data
|Poor
|Good
|LZMA2 can store incompressible chunks uncompressed

|Chunk Independence
|No
|Yes
|LZMA2 can reset the dictionary

|Default in 7-Zip
|No
|Yes
|LZMA2 is the modern default

|Streaming
|Limited
|Better
|LZMA2 is designed for streaming
|===

== When to Use LZMA2

**Choose LZMA2 when**:

* You have a multi-core CPU and want parallel compression
* You are working with mixed content (text + binary + compressed)
* You are creating modern 7z archives (it's the default)
* You need better streaming support
* You want future-proof archives

**Use LZMA instead when**:

* You are running in a single-threaded environment
* You must maintain compatibility with old 7-Zip versions (<9.20)
* A marginally smaller output is critical (LZMA2 adds a small per-chunk header overhead)

== Basic Usage

=== Create LZMA2 Archive

[source,ruby]
----
# Create with LZMA2 (modern default)
Omnizip::Archive.create('backup.7z', format: :seven_zip) do |archive|
  archive.compression = :lzma2  # Default for 7z
  archive.level = 9
  archive.add_directory('project/')
end
----

=== With Parallel Processing

Leverage multi-core CPUs:

[source,ruby]
----
# Use parallel processing with LZMA2
Omnizip::Archive.create('backup.7z', format: :seven_zip) do |archive|
  archive.compression = :lzma2
  archive.level = 7
  archive.parallel = true
  archive.threads = 4  # Use 4 cores
  archive.add_directory('large_project/')
end

# Can achieve 2-3x speedup on multi-core systems
----
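
Parallel compression of this kind rests on splitting the input into chunks that can be compressed independently of one another. The following is a minimal sketch of that idea, not Omnizip's implementation: it uses Ruby's standard `Zlib` as a stand-in codec (it is not LZMA2), and the helper names `compress_chunks_in_parallel` and `decompress_chunks` are hypothetical.

```ruby
require "zlib"

# Split +data+ into fixed-size chunks, compress each chunk in its own thread,
# and return the compressed chunks in their original order.
# Zlib is only a stand-in codec here; it is not LZMA2.
def compress_chunks_in_parallel(data, chunk_size:)
  offsets = (0...data.bytesize).step(chunk_size)
  threads = offsets.map do |offset|
    chunk = data.byteslice(offset, chunk_size)
    Thread.new { Zlib::Deflate.deflate(chunk) }
  end
  threads.map(&:value) # Thread#value joins the thread and returns its result
end

# Every chunk was compressed on its own, so every chunk inflates on its own.
def decompress_chunks(compressed_chunks)
  compressed_chunks.map { |chunk| Zlib::Inflate.inflate(chunk) }.join
end

data = ("omnizip " * 50_000).b # 400,000 bytes of sample input
compressed = compress_chunks_in_parallel(data, chunk_size: 64 * 1024)
restored = decompress_chunks(compressed)

puts "chunks: #{compressed.size}"
puts "round trip ok: #{restored == data}"
```

The sketch starts one thread per chunk for brevity; a real implementation would more likely cap the number of workers. How much wall-clock time threads actually save depends on the Ruby runtime and the codec.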

== Compression Levels

LZMA2 uses the same levels as LZMA (1-9) but parallelizes better:

[cols="1,2,2,2"]
|===
|Level |Dictionary |Single-Thread Time |Multi-Thread Time (4 cores)

|5
|16 MB
|30s
|12s (2.5x faster)

|7
|32 MB
|60s
|22s (2.7x faster)

|9
|64 MB
|180s
|65s (2.8x faster)
|===
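
The multipliers in the table follow directly from dividing the single-thread time by the multi-thread time. As a small worked check, the sketch below recomputes them and adds the parallel efficiency (speedup divided by core count), which the table does not list. The `speedup` and `efficiency` helpers are illustrative names, not part of Omnizip.

```ruby
# [level, single-thread seconds, four-thread seconds] taken from the table.
LEVEL_TIMINGS = [[5, 30, 12], [7, 60, 22], [9, 180, 65]].freeze
CORES = 4

# How many times faster the multi-thread run is.
def speedup(single_seconds, multi_seconds)
  single_seconds.to_f / multi_seconds
end

# Fraction of ideal scaling achieved; 1.0 would mean perfect scaling.
def efficiency(single_seconds, multi_seconds, cores)
  speedup(single_seconds, multi_seconds) / cores
end

LEVEL_TIMINGS.each do |level, single, multi|
  puts format("level %d: %.1fx speedup, efficiency %.2f",
              level, speedup(single, multi), efficiency(single, multi, CORES))
end
```

An efficiency well below 1.0 on four cores suggests that part of the work (for example match finding setup or I/O) does not parallelize.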
|
|
145
|
+
|
|
146
|
+
== Advanced Configuration
|
|
147
|
+
|
|
148
|
+
=== Chunk Size
|
|
149
|
+
|
|
150
|
+
Control parallelization granularity:
|
|
151
|
+
|
|
152
|
+
[source,ruby]
|
|
153
|
+
----
|
|
154
|
+
# Smaller chunks = better parallelization, more overhead
|
|
155
|
+
Omnizip::Archive.create('backup.7z', format: :seven_zip) do |archive|
|
|
156
|
+
archive.compression = :lzma2
|
|
157
|
+
archive.chunk_size = 8 * 1024 * 1024 # Process 8 MB chunks (Integer#megabytes would require ActiveSupport)
|
|
158
|
+
archive.parallel = true
|
|
159
|
+
archive.add_directory('data/')
|
|
160
|
+
end
|
|
161
|
+
|
|
162
|
+
# Larger chunks = less overhead, less parallel benefit
|
|
163
|
+
# Default: 64 MB (good balance)
|
|
164
|
+
----
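The trade-off described in the comments above can be observed with a small stand-alone sketch. It uses Ruby's standard `Zlib` (Deflate) purely as a stand-in for LZMA2, and the method names are illustrative, so it shows the chunking idea rather than Omnizip's implementation:

```ruby
require 'zlib'

# Split data into fixed-size chunks and compress each one in its own
# thread. Zlib (Deflate) is only a stand-in for LZMA2 here.
def compress_in_chunks(data, chunk_size)
  offsets = (0...data.bytesize).step(chunk_size)
  chunks = offsets.map { |off| data.byteslice(off, chunk_size) }
  chunks.map { |chunk| Thread.new { Zlib::Deflate.deflate(chunk) } }
        .map(&:value)
end

# Chunks are independent, so each is decompressed on its own and the
# results are joined in order.
def decompress_chunks(compressed_chunks)
  compressed_chunks.map { |c| Zlib::Inflate.inflate(c) }.join
end

data = ('omnizip chunking example ' * 50_000).b
small = compress_in_chunks(data, 64 * 1024)
large = compress_in_chunks(data, 1024 * 1024)

puts "64 KiB chunks: #{small.size} chunks, #{small.sum(&:bytesize)} bytes"
puts "1 MiB chunks:  #{large.size} chunks, #{large.sum(&:bytesize)} bytes"
puts decompress_chunks(small) == data
```

Smaller chunks give more units of work to distribute, but every chunk starts without the history of the previous one, which is where the extra overhead comes from. A real implementation would use a bounded worker pool rather than one thread per chunk.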
|
|
165
|
+
|
|
166
|
+
=== Dictionary Reset
|
|
167
|
+
|
|
168
|
+
LZMA2 can reset the dictionary between chunks, which helps it handle incompressible data:
|
|
169
|
+
|
|
170
|
+
[source,ruby]
|
|
171
|
+
----
|
|
172
|
+
# Automatic dictionary reset for mixed content
|
|
173
|
+
Omnizip::Archive.create('mixed.7z', format: :seven_zip) do |archive|
|
|
174
|
+
archive.compression = :lzma2
|
|
175
|
+
archive.adaptive_dictionary = true # Reset on incompressible blocks
|
|
176
|
+
archive.add_directory('mixed_data/') # Text + images + binaries
|
|
177
|
+
end
|
|
178
|
+
----
|
|
179
|
+
|
|
180
|
+
== Performance Characteristics
|
|
181
|
+
|
|
182
|
+
=== Parallel Speedup
|
|
183
|
+
|
|
184
|
+
[cols="2,1,1,1,1"]
|
|
185
|
+
|===
|
|
186
|
+
|File Mix |1 Thread |2 Threads |4 Threads |8 Threads
|
|
187
|
+
|
|
188
|
+
|All text
|
|
189
|
+
|60s
|
|
190
|
+
|35s
|
|
191
|
+
|20s
|
|
192
|
+
|15s
|
|
193
|
+
|
|
194
|
+
|Mixed content
|
|
195
|
+
|45s
|
|
196
|
+
|28s
|
|
197
|
+
|18s
|
|
198
|
+
|14s
|
|
199
|
+
|
|
200
|
+
|Many small files
|
|
201
|
+
|30s
|
|
202
|
+
|18s
|
|
203
|
+
|12s
|
|
204
|
+
|10s
|
|
205
|
+
|
|
206
|
+
|Few large files
|
|
207
|
+
|90s
|
|
208
|
+
|52s
|
|
209
|
+
|30s
|
|
210
|
+
|22s
|
|
211
|
+
|===
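The diminishing returns from 4 to 8 threads are consistent with Amdahl's law. As a rough sketch, the serial fraction can be estimated from the 1-thread and 4-thread columns and used to predict the 8-thread time. The figures are copied from the table; treating Amdahl's law as the model is an assumption made for illustration:

```ruby
# Amdahl's law: T(p) = T1 * (f + (1 - f) / p), where f is the serial
# fraction. Solving for f from one measurement at p threads:
def serial_fraction(t1, tp, threads)
  (tp.to_f / t1 - 1.0 / threads) / (1.0 - 1.0 / threads)
end

def predicted_time(t1, fraction, threads)
  t1 * (fraction + (1.0 - fraction) / threads)
end

# [1-thread s, 4-thread s, 8-thread s] copied from the table
ROWS = {
  'All text' => [60, 20, 15],
  'Mixed content' => [45, 18, 14],
  'Many small files' => [30, 12, 10],
  'Few large files' => [90, 30, 22]
}.freeze

ROWS.each do |name, (t1, t4, t8)|
  f = serial_fraction(t1, t4, 4)
  puts format('%-17s serial fraction %.2f, predicted 8-thread %.1fs (table: %ds)',
              name, f, predicted_time(t1, f, 8), t8)
end
```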
|
|
212
|
+
|
|
213
|
+
=== Incompressible Data Handling
|
|
214
|
+
|
|
215
|
+
LZMA2 handles incompressible data more efficiently:
|
|
216
|
+
|
|
217
|
+
[source]
|
|
218
|
+
----
|
|
219
|
+
File Type LZMA Time LZMA2 Time Improvement
|
|
220
|
+
─────────────────────────────────────────────────────
|
|
221
|
+
JPEG images 45s 32s ~30% faster
|
|
222
|
+
Mixed archive 60s 40s ~33% faster
|
|
223
|
+
Video files 38s 25s ~34% faster
|
|
224
|
+
Pure text 30s 30s Same
|
|
225
|
+
----
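The gain on already-compressed input comes from being able to store a chunk verbatim when compressing it does not help. A minimal sketch of that decision, using the standard `Zlib` as a stand-in for LZMA2 and hypothetical method names:

```ruby
require 'zlib'
require 'securerandom'

# Returns [:compressed, bytes] when compression shrinks the chunk,
# otherwise [:stored, original_bytes]. Zlib stands in for LZMA2.
def encode_chunk(chunk)
  packed = Zlib::Deflate.deflate(chunk)
  packed.bytesize < chunk.bytesize ? [:compressed, packed] : [:stored, chunk]
end

def decode_chunk(kind, bytes)
  kind == :stored ? bytes : Zlib::Inflate.inflate(bytes)
end

text = ("log line: request handled OK\n" * 1_000).b
noise = SecureRandom.random_bytes(32 * 1024) # behaves like JPEG or video data

[text, noise].each do |chunk|
  kind, bytes = encode_chunk(chunk)
  puts "#{chunk.bytesize} bytes in, #{bytes.bytesize} bytes out, #{kind}"
  raise 'round trip failed' unless decode_chunk(kind, bytes) == chunk
end
```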
|
|
226
|
+
|
|
227
|
+
== Use Cases
|
|
228
|
+
|
|
229
|
+
=== Multi-Core Server Backups
|
|
230
|
+
|
|
231
|
+
Maximize server CPU utilization:
|
|
232
|
+
|
|
233
|
+
[source,ruby]
|
|
234
|
+
----
|
|
235
|
+
# Server with 16 cores
|
|
236
|
+
Omnizip.compress_directory(
|
|
237
|
+
'/var/www/',
|
|
238
|
+
'www-backup.7z',
|
|
239
|
+
compression: :lzma2,
|
|
240
|
+
level: 9,
|
|
241
|
+
parallel: true,
|
|
242
|
+
threads: 16 # Use all cores
|
|
243
|
+
) do |progress|
|
|
244
|
+
puts "#{progress.percentage}% - #{progress.active_threads} active threads"
|
|
245
|
+
end
|
|
246
|
+
----
|
|
247
|
+
|
|
248
|
+
=== Large Dataset Compression
|
|
249
|
+
|
|
250
|
+
Process large datasets efficiently:
|
|
251
|
+
|
|
252
|
+
[source,ruby]
|
|
253
|
+
----
|
|
254
|
+
# Compress large research dataset
|
|
255
|
+
Omnizip::Archive.create('dataset.7z', format: :seven_zip) do |archive|
|
|
256
|
+
archive.compression = :lzma2
|
|
257
|
+
archive.level = 9
|
|
258
|
+
archive.solid = true # Better compression
|
|
259
|
+
archive.parallel = true
|
|
260
|
+
|
|
261
|
+
# Add gigabytes of data
|
|
262
|
+
archive.add_directory('research_data/')
|
|
263
|
+
end
|
|
264
|
+
----
|
|
265
|
+
|
|
266
|
+
=== Software Build Artifacts
|
|
267
|
+
|
|
268
|
+
Package build outputs with parallel compression:
|
|
269
|
+
|
|
270
|
+
[source,ruby]
|
|
271
|
+
----
|
|
272
|
+
# Package release after build (assumes `version` is set earlier in the build script)
|
|
273
|
+
Omnizip.compress_directory(
|
|
274
|
+
'build/release/',
|
|
275
|
+
"app-#{version}.7z",
|
|
276
|
+
compression: :lzma2,
|
|
277
|
+
level: 7,
|
|
278
|
+
filter: :bcj_x86, # For executables
|
|
279
|
+
parallel: true
|
|
280
|
+
)
|
|
281
|
+
----
|
|
282
|
+
|
|
283
|
+
== Command-Line Usage
|
|
284
|
+
|
|
285
|
+
[source,bash]
|
|
286
|
+
----
|
|
287
|
+
# Create LZMA2 archive
|
|
288
|
+
$ omnizip archive create backup.7z files/ \
|
|
289
|
+
--compression lzma2 \
|
|
290
|
+
--level 9
|
|
291
|
+
|
|
292
|
+
# With parallel processing
|
|
293
|
+
$ omnizip archive create backup.7z files/ \
|
|
294
|
+
--compression lzma2 \
|
|
295
|
+
--level 9 \
|
|
296
|
+
--parallel \
|
|
297
|
+
--threads 8
|
|
298
|
+
|
|
299
|
+
# Show compression statistics
|
|
300
|
+
$ omnizip archive create backup.7z files/ \
|
|
301
|
+
--compression lzma2 \
|
|
302
|
+
--level 9 \
|
|
303
|
+
--stats
|
|
304
|
+
Algorithm: LZMA2, Level: 9
|
|
305
|
+
Dictionary: 64 MB
|
|
306
|
+
Threads: 8
|
|
307
|
+
Original: 1.2 GB
|
|
308
|
+
Compressed: 456 MB (62% reduction)
|
|
309
|
+
Time: 3m 45s (2.7x speedup from parallelization)
|
|
310
|
+
----
|
|
311
|
+
|
|
312
|
+
== Technical Details
|
|
313
|
+
|
|
314
|
+
**LZMA2 Improvements over LZMA**:
|
|
315
|
+
|
|
316
|
+
. **Chunk-Based**: Data divided into independent chunks for parallel processing
|
|
317
|
+
. **Dictionary Reset**: Can reset dictionary between chunks
|
|
318
|
+
. **Uncompressed Chunks**: Stores incompressible data as-is, adding only a small chunk header
|
|
319
|
+
. **Better Streaming**: Designed for better streaming support
|
|
320
|
+
. **Error Recovery**: Better error isolation (chunk-level vs. stream-level)
|
|
321
|
+
|
|
322
|
+
== See Also
|
|
323
|
+
|
|
324
|
+
* link:lzma.html[LZMA] - Original algorithm
|
|
325
|
+
* link:zstandard.html[Zstandard] - Fast alternative
|
|
326
|
+
* link:../advanced-features/parallel-processing.html[Parallel Processing] - Maximize LZMA2 performance
|
|
327
|
+
* link:../../compatibility.html[Compatibility] - LZMA2 support across tools
|
|
@@ -0,0 +1,316 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: PPMd
|
|
3
|
+
nav_order: 6
|
|
4
|
+
parent: Compression Algorithms
|
|
5
|
+
grand_parent: Guides
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
== Purpose
|
|
9
|
+
|
|
10
|
+
PPMd7 (Prediction by Partial Matching, PPMd variant H as implemented in 7-Zip) is a statistical compression algorithm that excels at compressing text. It achieves excellent compression ratios on source code, documents, and repetitive text.
|
|
11
|
+
|
|
12
|
+
== Key Characteristics
|
|
13
|
+
|
|
14
|
+
[cols="1,3"]
|
|
15
|
+
|===
|
|
16
|
+
|Property |Value
|
|
17
|
+
|
|
18
|
+
|Compression Ratio
|
|
19
|
+
|Excellent (best for text)
|
|
20
|
+
|
|
21
|
+
|Compression Speed
|
|
22
|
+
|Slow
|
|
23
|
+
|
|
24
|
+
|Decompression Speed
|
|
25
|
+
|Slow
|
|
26
|
+
|
|
27
|
+
|Memory Usage
|
|
28
|
+
|High (configurable)
|
|
29
|
+
|
|
30
|
+
|Best For
|
|
31
|
+
|Text files, source code, XML/JSON, logs
|
|
32
|
+
|===
|
|
33
|
+
|
|
34
|
+
== When to Use PPMd7
|
|
35
|
+
|
|
36
|
+
**Choose PPMd7 when**:
|
|
37
|
+
|
|
38
|
+
* Compressing primarily text data
|
|
39
|
+
* Source code archives
|
|
40
|
+
* XML, JSON, or structured text files
|
|
41
|
+
* Log file archives
|
|
42
|
+
* Maximum text compression is needed
|
|
43
|
+
|
|
44
|
+
**Avoid PPMd7 when**:
|
|
45
|
+
|
|
46
|
+
* Working with binary data (use LZMA2)
|
|
47
|
+
* Working with already-compressed files
|
|
48
|
+
* Speed is more important than ratio
|
|
49
|
+
* Memory is very limited
|
|
50
|
+
|
|
51
|
+
== Basic Usage
|
|
52
|
+
|
|
53
|
+
=== Compress Text Files
|
|
54
|
+
|
|
55
|
+
[source,ruby]
|
|
56
|
+
----
|
|
57
|
+
# Compress source code with PPMd7
|
|
58
|
+
Omnizip::Archive.create('code.7z', format: :seven_zip) do |archive|
|
|
59
|
+
archive.compression = :ppmd7
|
|
60
|
+
archive.level = 6 # Default
|
|
61
|
+
archive.add_directory('src/')
|
|
62
|
+
end
|
|
63
|
+
----
|
|
64
|
+
|
|
65
|
+
=== Configure Memory
|
|
66
|
+
|
|
67
|
+
PPMd7 uses configurable memory (2^mem_size bytes):
|
|
68
|
+
|
|
69
|
+
[source,ruby]
|
|
70
|
+
----
|
|
71
|
+
# Configure PPMd7 memory and order
|
|
72
|
+
Omnizip::Archive.create('text.7z', format: :seven_zip) do |archive|
|
|
73
|
+
archive.compression = :ppmd7
|
|
74
|
+
archive.mem_size = 24 # 2^24 = 16 MB
|
|
75
|
+
archive.order = 6 # Context order (default)
|
|
76
|
+
archive.add_directory('documents/')
|
|
77
|
+
end
|
|
78
|
+
----
|
|
79
|
+
|
|
80
|
+
== Configuration Parameters
|
|
81
|
+
|
|
82
|
+
=== Memory Size
|
|
83
|
+
|
|
84
|
+
Controls context tree size (2^mem_size bytes):
|
|
85
|
+
|
|
86
|
+
[cols="1,1,2,2"]
|
|
87
|
+
|===
|
|
88
|
+
|mem_size |Memory |Compression |Best For
|
|
89
|
+
|
|
90
|
+
|20
|
|
91
|
+
|1 MB
|
|
92
|
+
|Good
|
|
93
|
+
|Small files, limited memory
|
|
94
|
+
|
|
95
|
+
|22
|
|
96
|
+
|4 MB
|
|
97
|
+
|Very Good
|
|
98
|
+
|Most text files
|
|
99
|
+
|
|
100
|
+
|24
|
|
101
|
+
|16 MB
|
|
102
|
+
|Excellent
|
|
103
|
+
|**Default** - large texts
|
|
104
|
+
|
|
105
|
+
|26
|
|
106
|
+
|64 MB
|
|
107
|
+
|Best
|
|
108
|
+
|Very large text corpora
|
|
109
|
+
|
|
110
|
+
|28
|
|
111
|
+
|256 MB
|
|
112
|
+
|Maximum
|
|
113
|
+
|Huge text archives
|
|
114
|
+
|===
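Since `mem_size` is an exponent, the memory column follows from a bit shift. The sketch below reproduces the table values and also shows one possible way to pick the smallest exponent that covers a byte budget. Both helper names are hypothetical and not part of Omnizip:

```ruby
# mem_size is an exponent: the model gets 2**mem_size bytes.
def ppmd_memory_bytes(mem_size)
  1 << mem_size
end

# Smallest exponent whose memory is at least +bytes+ (for bytes >= 1).
def mem_size_for(bytes)
  (bytes - 1).bit_length
end

[20, 22, 24, 26, 28].each do |mem_size|
  mib = ppmd_memory_bytes(mem_size) / (1024 * 1024)
  puts "mem_size #{mem_size} => #{mib} MB"
end

puts mem_size_for(16 * 1024 * 1024) # 24
puts mem_size_for(10 * 1024 * 1024) # 24 (rounded up to 16 MB)
```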
|
|
115
|
+
|
|
116
|
+
=== Context Order
|
|
117
|
+
|
|
118
|
+
Number of previous characters used for prediction:
|
|
119
|
+
|
|
120
|
+
[cols="1,2,2"]
|
|
121
|
+
|===
|
|
122
|
+
|Order |Compression |Speed
|
|
123
|
+
|
|
124
|
+
|4
|
|
125
|
+
|Good
|
|
126
|
+
|Faster
|
|
127
|
+
|
|
128
|
+
|6
|
|
129
|
+
|Excellent
|
|
130
|
+
|**Default**
|
|
131
|
+
|
|
132
|
+
|8
|
|
133
|
+
|Best
|
|
134
|
+
|Slower
|
|
135
|
+
|
|
136
|
+
|16
|
|
137
|
+
|Maximum
|
|
138
|
+
|Very slow
|
|
139
|
+
|===
|
|
140
|
+
|
|
141
|
+
.Configuration Example
|
|
142
|
+
[source,ruby]
|
|
143
|
+
----
|
|
144
|
+
# For large source code archive
|
|
145
|
+
Omnizip::Archive.create('linux-kernel.7z', format: :seven_zip) do |archive|
|
|
146
|
+
archive.compression = :ppmd7
|
|
147
|
+
archive.mem_size = 26 # 64 MB memory
|
|
148
|
+
archive.order = 8 # Deep context
|
|
149
|
+
archive.add_directory('linux-6.0/')
|
|
150
|
+
end
|
|
151
|
+
----
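To make "context order" concrete, here is a toy order-k model in plain Ruby: it counts which character follows each k-character context and predicts the most frequent one. It is a teaching sketch only (no arithmetic coding, no escape handling) and does not reflect Omnizip's PPMd7 internals:

```ruby
# Build a table: context (previous +order+ characters) => counts of the
# character that followed it.
def build_context_counts(text, order)
  counts = Hash.new { |h, ctx| h[ctx] = Hash.new(0) }
  (order...text.length).each do |i|
    counts[text[i - order, order]][text[i]] += 1
  end
  counts
end

# Most frequent next character after +context+, or nil if unseen.
def predict(counts, context)
  return nil unless counts.key?(context)

  counts[context].max_by { |_char, n| n }.first
end

text = 'the cat sat on the mat. the cat ate the rat.'
order2 = build_context_counts(text, 2)
order4 = build_context_counts(text, 4)

puts predict(order2, 'th').inspect   # "e"
puts predict(order2, 'at').inspect   # " "
puts predict(order4, 'the ').inspect # "c"
puts predict(order2, 'zz').inspect   # nil
```

A higher order makes predictions more specific but multiplies the number of distinct contexts to store, which is why order and memory size are usually raised together.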
|
|
152
|
+
|
|
153
|
+
== Performance Characteristics
|
|
154
|
+
|
|
155
|
+
=== Compression Ratios (Text Files)
|
|
156
|
+
|
|
157
|
+
[cols="2,1,1,1"]
|
|
158
|
+
|===
|
|
159
|
+
|Content Type |Deflate |LZMA2 |PPMd7
|
|
160
|
+
|
|
161
|
+
|Source Code (.rb, .js)
|
|
162
|
+
|65%
|
|
163
|
+
|72%
|
|
164
|
+
|78%
|
|
165
|
+
|
|
166
|
+
|XML/JSON
|
|
167
|
+
|68%
|
|
168
|
+
|75%
|
|
169
|
+
|82%
|
|
170
|
+
|
|
171
|
+
|Plain Text
|
|
172
|
+
|62%
|
|
173
|
+
|70%
|
|
174
|
+
|80%
|
|
175
|
+
|
|
176
|
+
|Logs (repetitive)
|
|
177
|
+
|70%
|
|
178
|
+
|76%
|
|
179
|
+
|85%
|
|
180
|
+
|===
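The percentages in this table read most naturally as space savings (the share of the original size that is removed); that interpretation is an assumption, as the table does not define the figure. A sketch of how such a number can be measured, using the standard `Zlib` (Deflate) because it ships with Ruby:

```ruby
require 'zlib'

# Space savings: the share of the original size removed by compression.
def space_savings_percent(original, compressed)
  return 0.0 if original.empty?

  (1.0 - compressed.bytesize.to_f / original.bytesize) * 100
end

text = "2024-01-01 INFO request handled in 12ms\n" * 5_000
packed = Zlib::Deflate.deflate(text, Zlib::BEST_COMPRESSION)

puts format('%d -> %d bytes, %.1f%% smaller',
            text.bytesize, packed.bytesize,
            space_savings_percent(text, packed))
```

Synthetic input like this compresses far better than real logs, so the output is not comparable with the table; only the formula carries over.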
|
|
181
|
+
|
|
182
|
+
=== Speed Comparison
|
|
183
|
+
|
|
184
|
+
For a 10 MB text file:
|
|
185
|
+
|
|
186
|
+
[cols="2,1,1,1"]
|
|
187
|
+
|===
|
|
188
|
+
|Algorithm |Compress Time |Decompress Time |Ratio
|
|
189
|
+
|
|
190
|
+
|**PPMd7**
|
|
191
|
+
|25s
|
|
192
|
+
|18s
|
|
193
|
+
|80%
|
|
194
|
+
|
|
195
|
+
|LZMA2
|
|
196
|
+
|15s
|
|
197
|
+
|2s
|
|
198
|
+
|75%
|
|
199
|
+
|
|
200
|
+
|Deflate
|
|
201
|
+
|3s
|
|
202
|
+
|1s
|
|
203
|
+
|65%
|
|
204
|
+
|
|
205
|
+
|Zstandard
|
|
206
|
+
|1s
|
|
207
|
+
|0.5s
|
|
208
|
+
|68%
|
|
209
|
+
|===
|
|
210
|
+
|
|
211
|
+
== Common Use Cases
|
|
212
|
+
|
|
213
|
+
=== Source Code Archives
|
|
214
|
+
|
|
215
|
+
[source,ruby]
|
|
216
|
+
----
|
|
217
|
+
# Archive project source code
|
|
218
|
+
Omnizip::Archive.create("project-v1.0.7z", format: :seven_zip) do |archive|
|
|
219
|
+
archive.compression = :ppmd7
|
|
220
|
+
archive.mem_size = 24 # 16 MB
|
|
221
|
+
archive.order = 6
|
|
222
|
+
archive.add_directory('src/')
|
|
223
|
+
archive.add_directory('tests/')
|
|
224
|
+
archive.add_file('README.md')
|
|
225
|
+
end
|
|
226
|
+
----
|
|
227
|
+
|
|
228
|
+
=== Log File Compression
|
|
229
|
+
|
|
230
|
+
[source,ruby]
|
|
231
|
+
----
|
|
232
|
+
require 'date' # needed for Date.today
# Compress application logs with maximum ratio
|
|
233
|
+
log_files = Dir.glob('/var/log/app/*.log')
|
|
234
|
+
|
|
235
|
+
Omnizip.compress_files(
|
|
236
|
+
log_files,
|
|
237
|
+
"logs_#{Date.today}.7z",
|
|
238
|
+
compression: :ppmd7,
|
|
239
|
+
mem_size: 26, # 64 MB for large logs
|
|
240
|
+
order: 8
|
|
241
|
+
)
|
|
242
|
+
----
|
|
243
|
+
|
|
244
|
+
=== Documentation Archives
|
|
245
|
+
|
|
246
|
+
[source,ruby]
|
|
247
|
+
----
|
|
248
|
+
# Compress documentation and markup files
|
|
249
|
+
Omnizip::Archive.create('docs.7z', format: :seven_zip) do |archive|
|
|
250
|
+
archive.compression = :ppmd7
|
|
251
|
+
archive.add_directory('docs/')
|
|
252
|
+
archive.add_directory('wiki/')
|
|
253
|
+
archive.add_directory('examples/')
|
|
254
|
+
end
|
|
255
|
+
----
|
|
256
|
+
|
|
257
|
+
== PPMd7 vs PPMd8
|
|
258
|
+
|
|
259
|
+
Both are variants of PPMd with slight differences:
|
|
260
|
+
|
|
261
|
+
[cols="2,1,1,2"]
|
|
262
|
+
|===
|
|
263
|
+
|Characteristic |PPMd7 |PPMd8 |Recommendation
|
|
264
|
+
|
|
265
|
+
|Compression Ratio
|
|
266
|
+
|Excellent
|
|
267
|
+
|Excellent
|
|
268
|
+
|Similar quality
|
|
269
|
+
|
|
270
|
+
|Speed
|
|
271
|
+
|Slow
|
|
272
|
+
|Slow
|
|
273
|
+
|Similar performance
|
|
274
|
+
|
|
275
|
+
|Memory Usage
|
|
276
|
+
|High
|
|
277
|
+
|High
|
|
278
|
+
|Similar requirements
|
|
279
|
+
|
|
280
|
+
|Compatibility
|
|
281
|
+
|Wider
|
|
282
|
+
|Limited
|
|
283
|
+
|Use PPMd7 for compatibility
|
|
284
|
+
|
|
285
|
+
|Default in 7-Zip
|
|
286
|
+
|Yes
|
|
287
|
+
|No
|
|
288
|
+
|PPMd7 is standard
|
|
289
|
+
|===
|
|
290
|
+
|
|
291
|
+
**Recommendation**: Use PPMd7 unless you have specific reasons to use PPMd8.
|
|
292
|
+
|
|
293
|
+
== Optimization Tips
|
|
294
|
+
|
|
295
|
+
. **Match Memory to File Size**: Use larger memory for larger files
|
|
296
|
+
. **Higher Order for Repetitive Text**: Increase order for highly structured text
|
|
297
|
+
. **Use Solid Archives**: PPMd benefits greatly from solid compression
|
|
298
|
+
. **Skip Preprocessing Filters**: Filters such as BCJ target executables and are generally not needed for text
|
|
299
|
+
. **Monitor Memory Usage**: Ensure sufficient RAM available
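The benefit of solid compression for many small, similar text files can be illustrated without any archive library. The sketch below compresses 50 generated source files separately and as one concatenated stream, using `Zlib` as a stand-in compressor. The file contents are made up for the example:

```ruby
require 'zlib'

# 50 small, similar "source files" generated for the example.
files = (1..50).to_h do |i|
  body = "# frozen_string_literal: true\n\n" \
         "class Widget#{i}\n  def call(input)\n    input.to_s.strip\n  end\nend\n"
  ["widget#{i}.rb", body]
end

# Non-solid: every file is compressed on its own, so shared patterns
# have to be learned again for each file.
separate = files.values.sum { |body| Zlib::Deflate.deflate(body).bytesize }

# Solid: one stream over all files, so later files reuse earlier context.
solid = Zlib::Deflate.deflate(files.values.join).bytesize

puts "separate: #{separate} bytes"
puts "solid:    #{solid} bytes"
```

The same reasoning applies to a context-modeling compressor: statistics gathered on earlier files keep improving predictions for later ones.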
|
|
300
|
+
|
|
301
|
+
== Technical Details
|
|
302
|
+
|
|
303
|
+
**Algorithm Type**: Statistical context modeling
|
|
304
|
+
|
|
305
|
+
**Prediction Method**: Partial Matching with escape codes
|
|
306
|
+
|
|
307
|
+
**Context Length**: Configurable (order parameter)
|
|
308
|
+
|
|
309
|
+
**Memory Model**: Explicit memory size configuration
|
|
310
|
+
|
|
311
|
+
**Compatibility**: Standard PPMd variant H implementation
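The "escape" idea listed under Prediction Method can be sketched as follows: the coder starts at the longest available context and, when the next symbol has never been seen there, emits an escape and retries with a shorter context. This toy version only reports which order ends up coding the symbol; real PPMd also assigns probabilities to escapes and updates its model as it goes. All names are illustrative:

```ruby
# One count table per context length, from order 0 up to max_order.
def build_models(text, max_order)
  (0..max_order).map do |order|
    counts = Hash.new { |h, ctx| h[ctx] = Hash.new(0) }
    (order...text.length).each do |i|
      counts[text[i - order, order]][text[i]] += 1
    end
    counts
  end
end

# Returns [order_used, escapes] for coding +symbol+ after +history+.
# order_used is nil when the symbol is new at every order.
def coding_order(models, history, symbol)
  max_order = [models.size - 1, history.length].min
  max_order.downto(0).each_with_index do |order, escapes|
    context = order.zero? ? '' : history[-order, order]
    model = models[order]
    return [order, escapes] if model.key?(context) && model[context].key?(symbol)
  end
  [nil, max_order + 1]
end

models = build_models('abracadabra', 2)
p coding_order(models, 'ab', 'r') # [2, 0]
p coding_order(models, 'ab', 'c') # [0, 2]
p coding_order(models, 'ab', 'z') # [nil, 3]
```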
|
|
312
|
+
|
|
313
|
+
== See Also
|
|
314
|
+
|
|
315
|
+
* link:lzma2.html[LZMA2] - General-purpose alternative
|
|
316
|
+
* link:../../compatibility.html[Compatibility] - PPMd support across tools
|