omnizip 0.3.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/.rubocop.yml +32 -0
- data/.rubocop_todo.yml +754 -0
- data/COPYING +502 -0
- data/Gemfile +17 -0
- data/LICENSE +12 -0
- data/README.adoc +1045 -0
- data/Rakefile +12 -0
- data/benchmark/README.md +260 -0
- data/benchmark/benchmark_suite.rb +125 -0
- data/benchmark/compression_bench.rb +181 -0
- data/benchmark/filter_bench.rb +180 -0
- data/benchmark/models/benchmark_result.rb +59 -0
- data/benchmark/models/comparison_result.rb +69 -0
- data/benchmark/profile_suite.rb +167 -0
- data/benchmark/reporter.rb +150 -0
- data/benchmark/run_benchmarks.rb +66 -0
- data/benchmark/test_data.rb +137 -0
- data/config/formats/rar3_spec.yml +91 -0
- data/config/formats/rar5_spec.yml +102 -0
- data/docs/.github/workflows/docs.yml +142 -0
- data/docs/.gitignore +21 -0
- data/docs/.lychee.toml +67 -0
- data/docs/Gemfile +13 -0
- data/docs/RAR_WRITE_SUPPORT.md +26 -0
- data/docs/README.md +101 -0
- data/docs/_config.yml +112 -0
- data/docs/assets/logo.svg +1 -0
- data/docs/assets/omnizip-logo.pdf +1540 -11
- data/docs/comparison/feature-matrix.adoc +694 -0
- data/docs/comparison/index.adoc +113 -0
- data/docs/comparison/vs-7zip.adoc +309 -0
- data/docs/comparison/vs-peazip.adoc +77 -0
- data/docs/comparison/vs-rubyzip.adoc +342 -0
- data/docs/comparison/vs-winrar.adoc +100 -0
- data/docs/compatibility.adoc +579 -0
- data/docs/concepts/index.adoc +129 -0
- data/docs/developer/architecture.adoc +256 -0
- data/docs/developer/contributing.adoc +158 -0
- data/docs/developer/index.adoc +25 -0
- data/docs/developer/testing.adoc +212 -0
- data/docs/getting-started/basic-usage.adoc +271 -0
- data/docs/getting-started/index.adoc +42 -0
- data/docs/getting-started/installation.adoc +138 -0
- data/docs/getting-started/quick-start.adoc +185 -0
- data/docs/getting-started/your-first-archive.adoc +218 -0
- data/docs/guides/advanced-features/encryption.adoc +300 -0
- data/docs/guides/advanced-features/index.adoc +49 -0
- data/docs/guides/advanced-features/parallel-processing.adoc +246 -0
- data/docs/guides/advanced-features/progress-tracking.adoc +320 -0
- data/docs/guides/advanced-features/streaming.adoc +212 -0
- data/docs/guides/archive-formats/gzip-format.adoc +107 -0
- data/docs/guides/archive-formats/index.adoc +130 -0
- data/docs/guides/archive-formats/rar-format.adoc +104 -0
- data/docs/guides/archive-formats/rar5.adoc +521 -0
- data/docs/guides/archive-formats/seven-zip-format.adoc +35 -0
- data/docs/guides/archive-formats/tar-format.adoc +106 -0
- data/docs/guides/archive-formats/xz-format.adoc +118 -0
- data/docs/guides/archive-formats/zip-format.adoc +35 -0
- data/docs/guides/compression-algorithms/bzip2.adoc +113 -0
- data/docs/guides/compression-algorithms/deflate.adoc +319 -0
- data/docs/guides/compression-algorithms/index.adoc +190 -0
- data/docs/guides/compression-algorithms/lzma.adoc +398 -0
- data/docs/guides/compression-algorithms/lzma2.adoc +327 -0
- data/docs/guides/compression-algorithms/ppmd.adoc +316 -0
- data/docs/guides/compression-algorithms/zstandard.adoc +361 -0
- data/docs/guides/creating-archives.adoc +354 -0
- data/docs/guides/extracting-archives.adoc +53 -0
- data/docs/guides/format-conversion.adoc +64 -0
- data/docs/guides/index.adoc +49 -0
- data/docs/guides/migration-rubyzip.adoc +217 -0
- data/docs/guides/parity-archives.adoc +605 -0
- data/docs/guides/performance-tuning.adoc +88 -0
- data/docs/index.adoc +218 -0
- data/docs/lychee.toml +67 -0
- data/docs/reference/api/overview.adoc +188 -0
- data/docs/reference/cli/compress-command.adoc +114 -0
- data/docs/reference/cli/overview.adoc +140 -0
- data/docs/reference/index.adoc +26 -0
- data/docs/resources/faq.adoc +185 -0
- data/docs/resources/quick-reference.adoc +222 -0
- data/docs/troubleshooting/index.adoc +208 -0
- data/examples/api_comparison.rb +205 -0
- data/examples/deflate64_example.rb +96 -0
- data/examples/par2_demo.rb +121 -0
- data/examples/quick_start_native.rb +150 -0
- data/examples/quick_start_rubyzip.rb +115 -0
- data/examples/rubyzip_compatibility_demo.rb +194 -0
- data/exe/omnizip +27 -0
- data/lib/omnizip/algorithm.rb +130 -0
- data/lib/omnizip/algorithm_registry.rb +86 -0
- data/lib/omnizip/algorithms/.keep +0 -0
- data/lib/omnizip/algorithms/bzip2/bwt.rb +225 -0
- data/lib/omnizip/algorithms/bzip2/decoder.rb +193 -0
- data/lib/omnizip/algorithms/bzip2/encoder.rb +237 -0
- data/lib/omnizip/algorithms/bzip2/huffman.rb +206 -0
- data/lib/omnizip/algorithms/bzip2/mtf.rb +101 -0
- data/lib/omnizip/algorithms/bzip2/rle.rb +151 -0
- data/lib/omnizip/algorithms/bzip2.rb +130 -0
- data/lib/omnizip/algorithms/deflate/constants.rb +28 -0
- data/lib/omnizip/algorithms/deflate/decoder.rb +38 -0
- data/lib/omnizip/algorithms/deflate/encoder.rb +46 -0
- data/lib/omnizip/algorithms/deflate.rb +128 -0
- data/lib/omnizip/algorithms/deflate64/constants.rb +45 -0
- data/lib/omnizip/algorithms/deflate64/decoder.rb +153 -0
- data/lib/omnizip/algorithms/deflate64/encoder.rb +98 -0
- data/lib/omnizip/algorithms/deflate64/huffman_coder.rb +354 -0
- data/lib/omnizip/algorithms/deflate64/lz77_encoder.rb +142 -0
- data/lib/omnizip/algorithms/deflate64.rb +109 -0
- data/lib/omnizip/algorithms/lzma/bit_model.rb +120 -0
- data/lib/omnizip/algorithms/lzma/constants.rb +112 -0
- data/lib/omnizip/algorithms/lzma/decoder.rb +148 -0
- data/lib/omnizip/algorithms/lzma/dictionary.rb +69 -0
- data/lib/omnizip/algorithms/lzma/distance_coder.rb +415 -0
- data/lib/omnizip/algorithms/lzma/encoder.rb +142 -0
- data/lib/omnizip/algorithms/lzma/length_coder.rb +260 -0
- data/lib/omnizip/algorithms/lzma/literal_decoder.rb +320 -0
- data/lib/omnizip/algorithms/lzma/literal_encoder.rb +210 -0
- data/lib/omnizip/algorithms/lzma/lzip_decoder.rb +341 -0
- data/lib/omnizip/algorithms/lzma/lzma_alone_decoder.rb +192 -0
- data/lib/omnizip/algorithms/lzma/lzma_state.rb +128 -0
- data/lib/omnizip/algorithms/lzma/match.rb +32 -0
- data/lib/omnizip/algorithms/lzma/match_finder.rb +205 -0
- data/lib/omnizip/algorithms/lzma/match_finder_config.rb +142 -0
- data/lib/omnizip/algorithms/lzma/match_finder_factory.rb +88 -0
- data/lib/omnizip/algorithms/lzma/optimal_encoder.rb +130 -0
- data/lib/omnizip/algorithms/lzma/probability_models.rb +72 -0
- data/lib/omnizip/algorithms/lzma/range_coder.rb +85 -0
- data/lib/omnizip/algorithms/lzma/range_decoder.rb +434 -0
- data/lib/omnizip/algorithms/lzma/range_encoder.rb +194 -0
- data/lib/omnizip/algorithms/lzma/state.rb +127 -0
- data/lib/omnizip/algorithms/lzma/xz_buffered_range_encoder.rb +325 -0
- data/lib/omnizip/algorithms/lzma/xz_encoder.rb +426 -0
- data/lib/omnizip/algorithms/lzma/xz_encoder_fast.rb +645 -0
- data/lib/omnizip/algorithms/lzma/xz_match_finder_adapter.rb +227 -0
- data/lib/omnizip/algorithms/lzma/xz_price_calculator.rb +169 -0
- data/lib/omnizip/algorithms/lzma/xz_probability_models.rb +261 -0
- data/lib/omnizip/algorithms/lzma/xz_range_encoder.rb +223 -0
- data/lib/omnizip/algorithms/lzma/xz_range_encoder_exact.rb +331 -0
- data/lib/omnizip/algorithms/lzma/xz_state.rb +116 -0
- data/lib/omnizip/algorithms/lzma/xz_utils_decoder.rb +2055 -0
- data/lib/omnizip/algorithms/lzma.rb +238 -0
- data/lib/omnizip/algorithms/lzma2/chunk_manager.rb +182 -0
- data/lib/omnizip/algorithms/lzma2/constants.rb +41 -0
- data/lib/omnizip/algorithms/lzma2/encoder.rb +147 -0
- data/lib/omnizip/algorithms/lzma2/lzma2_chunk.rb +161 -0
- data/lib/omnizip/algorithms/lzma2/properties.rb +179 -0
- data/lib/omnizip/algorithms/lzma2/simple_lzma2_encoder.rb +127 -0
- data/lib/omnizip/algorithms/lzma2/xz_encoder_adapter.rb +85 -0
- data/lib/omnizip/algorithms/lzma2.rb +141 -0
- data/lib/omnizip/algorithms/ppmd7/constants.rb +74 -0
- data/lib/omnizip/algorithms/ppmd7/context.rb +154 -0
- data/lib/omnizip/algorithms/ppmd7/decoder.rb +126 -0
- data/lib/omnizip/algorithms/ppmd7/encoder.rb +163 -0
- data/lib/omnizip/algorithms/ppmd7/model.rb +248 -0
- data/lib/omnizip/algorithms/ppmd7/symbol_state.rb +57 -0
- data/lib/omnizip/algorithms/ppmd7.rb +116 -0
- data/lib/omnizip/algorithms/ppmd8/constants.rb +61 -0
- data/lib/omnizip/algorithms/ppmd8/context.rb +34 -0
- data/lib/omnizip/algorithms/ppmd8/decoder.rb +107 -0
- data/lib/omnizip/algorithms/ppmd8/encoder.rb +138 -0
- data/lib/omnizip/algorithms/ppmd8/model.rb +250 -0
- data/lib/omnizip/algorithms/ppmd8/restoration_method.rb +78 -0
- data/lib/omnizip/algorithms/ppmd8.rb +82 -0
- data/lib/omnizip/algorithms/ppmd_base.rb +138 -0
- data/lib/omnizip/algorithms/sevenzip_lzma2.rb +123 -0
- data/lib/omnizip/algorithms/xz_lzma2.rb +118 -0
- data/lib/omnizip/algorithms/zstandard/constants.rb +25 -0
- data/lib/omnizip/algorithms/zstandard/decoder.rb +46 -0
- data/lib/omnizip/algorithms/zstandard/encoder.rb +51 -0
- data/lib/omnizip/algorithms/zstandard.rb +138 -0
- data/lib/omnizip/buffer/memory_archive.rb +251 -0
- data/lib/omnizip/buffer/memory_extractor.rb +224 -0
- data/lib/omnizip/buffer.rb +176 -0
- data/lib/omnizip/checksum_registry.rb +114 -0
- data/lib/omnizip/checksums/crc32.rb +100 -0
- data/lib/omnizip/checksums/crc64.rb +101 -0
- data/lib/omnizip/checksums/crc_base.rb +158 -0
- data/lib/omnizip/checksums/verifier.rb +131 -0
- data/lib/omnizip/chunked/memory_manager.rb +194 -0
- data/lib/omnizip/chunked/reader.rb +78 -0
- data/lib/omnizip/chunked/writer.rb +120 -0
- data/lib/omnizip/chunked.rb +129 -0
- data/lib/omnizip/cli/output_formatter.rb +104 -0
- data/lib/omnizip/cli.rb +572 -0
- data/lib/omnizip/commands/.keep +0 -0
- data/lib/omnizip/commands/archive_create_command.rb +427 -0
- data/lib/omnizip/commands/archive_extract_command.rb +272 -0
- data/lib/omnizip/commands/archive_list_command.rb +218 -0
- data/lib/omnizip/commands/archive_repair_command.rb +131 -0
- data/lib/omnizip/commands/archive_verify_command.rb +117 -0
- data/lib/omnizip/commands/compress_command.rb +117 -0
- data/lib/omnizip/commands/decompress_command.rb +120 -0
- data/lib/omnizip/commands/list_command.rb +53 -0
- data/lib/omnizip/commands/metadata_command.rb +153 -0
- data/lib/omnizip/commands/parity_create_command.rb +122 -0
- data/lib/omnizip/commands/parity_repair_command.rb +122 -0
- data/lib/omnizip/commands/parity_verify_command.rb +124 -0
- data/lib/omnizip/commands/profile_list_command.rb +56 -0
- data/lib/omnizip/commands/profile_show_command.rb +44 -0
- data/lib/omnizip/convenience.rb +359 -0
- data/lib/omnizip/converter/conversion_registry.rb +49 -0
- data/lib/omnizip/converter/conversion_strategy.rb +121 -0
- data/lib/omnizip/converter/seven_zip_to_zip_strategy.rb +97 -0
- data/lib/omnizip/converter/zip_to_seven_zip_strategy.rb +112 -0
- data/lib/omnizip/converter.rb +105 -0
- data/lib/omnizip/crypto/aes256/cipher.rb +100 -0
- data/lib/omnizip/crypto/aes256/constants.rb +28 -0
- data/lib/omnizip/crypto/aes256/key_derivation.rb +101 -0
- data/lib/omnizip/crypto/aes256.rb +102 -0
- data/lib/omnizip/error.rb +106 -0
- data/lib/omnizip/eta/exponential_smoothing_estimator.rb +98 -0
- data/lib/omnizip/eta/moving_average_estimator.rb +99 -0
- data/lib/omnizip/eta/rate_calculator.rb +104 -0
- data/lib/omnizip/eta/sample_history.rb +143 -0
- data/lib/omnizip/eta/time_estimator.rb +106 -0
- data/lib/omnizip/eta.rb +63 -0
- data/lib/omnizip/extraction/filter_chain.rb +177 -0
- data/lib/omnizip/extraction/glob_pattern.rb +140 -0
- data/lib/omnizip/extraction/pattern_matcher.rb +70 -0
- data/lib/omnizip/extraction/predicate_pattern.rb +52 -0
- data/lib/omnizip/extraction/regex_pattern.rb +50 -0
- data/lib/omnizip/extraction/selective_extractor.rb +240 -0
- data/lib/omnizip/extraction.rb +111 -0
- data/lib/omnizip/file_type/mime_classifier.rb +144 -0
- data/lib/omnizip/file_type.rb +113 -0
- data/lib/omnizip/filter.rb +139 -0
- data/lib/omnizip/filter_pipeline.rb +108 -0
- data/lib/omnizip/filter_registry.rb +166 -0
- data/lib/omnizip/filters/bcj.rb +279 -0
- data/lib/omnizip/filters/bcj2/constants.rb +53 -0
- data/lib/omnizip/filters/bcj2/decoder.rb +200 -0
- data/lib/omnizip/filters/bcj2/encoder.rb +61 -0
- data/lib/omnizip/filters/bcj2/stream_data.rb +93 -0
- data/lib/omnizip/filters/bcj2.rb +99 -0
- data/lib/omnizip/filters/bcj_arm.rb +176 -0
- data/lib/omnizip/filters/bcj_arm64.rb +244 -0
- data/lib/omnizip/filters/bcj_ia64.rb +196 -0
- data/lib/omnizip/filters/bcj_ppc.rb +190 -0
- data/lib/omnizip/filters/bcj_sparc.rb +176 -0
- data/lib/omnizip/filters/bcj_x86.rb +193 -0
- data/lib/omnizip/filters/delta.rb +196 -0
- data/lib/omnizip/filters/filter_base.rb +72 -0
- data/lib/omnizip/filters/registry.rb +123 -0
- data/lib/omnizip/filters/xz_delta.rb +258 -0
- data/lib/omnizip/format_detector.rb +162 -0
- data/lib/omnizip/format_registry.rb +59 -0
- data/lib/omnizip/formats/.keep +0 -0
- data/lib/omnizip/formats/bzip2_file.rb +172 -0
- data/lib/omnizip/formats/cpio/constants.rb +55 -0
- data/lib/omnizip/formats/cpio/entry.rb +385 -0
- data/lib/omnizip/formats/cpio/reader.rb +196 -0
- data/lib/omnizip/formats/cpio/writer.rb +234 -0
- data/lib/omnizip/formats/cpio.rb +140 -0
- data/lib/omnizip/formats/format_spec_loader.rb +230 -0
- data/lib/omnizip/formats/gzip.rb +238 -0
- data/lib/omnizip/formats/iso/directory_builder.rb +297 -0
- data/lib/omnizip/formats/iso/directory_record.rb +152 -0
- data/lib/omnizip/formats/iso/joliet.rb +204 -0
- data/lib/omnizip/formats/iso/path_table.rb +125 -0
- data/lib/omnizip/formats/iso/reader.rb +197 -0
- data/lib/omnizip/formats/iso/rock_ridge.rb +349 -0
- data/lib/omnizip/formats/iso/volume_builder.rb +320 -0
- data/lib/omnizip/formats/iso/volume_descriptor.rb +168 -0
- data/lib/omnizip/formats/iso/writer.rb +530 -0
- data/lib/omnizip/formats/iso.rb +140 -0
- data/lib/omnizip/formats/lzip.rb +175 -0
- data/lib/omnizip/formats/lzma_alone.rb +171 -0
- data/lib/omnizip/formats/rar/archive_repairer.rb +243 -0
- data/lib/omnizip/formats/rar/archive_verifier.rb +195 -0
- data/lib/omnizip/formats/rar/block_parser.rb +243 -0
- data/lib/omnizip/formats/rar/compression/bit_stream.rb +180 -0
- data/lib/omnizip/formats/rar/compression/dispatcher.rb +217 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/decoder.rb +216 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/encoder.rb +158 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/huffman_builder.rb +217 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/huffman_coder.rb +189 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/match_finder.rb +135 -0
- data/lib/omnizip/formats/rar/compression/lz77_huffman/sliding_window.rb +165 -0
- data/lib/omnizip/formats/rar/compression/ppmd/context.rb +105 -0
- data/lib/omnizip/formats/rar/compression/ppmd/decoder.rb +219 -0
- data/lib/omnizip/formats/rar/compression/ppmd/encoder.rb +262 -0
- data/lib/omnizip/formats/rar/compression_method_registry.rb +106 -0
- data/lib/omnizip/formats/rar/constants.rb +82 -0
- data/lib/omnizip/formats/rar/decompressor.rb +238 -0
- data/lib/omnizip/formats/rar/external_writer.rb +312 -0
- data/lib/omnizip/formats/rar/header.rb +192 -0
- data/lib/omnizip/formats/rar/license_validator.rb +109 -0
- data/lib/omnizip/formats/rar/models/rar_archive.rb +77 -0
- data/lib/omnizip/formats/rar/models/rar_entry.rb +65 -0
- data/lib/omnizip/formats/rar/models/rar_volume.rb +56 -0
- data/lib/omnizip/formats/rar/parity_handler.rb +292 -0
- data/lib/omnizip/formats/rar/rar5/compression/lzma.rb +202 -0
- data/lib/omnizip/formats/rar/rar5/compression/lzss.rb +578 -0
- data/lib/omnizip/formats/rar/rar5/compression/store.rb +60 -0
- data/lib/omnizip/formats/rar/rar5/crc32.rb +39 -0
- data/lib/omnizip/formats/rar/rar5/encryption/aes256_cbc.rb +97 -0
- data/lib/omnizip/formats/rar/rar5/encryption/encryption_header.rb +114 -0
- data/lib/omnizip/formats/rar/rar5/encryption/encryption_manager.rb +166 -0
- data/lib/omnizip/formats/rar/rar5/encryption/key_derivation.rb +97 -0
- data/lib/omnizip/formats/rar/rar5/header.rb +187 -0
- data/lib/omnizip/formats/rar/rar5/models/encryption_options.rb +74 -0
- data/lib/omnizip/formats/rar/rar5/models/recovery_options.rb +63 -0
- data/lib/omnizip/formats/rar/rar5/models/solid_options.rb +63 -0
- data/lib/omnizip/formats/rar/rar5/models/volume_options.rb +74 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/ARCHITECTURE.md +290 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/volume_manager.rb +264 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/volume_splitter.rb +155 -0
- data/lib/omnizip/formats/rar/rar5/multi_volume/volume_writer.rb +194 -0
- data/lib/omnizip/formats/rar/rar5/solid/solid_encoder.rb +109 -0
- data/lib/omnizip/formats/rar/rar5/solid/solid_manager.rb +142 -0
- data/lib/omnizip/formats/rar/rar5/solid/solid_stream.rb +121 -0
- data/lib/omnizip/formats/rar/rar5/vint.rb +65 -0
- data/lib/omnizip/formats/rar/rar5/writer.rb +466 -0
- data/lib/omnizip/formats/rar/rar_format_base.rb +241 -0
- data/lib/omnizip/formats/rar/reader.rb +366 -0
- data/lib/omnizip/formats/rar/recovery_record.rb +245 -0
- data/lib/omnizip/formats/rar/volume_manager.rb +168 -0
- data/lib/omnizip/formats/rar/writer.rb +431 -0
- data/lib/omnizip/formats/rar.rb +205 -0
- data/lib/omnizip/formats/rar3/compressor.rb +73 -0
- data/lib/omnizip/formats/rar3/decompressor.rb +66 -0
- data/lib/omnizip/formats/rar3/reader.rb +386 -0
- data/lib/omnizip/formats/rar3/writer.rb +219 -0
- data/lib/omnizip/formats/rar5/compressor.rb +73 -0
- data/lib/omnizip/formats/rar5/decompressor.rb +66 -0
- data/lib/omnizip/formats/rar5/reader.rb +342 -0
- data/lib/omnizip/formats/rar5/writer.rb +214 -0
- data/lib/omnizip/formats/seven_zip/coder_chain.rb +150 -0
- data/lib/omnizip/formats/seven_zip/constants.rb +126 -0
- data/lib/omnizip/formats/seven_zip/encoded_header.rb +114 -0
- data/lib/omnizip/formats/seven_zip/encrypted_header.rb +142 -0
- data/lib/omnizip/formats/seven_zip/file_collector.rb +144 -0
- data/lib/omnizip/formats/seven_zip/header.rb +106 -0
- data/lib/omnizip/formats/seven_zip/header_encryptor.rb +134 -0
- data/lib/omnizip/formats/seven_zip/header_writer.rb +466 -0
- data/lib/omnizip/formats/seven_zip/models/coder_info.rb +30 -0
- data/lib/omnizip/formats/seven_zip/models/file_entry.rb +58 -0
- data/lib/omnizip/formats/seven_zip/models/folder.rb +69 -0
- data/lib/omnizip/formats/seven_zip/models/stream_info.rb +42 -0
- data/lib/omnizip/formats/seven_zip/parser.rb +660 -0
- data/lib/omnizip/formats/seven_zip/reader.rb +458 -0
- data/lib/omnizip/formats/seven_zip/split_archive_reader.rb +632 -0
- data/lib/omnizip/formats/seven_zip/split_archive_writer.rb +315 -0
- data/lib/omnizip/formats/seven_zip/stream_compressor.rb +151 -0
- data/lib/omnizip/formats/seven_zip/stream_decompressor.rb +162 -0
- data/lib/omnizip/formats/seven_zip/writer.rb +740 -0
- data/lib/omnizip/formats/seven_zip.rb +93 -0
- data/lib/omnizip/formats/tar/constants.rb +73 -0
- data/lib/omnizip/formats/tar/entry.rb +94 -0
- data/lib/omnizip/formats/tar/header.rb +168 -0
- data/lib/omnizip/formats/tar/reader.rb +121 -0
- data/lib/omnizip/formats/tar/writer.rb +216 -0
- data/lib/omnizip/formats/tar.rb +84 -0
- data/lib/omnizip/formats/xz/reader.rb +116 -0
- data/lib/omnizip/formats/xz.rb +237 -0
- data/lib/omnizip/formats/xz_impl/block_decoder.rb +754 -0
- data/lib/omnizip/formats/xz_impl/block_encoder.rb +306 -0
- data/lib/omnizip/formats/xz_impl/block_header.rb +210 -0
- data/lib/omnizip/formats/xz_impl/block_header_parser.rb +186 -0
- data/lib/omnizip/formats/xz_impl/constants.rb +49 -0
- data/lib/omnizip/formats/xz_impl/index_decoder.rb +174 -0
- data/lib/omnizip/formats/xz_impl/index_encoder.rb +122 -0
- data/lib/omnizip/formats/xz_impl/stream_decoder.rb +468 -0
- data/lib/omnizip/formats/xz_impl/stream_encoder.rb +99 -0
- data/lib/omnizip/formats/xz_impl/stream_footer.rb +81 -0
- data/lib/omnizip/formats/xz_impl/stream_footer_parser.rb +117 -0
- data/lib/omnizip/formats/xz_impl/stream_header.rb +55 -0
- data/lib/omnizip/formats/xz_impl/stream_header_parser.rb +108 -0
- data/lib/omnizip/formats/xz_impl/vli.rb +128 -0
- data/lib/omnizip/formats/xz_impl/writer.rb +421 -0
- data/lib/omnizip/formats/zip/central_directory_header.rb +195 -0
- data/lib/omnizip/formats/zip/constants.rb +69 -0
- data/lib/omnizip/formats/zip/end_of_central_directory.rb +133 -0
- data/lib/omnizip/formats/zip/local_file_header.rb +138 -0
- data/lib/omnizip/formats/zip/reader.rb +250 -0
- data/lib/omnizip/formats/zip/unix_extra_field.rb +153 -0
- data/lib/omnizip/formats/zip/writer.rb +375 -0
- data/lib/omnizip/formats/zip/zip64_end_of_central_directory.rb +104 -0
- data/lib/omnizip/formats/zip/zip64_end_of_central_directory_locator.rb +66 -0
- data/lib/omnizip/formats/zip/zip64_extra_field.rb +114 -0
- data/lib/omnizip/formats/zip.rb +50 -0
- data/lib/omnizip/implementations/base/lzma2_decoder_base.rb +75 -0
- data/lib/omnizip/implementations/base/lzma2_encoder_base.rb +128 -0
- data/lib/omnizip/implementations/base/lzma_decoder_base.rb +83 -0
- data/lib/omnizip/implementations/base/lzma_encoder_base.rb +108 -0
- data/lib/omnizip/implementations/base/state_machine_base.rb +182 -0
- data/lib/omnizip/implementations/seven_zip/lzma/decoder.rb +421 -0
- data/lib/omnizip/implementations/seven_zip/lzma/encoder.rb +465 -0
- data/lib/omnizip/implementations/seven_zip/lzma/match_finder.rb +288 -0
- data/lib/omnizip/implementations/seven_zip/lzma/range_decoder.rb +200 -0
- data/lib/omnizip/implementations/seven_zip/lzma/range_encoder.rb +197 -0
- data/lib/omnizip/implementations/seven_zip/lzma/state_machine.rb +141 -0
- data/lib/omnizip/implementations/seven_zip/lzma2/encoder.rb +519 -0
- data/lib/omnizip/implementations/xz_utils/lzma2/decoder.rb +723 -0
- data/lib/omnizip/implementations/xz_utils/lzma2/encoder.rb +750 -0
- data/lib/omnizip/io/buffered_input.rb +146 -0
- data/lib/omnizip/io/buffered_output.rb +105 -0
- data/lib/omnizip/io/stream_manager.rb +115 -0
- data/lib/omnizip/link_handler/hard_link.rb +79 -0
- data/lib/omnizip/link_handler/symbolic_link.rb +74 -0
- data/lib/omnizip/link_handler.rb +124 -0
- data/lib/omnizip/metadata/archive_metadata.rb +114 -0
- data/lib/omnizip/metadata/entry_metadata.rb +146 -0
- data/lib/omnizip/metadata/metadata_editor.rb +171 -0
- data/lib/omnizip/metadata/metadata_registry.rb +64 -0
- data/lib/omnizip/metadata/metadata_validator.rb +99 -0
- data/lib/omnizip/metadata.rb +57 -0
- data/lib/omnizip/models/.keep +0 -0
- data/lib/omnizip/models/algorithm_metadata.rb +73 -0
- data/lib/omnizip/models/compression_options.rb +71 -0
- data/lib/omnizip/models/conversion_options.rb +87 -0
- data/lib/omnizip/models/conversion_result.rb +135 -0
- data/lib/omnizip/models/eta_result.rb +46 -0
- data/lib/omnizip/models/extraction_rule.rb +115 -0
- data/lib/omnizip/models/filter_chain.rb +144 -0
- data/lib/omnizip/models/filter_config.rb +183 -0
- data/lib/omnizip/models/match_result.rb +124 -0
- data/lib/omnizip/models/optimization_suggestion.rb +91 -0
- data/lib/omnizip/models/parallel_options.rb +104 -0
- data/lib/omnizip/models/performance_result.rb +79 -0
- data/lib/omnizip/models/profile_report.rb +82 -0
- data/lib/omnizip/models/progress_options.rb +38 -0
- data/lib/omnizip/models/split_options.rb +116 -0
- data/lib/omnizip/optimization_registry.rb +81 -0
- data/lib/omnizip/parallel/job_queue.rb +209 -0
- data/lib/omnizip/parallel/job_scheduler.rb +203 -0
- data/lib/omnizip/parallel/parallel_compressor.rb +347 -0
- data/lib/omnizip/parallel/parallel_extractor.rb +329 -0
- data/lib/omnizip/parallel/worker_pool.rb +223 -0
- data/lib/omnizip/parallel.rb +149 -0
- data/lib/omnizip/parity/chunked_block_processor.rb +196 -0
- data/lib/omnizip/parity/galois16.rb +145 -0
- data/lib/omnizip/parity/models/creator_packet.rb +73 -0
- data/lib/omnizip/parity/models/file_description_packet.rb +133 -0
- data/lib/omnizip/parity/models/ifsc_packet.rb +123 -0
- data/lib/omnizip/parity/models/main_packet.rb +128 -0
- data/lib/omnizip/parity/models/packet.rb +156 -0
- data/lib/omnizip/parity/models/packet_registry.rb +109 -0
- data/lib/omnizip/parity/models/recovery_slice_packet.rb +78 -0
- data/lib/omnizip/parity/par2_creator.rb +531 -0
- data/lib/omnizip/parity/par2_repairer.rb +407 -0
- data/lib/omnizip/parity/par2_verifier.rb +364 -0
- data/lib/omnizip/parity/par2cmdline_algorithm.rb +110 -0
- data/lib/omnizip/parity/par2cmdline_coefficients.rb +78 -0
- data/lib/omnizip/parity/reed_solomon_decoder.rb +266 -0
- data/lib/omnizip/parity/reed_solomon_encoder.rb +111 -0
- data/lib/omnizip/parity/reed_solomon_matrix.rb +342 -0
- data/lib/omnizip/parity.rb +186 -0
- data/lib/omnizip/password/encryption_registry.rb +65 -0
- data/lib/omnizip/password/encryption_strategy.rb +96 -0
- data/lib/omnizip/password/password_validator.rb +129 -0
- data/lib/omnizip/password/winzip_aes_strategy.rb +192 -0
- data/lib/omnizip/password/zip_crypto_strategy.rb +141 -0
- data/lib/omnizip/password.rb +87 -0
- data/lib/omnizip/pipe/stream_compressor.rb +124 -0
- data/lib/omnizip/pipe/stream_decompressor.rb +174 -0
- data/lib/omnizip/pipe.rb +121 -0
- data/lib/omnizip/platform/ntfs_streams.rb +201 -0
- data/lib/omnizip/platform.rb +189 -0
- data/lib/omnizip/profile/archive_profile.rb +39 -0
- data/lib/omnizip/profile/balanced_profile.rb +33 -0
- data/lib/omnizip/profile/binary_profile.rb +36 -0
- data/lib/omnizip/profile/compression_profile.rb +158 -0
- data/lib/omnizip/profile/custom_profile.rb +157 -0
- data/lib/omnizip/profile/fast_profile.rb +33 -0
- data/lib/omnizip/profile/maximum_profile.rb +33 -0
- data/lib/omnizip/profile/profile_detector.rb +110 -0
- data/lib/omnizip/profile/profile_registry.rb +161 -0
- data/lib/omnizip/profile/text_profile.rb +36 -0
- data/lib/omnizip/profile.rb +190 -0
- data/lib/omnizip/profiler/memory_profiler.rb +66 -0
- data/lib/omnizip/profiler/method_profiler.rb +49 -0
- data/lib/omnizip/profiler/report_generator.rb +169 -0
- data/lib/omnizip/profiler.rb +204 -0
- data/lib/omnizip/progress/callback_reporter.rb +36 -0
- data/lib/omnizip/progress/console_reporter.rb +62 -0
- data/lib/omnizip/progress/log_reporter.rb +91 -0
- data/lib/omnizip/progress/operation_progress.rb +118 -0
- data/lib/omnizip/progress/progress_bar.rb +156 -0
- data/lib/omnizip/progress/progress_reporter.rb +40 -0
- data/lib/omnizip/progress/progress_tracker.rb +190 -0
- data/lib/omnizip/progress/silent_reporter.rb +24 -0
- data/lib/omnizip/progress.rb +127 -0
- data/lib/omnizip/rubyzip_compat.rb +63 -0
- data/lib/omnizip/temp/safe_extract.rb +168 -0
- data/lib/omnizip/temp/temp_file.rb +124 -0
- data/lib/omnizip/temp/temp_file_pool.rb +109 -0
- data/lib/omnizip/temp.rb +181 -0
- data/lib/omnizip/version.rb +5 -0
- data/lib/omnizip/zip/entry.rb +156 -0
- data/lib/omnizip/zip/file.rb +485 -0
- data/lib/omnizip/zip/input_stream.rb +273 -0
- data/lib/omnizip/zip/output_stream.rb +324 -0
- data/lib/omnizip.rb +156 -0
- data/readme-docs/advanced-features.adoc +515 -0
- data/readme-docs/api-usage.adoc +444 -0
- data/readme-docs/architecture.adoc +449 -0
- data/readme-docs/archive-formats.adoc +479 -0
- data/readme-docs/cli-usage.adoc +222 -0
- data/readme-docs/compression-algorithms.adoc +442 -0
- data/readme-docs/compression-profiles.adoc +247 -0
- data/readme-docs/encryption-checksums.adoc +328 -0
- data/readme-docs/format-converter.adoc +325 -0
- data/readme-docs/installation.adoc +228 -0
- data/readme-docs/par2-archives.adoc +608 -0
- data/readme-docs/performance-profiler.adoc +389 -0
- data/readme-docs/preprocessing-filters.adoc +280 -0
- data/xz-file-format-1.2.1.txt +1174 -0
- metadata +617 -0
@@ -0,0 +1,389 @@
= Performance Profiler
:toc:
:toclevels: 3

== Purpose

The Performance Profiler provides comprehensive profiling and optimization tools to identify bottlenecks and improve compression performance.

== Features

* **Method profiling** - Track execution time and call counts
* **Memory profiling** - Monitor allocation and retention
* **Hot path analysis** - Identify performance bottlenecks
* **Optimization suggestions** - AI-powered recommendations
* **Report generation** - Formatted profiling reports
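
To make the "execution time and call counts" idea concrete, the following is a minimal sketch using only the Ruby standard library. `TinyProfiler`, `Stat`, and `measure` are hypothetical names chosen for illustration. They are not part of Omnizip, and the gem's own profiler may work differently.

```ruby
# Hypothetical helper, not part of Omnizip: accumulates wall-clock time and
# call counts per label using only the Ruby standard library.
class TinyProfiler
  Stat = Struct.new(:calls, :total_time)

  attr_reader :stats

  def initialize
    # Each new label starts with zero calls and zero accumulated time.
    @stats = Hash.new { |hash, label| hash[label] = Stat.new(0, 0.0) }
  end

  # Runs the block, records its duration under +label+, and returns the
  # block's own return value. The ensure clause records the sample even if
  # the block raises.
  def measure(label)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    stat = @stats[label]
    stat.calls += 1
    stat.total_time += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

profiler = TinyProfiler.new
3.times { profiler.measure("sum") { (1..10_000).sum } }
profiler.measure("sort") { Array.new(1_000) { rand }.sort }

profiler.stats.each do |label, stat|
  puts format("%-5s calls=%d total=%.6fs", label, stat.calls, stat.total_time)
end
```

A monotonic clock is used rather than `Time.now` so that system clock adjustments cannot produce negative or inflated durations.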

== Basic Profiling

=== Profile a Block of Code

[source,ruby]
----
# Simple profiling
result = Omnizip::Profiler.profile do
  Omnizip::Formats::SevenZip::Writer.new('archive.7z') do |zip|
    zip.add_file('large_file.dat')
  end
end

puts "Execution time: #{result.total_time}s"
puts "Memory allocated: #{result.memory_allocated} bytes"
----
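
For readers who want to see what a block-level measurement might involve, here is a rough sketch that does not use Omnizip at all. `measure_block` is a hypothetical helper, and it assumes CRuby (MRI), where `GC.stat(:total_allocated_objects)` is available. It counts allocated objects as a proxy for memory use, which is not the same as the byte figure reported by `memory_allocated` above.

```ruby
# Hypothetical helper, not Omnizip's implementation: reports elapsed seconds
# and the number of objects allocated while the block ran (CRuby only).
def measure_block
  objects_before = GC.stat(:total_allocated_objects)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  objects_after = GC.stat(:total_allocated_objects)

  { total_time: elapsed, objects_allocated: objects_after - objects_before }
end

result = measure_block { Array.new(10_000) { |i| "item #{i}" } }

puts "Execution time: #{result[:total_time].round(6)}s"
puts "Objects allocated: #{result[:objects_allocated]}"
```

The allocation counter only ever increases, so the difference between the two readings is unaffected by garbage collection that happens while the block runs.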

=== Profile with Custom Name

[source,ruby]
----
profiler = Omnizip::Profiler.new(profile_name: "compression_test")

profiler.profile("LZMA compression") do
  algorithm = Omnizip::AlgorithmRegistry.get(:lzma).new(level: 9)
  File.open('input.txt', 'rb') do |input|
    File.open('output.lzma', 'wb') do |output|
      algorithm.compress(input, output)
    end
  end
end

# Get profiling report
report = profiler.report
puts "Total execution time: #{report.total_execution_time}s"
----

== Hot Path Analysis

=== Identify Performance Bottlenecks

[source,ruby]
----
profiler = Omnizip::Profiler.new

# Profile multiple operations
profiler.profile("read_file") { File.read('data.txt') }
profiler.profile("compress") { compress_data(data) }
profiler.profile("write_file") { File.write('output.dat', compressed) }

# Analyze hot paths (operations >10% of total time)
hot_paths = profiler.analyze_hot_paths(threshold_percentage: 10.0)

hot_paths.each do |operation|
  puts "Hot path: #{operation.operation_name}"
  puts " Time: #{operation.total_time}s"
  puts " Percentage: #{(operation.total_time / profiler.report.total_execution_time * 100).round(1)}%"
end
----
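
The threshold rule in the comment above (operations taking more than 10% of total time) can be illustrated with plain Ruby. This is only a sketch of the arithmetic. The `hot_paths` method and the sample timings are hypothetical, and nothing here reflects how `analyze_hot_paths` is actually implemented.

```ruby
# Hypothetical timings in seconds; in practice these would come from a profiler.
timings = {
  "read_file"  => 0.05,
  "compress"   => 0.80,
  "write_file" => 0.15
}

# Returns [name, seconds, share_percent] for every operation whose share of
# the total time exceeds +threshold_percentage+, slowest first.
def hot_paths(timings, threshold_percentage: 10.0)
  total = timings.values.sum
  return [] if total.zero?

  timings
    .map { |name, seconds| [name, seconds, seconds / total * 100] }
    .select { |_name, _seconds, share| share > threshold_percentage }
    .sort_by { |_name, seconds, _share| -seconds }
end

hot_paths(timings).each do |name, seconds, share|
  puts "Hot path: #{name} (#{seconds}s, #{share.round(1)}%)"
end
```

With these sample numbers, `read_file` falls below the threshold and is left out, while `compress` and `write_file` are reported in descending order of time.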

== Bottleneck Identification

=== Find CPU and Memory Bottlenecks

[source,ruby]
----
profiler = Omnizip::Profiler.new

# Profile compression pipeline
profiler.profile("BWT") { bwt_transform(data) }
profiler.profile("MTF") { mtf_encode(transformed) }
profiler.profile("Huffman") { huffman_encode(encoded) }

# Identify bottlenecks
bottlenecks = profiler.identify_bottlenecks

bottlenecks.each do |bottleneck|
  case bottleneck[:type]
  when :cpu
    puts "CPU bottleneck: #{bottleneck[:operation]}"
    puts " Time: #{bottleneck[:time]}s"
    puts " Severity: #{bottleneck[:severity]}"
  when :memory
    puts "Memory bottleneck: #{bottleneck[:operation]}"
    puts " Allocated: #{bottleneck[:allocated]} bytes"
  when :gc
    puts "GC pressure: #{bottleneck[:operation]}"
    puts " GC pressure: #{bottleneck[:gc_pressure]}"
  end
end
----
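
One way to think about the `:cpu` and `:memory` entries above is as measurements that cross some threshold. The sketch below shows that idea in isolation. The `classify` method, the sample data, and both threshold values are assumptions made up for illustration. The criteria Omnizip actually applies in `identify_bottlenecks` are not documented in this section.

```ruby
# Hypothetical measurements; the keys mirror the hash keys used in the
# example above, but the numbers are invented.
measurements = [
  { operation: "BWT",     time: 1.20, allocated: 4_000_000 },
  { operation: "MTF",     time: 0.10, allocated: 90_000_000 },
  { operation: "Huffman", time: 0.05, allocated: 1_000_000 }
]

# Flags an operation as a :cpu bottleneck when it takes more than +cpu_share+
# of the total time, and as a :memory bottleneck when it allocates more than
# +memory_bytes+. Both defaults are arbitrary illustration values.
def classify(measurements, cpu_share: 0.5, memory_bytes: 50_000_000)
  total_time = measurements.sum { |m| m[:time] }

  measurements.flat_map do |m|
    found = []
    if total_time.positive? && m[:time] / total_time > cpu_share
      found << { type: :cpu, operation: m[:operation], time: m[:time] }
    end
    if m[:allocated] > memory_bytes
      found << { type: :memory, operation: m[:operation], allocated: m[:allocated] }
    end
    found
  end
end

classify(measurements).each do |bottleneck|
  puts "#{bottleneck[:type]} bottleneck: #{bottleneck[:operation]}"
end
```

Because the result is a flat array of hashes keyed by `:type`, it can be consumed with the same `case bottleneck[:type]` pattern shown above.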

== Optimization Suggestions

=== Generate Improvement Recommendations

[source,ruby]
----
profiler = Omnizip::Profiler.new

# Run profiling
10.times do |i|
  profiler.profile("iteration_#{i}") do
    # Compression operations
  end
end

# Generate suggestions
suggestions = profiler.generate_suggestions

suggestions.each do |suggestion|
  puts "\n#{suggestion.title}"
  puts " #{suggestion.description}"
  puts " Severity: #{suggestion.severity}"
  puts " Category: #{suggestion.category}"
  puts " Estimated impact: #{(suggestion.impact_estimate * 100).round(1)}%"

  if suggestion.recommendation
    puts " Recommendation: #{suggestion.recommendation}"
  end
end
----
|
|
139
|
+
|
|
140
|
+
== Profiling Reports
|
|
141
|
+
|
|
142
|
+
=== Generate Detailed Reports
|
|
143
|
+
|
|
144
|
+
[source,ruby]
|
|
145
|
+
----
|
|
146
|
+
profiler = Omnizip::Profiler.new(profile_name: "BZip2 Compression")
|
|
147
|
+
|
|
148
|
+
# Profile operations
|
|
149
|
+
profiler.profile("initialization") { setup_compressor }
|
|
150
|
+
profiler.profile("bwt_transform") { bwt.transform(data) }
|
|
151
|
+
profiler.profile("mtf_encoding") { mtf.encode(transformed) }
|
|
152
|
+
profiler.profile("huffman_coding") { huffman.encode(encoded) }
|
|
153
|
+
profiler.profile("finalization") { write_output }
|
|
154
|
+
|
|
155
|
+
# Get detailed report
|
|
156
|
+
report = profiler.report
|
|
157
|
+
|
|
158
|
+
puts "=== Profiling Report: #{report.profile_name} ==="
|
|
159
|
+
puts "\nTotal execution time: #{report.total_execution_time}s"
|
|
160
|
+
puts "Total memory allocated: #{report.total_memory_allocated} bytes"
|
|
161
|
+
puts "Operations profiled: #{report.results.size}"
|
|
162
|
+
|
|
163
|
+
puts "\n=== Slowest Operations ==="
|
|
164
|
+
report.slowest_operations(limit: 3).each do |op|
|
|
165
|
+
percentage = (op.total_time / report.total_execution_time * 100).round(1)
|
|
166
|
+
puts " #{op.operation_name}: #{op.total_time}s (#{percentage}%)"
|
|
167
|
+
end
|
|
168
|
+
|
|
169
|
+
puts "\n=== Memory Intensive Operations ==="
|
|
170
|
+
report.memory_intensive_operations(limit: 3).each do |op|
|
|
171
|
+
mb = (op.memory_allocated.to_f / (1024 * 1024)).round(2)
|
|
172
|
+
puts " #{op.operation_name}: #{mb}MB"
|
|
173
|
+
end
|
|
174
|
+
----
|
|
175
|
+
|
|
176
|
+
== Method Profiling
|
|
177
|
+
|
|
178
|
+
=== Profile Specific Methods
|
|
179
|
+
|
|
180
|
+
[source,ruby]
|
|
181
|
+
----
|
|
182
|
+
profiler = Omnizip::Profiler.new
|
|
183
|
+
|
|
184
|
+
# Register method profiler
|
|
185
|
+
method_profiler = Omnizip::Profiler::MethodProfiler.new
|
|
186
|
+
profiler.register_profiler(:method, method_profiler)
|
|
187
|
+
|
|
188
|
+
# Profile method calls
|
|
189
|
+
algorithm = Omnizip::AlgorithmRegistry.get(:lzma).new
|
|
190
|
+
profiler.profile_method(algorithm, :compress, input, output)
|
|
191
|
+
|
|
192
|
+
# Check results
|
|
193
|
+
results = profiler.report.results.find { |r| r.operation_name.include?('compress') }
|
|
194
|
+
puts "Method calls: #{results.call_count}"
|
|
195
|
+
puts "Average time per call: #{results.average_time}s"
|
|
196
|
+
----
|
|
197
|
+
|
|
198
|
+
== Memory Profiling
|
|
199
|
+
|
|
200
|
+
=== Track Memory Allocation
|
|
201
|
+
|
|
202
|
+
[source,ruby]
|
|
203
|
+
----
|
|
204
|
+
profiler = Omnizip::Profiler.new
|
|
205
|
+
|
|
206
|
+
# Register memory profiler
|
|
207
|
+
memory_profiler = Omnizip::Profiler::MemoryProfiler.new
|
|
208
|
+
profiler.register_profiler(:memory, memory_profiler)
|
|
209
|
+
|
|
210
|
+
# Profile with memory tracking
|
|
211
|
+
profiler.profile("data_processing", profiler_type: :memory) do
|
|
212
|
+
data = Array.new(1_000_000) { rand }
|
|
213
|
+
data.map { |x| x * 2 }
|
|
214
|
+
end
|
|
215
|
+
|
|
216
|
+
# Check memory usage
|
|
217
|
+
report = profiler.report
|
|
218
|
+
puts "Memory allocated: #{report.total_memory_allocated} bytes"
|
|
219
|
+
puts "Memory retained: #{report.total_memory_retained} bytes"
|
|
220
|
+
----
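For intuition about what allocation tracking measures, the following sketch counts objects allocated inside a block using only Ruby's built-in `GC.stat` counter. It is not how `MemoryProfiler` is implemented (that is not shown here); `count_allocations` is a hypothetical helper, and it counts objects rather than bytes.

```ruby
# Hypothetical helper (not part of Omnizip): count the objects allocated
# while the block runs, using the interpreter's cumulative counter.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

allocated = count_allocations do
  records = Array.new(10_000) { |i| "record #{i}" }
  records.map(&:upcase)
end

puts "Objects allocated: #{allocated}"
```

Because the counter is cumulative, it only ever grows; retained memory has to be measured separately, for example after a garbage collection.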
|
|
221
|
+
|
|
222
|
+
== Examples
|
|
223
|
+
|
|
224
|
+
=== Example 1: Find Compression Bottleneck
|
|
225
|
+
|
|
226
|
+
[source,ruby]
|
|
227
|
+
----
|
|
228
|
+
def profile_compression(file_path)
|
|
229
|
+
profiler = Omnizip::Profiler.new(profile_name: "Compression Analysis")
|
|
230
|
+
|
|
231
|
+
# Profile each stage
|
|
232
|
+
data = profiler.profile("read_input") do
|
|
233
|
+
File.binread(file_path) # binary-safe read
|
|
234
|
+
end
|
|
235
|
+
|
|
236
|
+
compressed = profiler.profile("compress") do
|
|
237
|
+
algorithm = Omnizip::AlgorithmRegistry.get(:bzip2).new(level: 9)
|
|
238
|
+
output = StringIO.new
|
|
239
|
+
algorithm.compress(StringIO.new(data), output)
|
|
240
|
+
output.string
|
|
241
|
+
end
|
|
242
|
+
|
|
243
|
+
profiler.profile("write_output") do
|
|
244
|
+
File.binwrite("#{file_path}.bz2", compressed)
|
|
245
|
+
end
|
|
246
|
+
|
|
247
|
+
# Analyze results
|
|
248
|
+
report = profiler.report
|
|
249
|
+
puts "\n=== Compression Profile ==="
|
|
250
|
+
puts "Total time: #{report.total_execution_time}s"
|
|
251
|
+
|
|
252
|
+
report.results.each do |result|
|
|
253
|
+
percentage = (result.total_time / report.total_execution_time * 100).round(1)
|
|
254
|
+
puts "#{result.operation_name}: #{result.total_time}s (#{percentage}%)"
|
|
255
|
+
end
|
|
256
|
+
|
|
257
|
+
# Generate optimization suggestions
|
|
258
|
+
suggestions = profiler.generate_suggestions
|
|
259
|
+
if suggestions.any?
|
|
260
|
+
puts "\n=== Optimization Suggestions ==="
|
|
261
|
+
suggestions.first(3).each { |s| puts "- #{s.title}" }
|
|
262
|
+
end
|
|
263
|
+
end
|
|
264
|
+
|
|
265
|
+
profile_compression('large_file.txt')
|
|
266
|
+
----
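Example 1 relies on `profile` returning the value of its block, since the result of each stage feeds the next. A minimal sketch of a timer with that behaviour follows; `TinyProfiler` is a hypothetical class written for illustration and makes no claim about how `Omnizip::Profiler` is built.

```ruby
# Hypothetical illustration (not Omnizip's implementation): time a block
# with the monotonic clock and hand the block's value back to the caller.
class TinyProfiler
  Result = Struct.new(:operation_name, :total_time)

  attr_reader :results

  def initialize
    @results = []
  end

  def profile(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    value = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @results << Result.new(name, elapsed)
    value
  end

  def total_execution_time
    @results.sum(&:total_time)
  end
end

profiler = TinyProfiler.new
data = profiler.profile("build_input") { "abc" * 1_000 }
size = profiler.profile("measure") { data.bytesize }

puts "Measured #{size} bytes in #{profiler.results.size} profiled steps"
```

The monotonic clock is used instead of `Time.now` so that system clock adjustments cannot distort the measured durations.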
|
|
267
|
+
|
|
268
|
+
=== Example 2: Compare Algorithm Performance
|
|
269
|
+
|
|
270
|
+
[source,ruby]
|
|
271
|
+
----
|
|
272
|
+
def compare_algorithms(data, algorithms)
|
|
273
|
+
results = {}
|
|
274
|
+
|
|
275
|
+
algorithms.each do |algo_name|
|
|
276
|
+
profiler = Omnizip::Profiler.new(profile_name: algo_name.to_s)
|
|
277
|
+
|
|
278
|
+
compressed_size = profiler.profile("compression") do
|
|
279
|
+
algorithm = Omnizip::AlgorithmRegistry.get(algo_name).new(level: 6)
|
|
280
|
+
output = StringIO.new
|
|
281
|
+
algorithm.compress(StringIO.new(data), output)
|
|
282
|
+
output.size
|
|
283
|
+
end
|
|
284
|
+
|
|
285
|
+
results[algo_name] = {
|
|
286
|
+
time: profiler.report.total_execution_time,
|
|
287
|
+
size: compressed_size,
|
|
288
|
+
ratio: (1 - compressed_size.to_f / data.bytesize) * 100 # space saved, in percent
|
|
289
|
+
}
|
|
290
|
+
end
|
|
291
|
+
|
|
292
|
+
# Print comparison
|
|
293
|
+
puts "\n=== Algorithm Comparison ==="
|
|
294
|
+
puts "Original size: #{data.bytesize} bytes\n\n"
|
|
295
|
+
|
|
296
|
+
results.sort_by { |_, v| v[:time] }.each do |algo, stats|
|
|
297
|
+
puts "#{algo}:"
|
|
298
|
+
puts " Time: #{stats[:time].round(3)}s"
|
|
299
|
+
puts " Size: #{stats[:size]} bytes"
|
|
300
|
+
puts " Ratio: #{stats[:ratio].round(1)}%"
|
|
301
|
+
end
|
|
302
|
+
end
|
|
303
|
+
|
|
304
|
+
data = File.binread('test_file.dat')
|
|
305
|
+
compare_algorithms(data, [:deflate, :lzma, :bzip2, :zstd])
|
|
306
|
+
----
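The same time/size/savings bookkeeping can be tried with nothing but Ruby's standard library. The sketch below compares zlib compression levels rather than Omnizip algorithms; `compare_levels` is a hypothetical helper, and the sample data is made up for illustration.

```ruby
require "zlib"

# Hypothetical helper: compress the same data at several zlib levels and
# record elapsed time, output size, and space saved for each.
def compare_levels(data, levels)
  levels.to_h do |level|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    compressed = Zlib::Deflate.deflate(data, level)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

    savings = (1 - compressed.bytesize.to_f / data.bytesize) * 100
    [level, { time: elapsed, size: compressed.bytesize, savings: savings }]
  end
end

data = "omnizip " * 10_000 # highly repetitive sample input
stats_by_level = compare_levels(data, [1, 6, 9])

stats_by_level.each do |level, stats|
  puts format("level %d: %d bytes, %.1f%% saved, %.4fs",
              level, stats[:size], stats[:savings], stats[:time])
end
```

Sizes are taken with `bytesize` so the savings figure is computed in bytes even when the input string carries a multi-byte encoding.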
|
|
307
|
+
|
|
308
|
+
=== Example 3: Memory Leak Detection
|
|
309
|
+
|
|
310
|
+
[source,ruby]
|
|
311
|
+
----
|
|
312
|
+
def detect_memory_leaks(iterations = 100)
|
|
313
|
+
profiler = Omnizip::Profiler.new(profile_name: "Memory Leak Detection")
|
|
314
|
+
memory_profiler = Omnizip::Profiler::MemoryProfiler.new
|
|
315
|
+
profiler.register_profiler(:memory, memory_profiler)
|
|
316
|
+
|
|
317
|
+
baseline_memory = nil
|
|
318
|
+
|
|
319
|
+
iterations.times do |i|
|
|
320
|
+
profiler.profile("iteration_#{i}", profiler_type: :memory) do
|
|
321
|
+
# Suspect operation
|
|
322
|
+
data = Array.new(10_000) { rand }
|
|
323
|
+
compress_data(data)
|
|
324
|
+
end
|
|
325
|
+
|
|
326
|
+
# Compare retained memory: cumulative allocations grow on every iteration even without a leak
current_memory = profiler.report.total_memory_retained
|
|
327
|
+
|
|
328
|
+
if i == 0
|
|
329
|
+
baseline_memory = current_memory
|
|
330
|
+
elsif i % 10 == 0
|
|
331
|
+
growth = current_memory - baseline_memory
|
|
332
|
+
growth_rate = (growth.to_f / baseline_memory * 100).round(2)
|
|
333
|
+
|
|
334
|
+
puts "Iteration #{i}: #{growth} bytes growth (#{growth_rate}%)"
|
|
335
|
+
|
|
336
|
+
if growth_rate > 50
|
|
337
|
+
puts "⚠️ Potential memory leak detected!"
|
|
338
|
+
break
|
|
339
|
+
end
|
|
340
|
+
end
|
|
341
|
+
end
|
|
342
|
+
end
|
|
343
|
+
|
|
344
|
+
detect_memory_leaks
|
|
345
|
+
----
|
|
346
|
+
|
|
347
|
+
== Profiler Configuration
|
|
348
|
+
|
|
349
|
+
=== Enable/Disable Profiling
|
|
350
|
+
|
|
351
|
+
[source,ruby]
|
|
352
|
+
----
|
|
353
|
+
profiler = Omnizip::Profiler.new
|
|
354
|
+
|
|
355
|
+
# Disable profiling (no overhead)
|
|
356
|
+
profiler.disable!
|
|
357
|
+
|
|
358
|
+
# Operations run without profiling
|
|
359
|
+
profiler.profile("operation") { slow_operation } # Not profiled
|
|
360
|
+
|
|
361
|
+
# Re-enable profiling
|
|
362
|
+
profiler.enable!
|
|
363
|
+
|
|
364
|
+
# Operations are now profiled again
|
|
365
|
+
profiler.profile("operation") { slow_operation } # Profiled
|
|
366
|
+
----
|
|
367
|
+
|
|
368
|
+
=== Reset Profiler State
|
|
369
|
+
|
|
370
|
+
[source,ruby]
|
|
371
|
+
----
|
|
372
|
+
profiler = Omnizip::Profiler.new
|
|
373
|
+
|
|
374
|
+
# Collect some data
|
|
375
|
+
profiler.profile("op1") { operation1 }
|
|
376
|
+
profiler.profile("op2") { operation2 }
|
|
377
|
+
|
|
378
|
+
# Reset profiler
|
|
379
|
+
profiler.reset!
|
|
380
|
+
|
|
381
|
+
# Start fresh profiling
|
|
382
|
+
profiler.profile("op3") { operation3 } # Previous data cleared
|
|
383
|
+
----
|
|
384
|
+
|
|
385
|
+
== See Also
|
|
386
|
+
|
|
387
|
+
* link:../README.adoc#performance[Performance Analysis]
|
|
388
|
+
* link:compression-profiles.adoc[Compression Profiles]
|
|
389
|
+
* link:advanced-features.adoc[Advanced Features]
|
|
@@ -0,0 +1,280 @@
|
|
|
1
|
+
= Preprocessing Filters Guide
|
|
2
|
+
:toc:
|
|
3
|
+
:toclevels: 3
|
|
4
|
+
|
|
5
|
+
== Purpose
|
|
6
|
+
|
|
7
|
+
This document covers preprocessing filters that improve compression of specific data types, particularly executable files and multimedia data.
|
|
8
|
+
|
|
9
|
+
== Supported Filters
|
|
10
|
+
|
|
11
|
+
[cols="20,15,65",options="header"]
|
|
12
|
+
|===
|
|
13
|
+
|Filter |ID |Description
|
|
14
|
+
|
|
15
|
+
|BCJ x86 |0x04 |Branch conversion for x86 executables
|
|
16
|
+
|BCJ2 |0x0303011B |Advanced 4-stream x86 filter
|
|
17
|
+
|BCJ ARM |0x05 |ARM executable filter
|
|
18
|
+
|BCJ ARM64 |0x0A |ARM64/AArch64 filter
|
|
19
|
+
|BCJ PPC |0x07 |PowerPC filter
|
|
20
|
+
|BCJ IA-64 |0x06 |Itanium filter
|
|
21
|
+
|BCJ SPARC |0x08 |SPARC filter
|
|
22
|
+
|Delta |0x03 |Delta encoding (configurable distance)
|
|
23
|
+
|===
|
|
24
|
+
|
|
25
|
+
== BCJ (Branch-Call-Jump) Filters
|
|
26
|
+
|
|
27
|
+
=== General
|
|
28
|
+
|
|
29
|
+
Branch-Call-Jump filters improve compression of executable files by converting the relative addresses in branch instructions to absolute addresses. Calls to the same target then encode as identical byte sequences wherever they appear in the code, which gives the compressor more repeated patterns to match.
|
|
30
|
+
|
|
31
|
+
=== Supported Architectures
|
|
32
|
+
|
|
33
|
+
* **BCJ x86** - Intel/AMD x86 (32-bit and 64-bit)
|
|
34
|
+
* **BCJ ARM** - ARM 32-bit executables
|
|
35
|
+
* **BCJ ARM64** - ARM 64-bit (AArch64) executables
|
|
36
|
+
* **BCJ PPC** - PowerPC executables
|
|
37
|
+
* **BCJ SPARC** - SPARC executables
|
|
38
|
+
* **BCJ IA-64** - Intel Itanium executables
|
|
39
|
+
|
|
40
|
+
=== How It Works
|
|
41
|
+
|
|
42
|
+
. Scans binary code for branch/call instructions
|
|
43
|
+
. Converts relative offsets to absolute addresses
|
|
44
|
+
. Makes patterns more regular and compressible
|
|
45
|
+
. Decoder reverses the transformation after decompression
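The steps above can be sketched for the simplest case, the x86 `CALL` instruction (opcode `0xE8` followed by a 32-bit little-endian relative offset). This is a deliberately simplified illustration: production BCJ filters also handle other branch opcodes and use heuristics to skip bytes that only look like opcodes. The `convert_calls` helper is hypothetical and is not Omnizip's implementation.

```ruby
# Simplified sketch (hypothetical helper): rewrite the operand of every
# 0xE8 byte from a relative offset to an absolute target (encoding: true)
# or back again (encoding: false). No opcode heuristics are applied.
def convert_calls(bytes, encoding:)
  out = bytes.b # binary copy; the input is left untouched
  i = 0
  while i + 5 <= out.bytesize
    if out.getbyte(i) == 0xE8
      operand = out.byteslice(i + 1, 4).unpack1("V")
      next_instruction = i + 5 # relative offsets count from here
      converted = encoding ? operand + next_instruction : operand - next_instruction
      out[i + 1, 4] = [converted & 0xFFFFFFFF].pack("V")
      i += 5
    else
      i += 1
    end
  end
  out
end

# Two calls to the same target (0x100) from different positions,
# separated by three NOPs: the relative operands differ (0xFB vs 0xF3).
code = [0xE8, 0xFB, 0x90, 0x90, 0x90, 0xE8, 0xF3].pack("CVC3CV")

encoded = convert_calls(code, encoding: true)
decoded = convert_calls(encoded, encoding: false)

puts "Operands equal after encoding: #{encoded.byteslice(1, 4) == encoded.byteslice(9, 4)}"
puts "Round trip intact: #{decoded == code}"
```

After encoding, both operands hold the same absolute target, which is exactly the kind of repetition a dictionary compressor can exploit.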
|
|
46
|
+
|
|
47
|
+
=== Usage
|
|
48
|
+
|
|
49
|
+
[source,ruby]
|
|
50
|
+
----
|
|
51
|
+
# Use with filter pipeline
|
|
52
|
+
pipeline = Omnizip::FilterPipeline.new
|
|
53
|
+
pipeline.add_filter(:bcj_x86) # For x86 executables
|
|
54
|
+
|
|
55
|
+
# Apply before compression
|
|
56
|
+
filtered_data = pipeline.encode(executable_data)
|
|
57
|
+
algorithm.compress(StringIO.new(filtered_data), output)
|
|
58
|
+
|
|
59
|
+
# Different architectures
|
|
60
|
+
pipeline_arm = Omnizip::FilterPipeline.new
|
|
61
|
+
pipeline_arm.add_filter(:bcj_arm)
|
|
62
|
+
|
|
63
|
+
pipeline_arm64 = Omnizip::FilterPipeline.new
|
|
64
|
+
pipeline_arm64.add_filter(:bcj_arm64)
|
|
65
|
+
----
|
|
66
|
+
|
|
67
|
+
=== Typical Improvements
|
|
68
|
+
|
|
69
|
+
Using BCJ filters on executable files typically improves compression by:
|
|
70
|
+
|
|
71
|
+
* **10-30%** for x86/x64 executables
|
|
72
|
+
* **15-35%** for ARM executables
|
|
73
|
+
* **20-40%** for stripped executables (no debug symbols)
|
|
74
|
+
|
|
75
|
+
== BCJ2 Filter
|
|
76
|
+
|
|
77
|
+
=== General
|
|
78
|
+
|
|
79
|
+
BCJ2 provides advanced 4-stream filtering for x86 code, achieving better compression than standard BCJ. It splits the filtered data into four separate streams that can be compressed independently.
|
|
80
|
+
|
|
81
|
+
=== How It Works
|
|
82
|
+
|
|
83
|
+
. Analyzes x86 code structure
|
|
84
|
+
. Splits into 4 streams:
|
|
85
|
+
* Main stream (data and unprocessed bytes)
|
|
86
|
+
* Call stream (call instruction targets)
|
|
87
|
+
* Jump stream (jump instruction targets)
|
|
88
|
+
* Common stream (shared data)
|
|
89
|
+
. Each stream is compressed separately
|
|
90
|
+
. Decoder merges streams during decompression
|
|
91
|
+
|
|
92
|
+
=== Usage
|
|
93
|
+
|
|
94
|
+
[source,ruby]
|
|
95
|
+
----
|
|
96
|
+
filter = Omnizip::FilterRegistry.get(:bcj2).new
|
|
97
|
+
encoded_streams = filter.encode(x86_code)
|
|
98
|
+
# Returns 4 separate streams for optimal compression
|
|
99
|
+
|
|
100
|
+
# Compress each stream
|
|
101
|
+
encoded_streams.each_with_index do |stream, idx|
|
|
102
|
+
algorithm.compress(StringIO.new(stream), output_files[idx])
|
|
103
|
+
end
|
|
104
|
+
----
|
|
105
|
+
|
|
106
|
+
=== When to Use BCJ2
|
|
107
|
+
|
|
108
|
+
**Use BCJ2 when:**
|
|
109
|
+
|
|
110
|
+
* Maximum compression is needed
|
|
111
|
+
* Processing large x86 executables
|
|
112
|
+
* Archive size is critical (software distribution)
|
|
113
|
+
|
|
114
|
+
**Use standard BCJ when:**
|
|
115
|
+
|
|
116
|
+
* Simplicity is preferred
|
|
117
|
+
* Working with mixed content
|
|
118
|
+
* Speed is more important than maximum compression
|
|
119
|
+
|
|
120
|
+
=== Typical Improvements
|
|
121
|
+
|
|
122
|
+
BCJ2 typically achieves:
|
|
123
|
+
|
|
124
|
+
* **5-10% better** compression than standard BCJ
|
|
125
|
+
* **15-40%** better than no filter
|
|
126
|
+
* Best results on large executables (> 1MB)
|
|
127
|
+
|
|
128
|
+
== Delta Filter
|
|
129
|
+
|
|
130
|
+
=== General
|
|
131
|
+
|
|
132
|
+
Delta encoding is effective for multimedia files and time-series data where consecutive bytes have small differences. It transforms absolute values into differences between consecutive values.
|
|
133
|
+
|
|
134
|
+
=== How It Works
|
|
135
|
+
|
|
136
|
+
. Computes the difference between each byte and the byte `distance` positions before it
|
|
137
|
+
. Stores deltas instead of absolute values
|
|
138
|
+
. Makes nearly constant data highly compressible
|
|
139
|
+
. The `distance` parameter sets how far back the reference byte lies, so multi-byte samples are compared byte for byte with the previous sample
|
|
140
|
+
|
|
141
|
+
=== Configuration
|
|
142
|
+
|
|
143
|
+
The `distance` parameter determines the stride:
|
|
144
|
+
|
|
145
|
+
* **distance=1:** Byte-by-byte differences (images, audio)
|
|
146
|
+
* **distance=2:** 16-bit word differences
|
|
147
|
+
* **distance=4:** 32-bit word differences
|
|
148
|
+
* **distance=N:** Custom stride
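A minimal sketch of the stride described above, written in plain Ruby for illustration. The `delta_encode` and `delta_decode` helpers are hypothetical and are not the library's filter; they only show how `distance` selects the reference byte and that decoding restores the input.

```ruby
# Hypothetical helpers (not Omnizip's filter): byte-wise delta coding
# where each byte is compared with the byte `distance` positions earlier.
def delta_encode(data, distance: 1)
  bytes = data.bytes
  encoded = bytes.each_with_index.map do |byte, i|
    previous = i >= distance ? bytes[i - distance] : 0
    (byte - previous) & 0xFF # wrap around within one byte
  end
  encoded.pack("C*")
end

def delta_decode(data, distance: 1)
  decoded = []
  data.each_byte.with_index do |delta, i|
    previous = i >= distance ? decoded[i - distance] : 0
    decoded << ((delta + previous) & 0xFF)
  end
  decoded.pack("C*")
end

# A steadily rising byte ramp becomes a run of identical deltas.
ramp = (0...16).to_a.pack("C*")
encoded_ramp = delta_encode(ramp, distance: 1)
puts "Distinct byte values after encoding: #{encoded_ramp.bytes.uniq.size}"

# 16-bit little-endian samples: distance 2 lines up low and high bytes.
samples = [1000, 1002, 1004, 1006].pack("v*")
restored = delta_decode(delta_encode(samples, distance: 2), distance: 2)
puts "16-bit round trip intact: #{restored == samples}"
```

The masking with `0xFF` keeps every delta within one byte, so the encoded output is always the same length as the input.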
|
|
149
|
+
|
|
150
|
+
=== Usage
|
|
151
|
+
|
|
152
|
+
[source,ruby]
|
|
153
|
+
----
|
|
154
|
+
# Basic delta filter (distance=1)
|
|
155
|
+
filter = Omnizip::FilterRegistry.get(:delta).new(distance: 1)
|
|
156
|
+
filtered = filter.encode(audio_data)
|
|
157
|
+
|
|
158
|
+
# For 16-bit audio samples
|
|
159
|
+
filter_16bit = Omnizip::FilterRegistry.get(:delta).new(distance: 2)
|
|
160
|
+
filtered_audio = filter_16bit.encode(audio_samples)
|
|
161
|
+
|
|
162
|
+
# For database dumps with aligned records
|
|
163
|
+
filter_db = Omnizip::FilterRegistry.get(:delta).new(distance: 4)
|
|
164
|
+
filtered_db = filter_db.encode(database_dump)
|
|
165
|
+
----
|
|
166
|
+
|
|
167
|
+
=== Best Use Cases
|
|
168
|
+
|
|
169
|
+
**Excellent for:**
|
|
170
|
+
|
|
171
|
+
* Uncompressed audio (WAV, raw PCM)
|
|
172
|
+
* Time-series sensor data
|
|
173
|
+
* Database dumps with sequential indices
|
|
174
|
+
* Bitmap images (BMP, uncompressed TIFF)
|
|
175
|
+
|
|
176
|
+
**Not suitable for:**
|
|
177
|
+
|
|
178
|
+
* Already compressed data (MP3, JPEG)
|
|
179
|
+
* Random data
|
|
180
|
+
* Text files
|
|
181
|
+
* Encrypted data
|
|
182
|
+
|
|
183
|
+
=== Typical Improvements
|
|
184
|
+
|
|
185
|
+
Delta filter can achieve:
|
|
186
|
+
|
|
187
|
+
* **50-80%** better compression on audio waveforms
|
|
188
|
+
* **30-60%** better on time-series data
|
|
189
|
+
* **20-40%** better on bitmap images
|
|
190
|
+
* **Minimal improvement** on text or compressed data
|
|
191
|
+
|
|
192
|
+
== Filter Chaining
|
|
193
|
+
|
|
194
|
+
=== General
|
|
195
|
+
|
|
196
|
+
Multiple filters can be chained together for better compression. Order matters: apply the filters in the sequence that makes the data progressively more compressible, and note that decoding must undo them in the reverse order.
|
|
197
|
+
|
|
198
|
+
=== Usage
|
|
199
|
+
|
|
200
|
+
[source,ruby]
|
|
201
|
+
----
|
|
202
|
+
# Chain multiple filters
|
|
203
|
+
pipeline = Omnizip::FilterPipeline.new
|
|
204
|
+
pipeline.add_filter(:bcj_x86) # First: convert branches
|
|
205
|
+
pipeline.add_filter(:delta, distance: 1) # Then: delta encode
|
|
206
|
+
|
|
207
|
+
# Apply chain
|
|
208
|
+
filtered_data = pipeline.encode(executable_data)
|
|
209
|
+
algorithm.compress(StringIO.new(filtered_data), output)
|
|
210
|
+
----
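Why order matters can be seen with a toy chain that has nothing to do with the real filters. In this sketch, `Step` and `ToyPipeline` are hypothetical names, and the two transforms (add one to every byte, then delta-encode) were chosen only because they do not commute.

```ruby
# Hypothetical illustration (not Omnizip::FilterPipeline): each step is a
# pair of inverse lambdas; decoding walks the steps in reverse order.
Step = Struct.new(:encode, :decode)

class ToyPipeline
  def initialize
    @steps = []
  end

  def add(step)
    @steps << step
    self
  end

  def encode(data)
    @steps.reduce(data) { |acc, step| step.encode.call(acc) }
  end

  def decode(data)
    @steps.reverse.reduce(data) { |acc, step| step.decode.call(acc) }
  end
end

add_one = Step.new(
  ->(s) { s.bytes.map { |b| (b + 1) & 0xFF }.pack("C*") },
  ->(s) { s.bytes.map { |b| (b - 1) & 0xFF }.pack("C*") }
)

delta = Step.new(
  lambda do |s|
    prev = 0
    s.bytes.map { |b| d = (b - prev) & 0xFF; prev = b; d }.pack("C*")
  end,
  lambda do |s|
    prev = 0
    s.bytes.map { |d| prev = (d + prev) & 0xFF }.pack("C*")
  end
)

original = "AAAABBBB".b
forward = ToyPipeline.new.add(add_one).add(delta)
swapped = ToyPipeline.new.add(delta).add(add_one)

puts "Round trip intact: #{forward.decode(forward.encode(original)) == original}"
puts "Order changes output: #{forward.encode(original) != swapped.encode(original)}"
```

Decoding with the steps in the wrong order would not restore the input, which is why a pipeline has to record the order in which filters were added.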
|
|
211
|
+
|
|
212
|
+
=== Recommended Filter Chains
|
|
213
|
+
|
|
214
|
+
**For x86 executables:**
|
|
215
|
+
```
|
|
216
|
+
BCJ x86 → LZMA2
|
|
217
|
+
```
|
|
218
|
+
|
|
219
|
+
**For ARM executables:**
|
|
220
|
+
```
|
|
221
|
+
BCJ ARM → LZMA2
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
**For large x86 binaries (maximum compression):**
|
|
225
|
+
```
|
|
226
|
+
BCJ2 (4 streams) → LZMA2 (each stream)
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
**For uncompressed audio:**
|
|
230
|
+
```
|
|
231
|
+
Delta (distance=2) → LZMA2 or BZip2
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
**For bitmap images:**
|
|
235
|
+
```
|
|
236
|
+
Delta (distance=1) → LZMA2
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
== Performance Considerations
|
|
240
|
+
|
|
241
|
+
=== Processing Overhead
|
|
242
|
+
|
|
243
|
+
Filters add little processing overhead relative to compression itself:
|
|
244
|
+
|
|
245
|
+
* **BCJ filters:** < 5% processing time
|
|
246
|
+
* **Delta filter:** < 3% processing time
|
|
247
|
+
* **BCJ2:** 10-15% processing time (4-stream handling)
|
|
248
|
+
|
|
249
|
+
For data that suits the filter, the compression gains usually outweigh the processing cost.
|
|
250
|
+
|
|
251
|
+
=== Memory Usage
|
|
252
|
+
|
|
253
|
+
* **BCJ filters:** Minimal (< 1MB)
|
|
254
|
+
* **Delta filter:** Minimal (< 1MB)
|
|
255
|
+
* **BCJ2:** Moderate (needs buffer for 4 streams)
|
|
256
|
+
|
|
257
|
+
== Integration with .7z Archives
|
|
258
|
+
|
|
259
|
+
Filters can be attached to a .7z writer; the reader reverses them automatically during extraction:
|
|
260
|
+
|
|
261
|
+
[source,ruby]
|
|
262
|
+
----
|
|
263
|
+
# Create .7z with BCJ filter
|
|
264
|
+
writer = Omnizip::Formats::SevenZip::Writer.new('programs.7z')
|
|
265
|
+
writer.add_filter(:bcj_x86) # Applies to all files
|
|
266
|
+
writer.add_file('program.exe')
|
|
267
|
+
writer.close
|
|
268
|
+
|
|
269
|
+
# Filter is automatically applied during extraction
|
|
270
|
+
reader = Omnizip::Formats::SevenZip::Reader.new('programs.7z')
|
|
271
|
+
reader.extract_all('output/') # BCJ filter automatically reversed
|
|
272
|
+
reader.close
|
|
273
|
+
----
|
|
274
|
+
|
|
275
|
+
== See Also
|
|
276
|
+
|
|
277
|
+
* link:compression-algorithms.adoc[Compression Algorithms]
|
|
278
|
+
* link:api-usage.adoc[Library API Usage]
|
|
279
|
+
* link:archive-formats.adoc[Archive Formats]
|
|
280
|
+
* link:../README.adoc[Main README]
|