cabriolet 0.1.0
- checksums.yaml +7 -0
- data/ARCHITECTURE.md +799 -0
- data/CHANGELOG.md +44 -0
- data/LICENSE +29 -0
- data/README.adoc +1207 -0
- data/exe/cabriolet +6 -0
- data/lib/cabriolet/auto.rb +173 -0
- data/lib/cabriolet/binary/bitstream.rb +148 -0
- data/lib/cabriolet/binary/bitstream_writer.rb +180 -0
- data/lib/cabriolet/binary/chm_structures.rb +213 -0
- data/lib/cabriolet/binary/hlp_structures.rb +66 -0
- data/lib/cabriolet/binary/kwaj_structures.rb +74 -0
- data/lib/cabriolet/binary/lit_structures.rb +107 -0
- data/lib/cabriolet/binary/oab_structures.rb +112 -0
- data/lib/cabriolet/binary/structures.rb +56 -0
- data/lib/cabriolet/binary/szdd_structures.rb +60 -0
- data/lib/cabriolet/cab/compressor.rb +382 -0
- data/lib/cabriolet/cab/decompressor.rb +510 -0
- data/lib/cabriolet/cab/extractor.rb +357 -0
- data/lib/cabriolet/cab/parser.rb +264 -0
- data/lib/cabriolet/chm/compressor.rb +513 -0
- data/lib/cabriolet/chm/decompressor.rb +436 -0
- data/lib/cabriolet/chm/parser.rb +254 -0
- data/lib/cabriolet/cli.rb +776 -0
- data/lib/cabriolet/compressors/base.rb +34 -0
- data/lib/cabriolet/compressors/lzss.rb +250 -0
- data/lib/cabriolet/compressors/lzx.rb +581 -0
- data/lib/cabriolet/compressors/mszip.rb +315 -0
- data/lib/cabriolet/compressors/quantum.rb +446 -0
- data/lib/cabriolet/constants.rb +75 -0
- data/lib/cabriolet/decompressors/base.rb +39 -0
- data/lib/cabriolet/decompressors/lzss.rb +138 -0
- data/lib/cabriolet/decompressors/lzx.rb +726 -0
- data/lib/cabriolet/decompressors/mszip.rb +390 -0
- data/lib/cabriolet/decompressors/none.rb +27 -0
- data/lib/cabriolet/decompressors/quantum.rb +456 -0
- data/lib/cabriolet/errors.rb +39 -0
- data/lib/cabriolet/format_detector.rb +156 -0
- data/lib/cabriolet/hlp/compressor.rb +272 -0
- data/lib/cabriolet/hlp/decompressor.rb +198 -0
- data/lib/cabriolet/hlp/parser.rb +131 -0
- data/lib/cabriolet/huffman/decoder.rb +79 -0
- data/lib/cabriolet/huffman/encoder.rb +108 -0
- data/lib/cabriolet/huffman/tree.rb +138 -0
- data/lib/cabriolet/kwaj/compressor.rb +479 -0
- data/lib/cabriolet/kwaj/decompressor.rb +237 -0
- data/lib/cabriolet/kwaj/parser.rb +183 -0
- data/lib/cabriolet/lit/compressor.rb +255 -0
- data/lib/cabriolet/lit/decompressor.rb +250 -0
- data/lib/cabriolet/models/cabinet.rb +81 -0
- data/lib/cabriolet/models/chm_file.rb +28 -0
- data/lib/cabriolet/models/chm_header.rb +67 -0
- data/lib/cabriolet/models/chm_section.rb +38 -0
- data/lib/cabriolet/models/file.rb +119 -0
- data/lib/cabriolet/models/folder.rb +102 -0
- data/lib/cabriolet/models/folder_data.rb +21 -0
- data/lib/cabriolet/models/hlp_file.rb +45 -0
- data/lib/cabriolet/models/hlp_header.rb +37 -0
- data/lib/cabriolet/models/kwaj_header.rb +98 -0
- data/lib/cabriolet/models/lit_header.rb +55 -0
- data/lib/cabriolet/models/oab_header.rb +95 -0
- data/lib/cabriolet/models/szdd_header.rb +72 -0
- data/lib/cabriolet/modifier.rb +326 -0
- data/lib/cabriolet/oab/compressor.rb +353 -0
- data/lib/cabriolet/oab/decompressor.rb +315 -0
- data/lib/cabriolet/parallel.rb +333 -0
- data/lib/cabriolet/repairer.rb +288 -0
- data/lib/cabriolet/streaming.rb +221 -0
- data/lib/cabriolet/system/file_handle.rb +107 -0
- data/lib/cabriolet/system/io_system.rb +87 -0
- data/lib/cabriolet/system/memory_handle.rb +105 -0
- data/lib/cabriolet/szdd/compressor.rb +217 -0
- data/lib/cabriolet/szdd/decompressor.rb +184 -0
- data/lib/cabriolet/szdd/parser.rb +127 -0
- data/lib/cabriolet/validator.rb +332 -0
- data/lib/cabriolet/version.rb +5 -0
- data/lib/cabriolet.rb +104 -0
- metadata +157 -0
data/README.adoc
ADDED
|
@@ -0,0 +1,1207 @@
|
|
|
1
|
+
= Cabriolet
|
|
2
|
+
:toc: left
|
|
3
|
+
:toclevels: 3
|
|
4
|
+
|
|
5
|
+
image:https://img.shields.io/gem/v/cabriolet.svg[RubyGems Version, link=https://rubygems.org/gems/cabriolet]
|
|
6
|
+
image:https://img.shields.io/github/license/omnizip/cabriolet.svg[License]
|
|
7
|
+
|
|
8
|
+
Pure Ruby implementation for extracting and creating Microsoft compression
|
|
9
|
+
format files.
|
|
10
|
+
|
|
11
|
+
== Introduction
|
|
12
|
+
|
|
13
|
+
Cabriolet extracts and creates Microsoft Cabinet (.CAB) files and related
|
|
14
|
+
compression formats using pure Ruby.
|
|
15
|
+
|
|
16
|
+
This gem covers the features of libmspack and cabextract, implementing seven
|
|
17
|
+
Microsoft compression formats for both extraction (decompression) and creation
|
|
18
|
+
(compression).
|
|
19
|
+
|
|
20
|
+
NOTE: No C extensions are required; Cabriolet works on any platform where Ruby runs.
|
|
21
|
+
|
|
22
|
+
|
|
23
|
+
=== Features
|
|
24
|
+
|
|
25
|
+
* **Full format support** for all 7 Microsoft compression formats
|
|
26
|
+
** CAB (Microsoft Cabinet)
|
|
27
|
+
** CHM (Compiled HTML Help)
|
|
28
|
+
** SZDD (Single-file LZSS compression)
|
|
29
|
+
** KWAJ (Installation file compression)
|
|
30
|
+
** HLP (Windows Help)
|
|
31
|
+
** LIT (Microsoft Reader eBooks)
|
|
32
|
+
** OAB (Offline Address Book)
|
|
33
|
+
|
|
34
|
+
* **Bidirectional operations** (compress and decompress)
|
|
35
|
+
* **All compression algorithms**
|
|
36
|
+
** None (uncompressed storage)
|
|
37
|
+
** LZSS (4KB sliding window, 3 modes)
|
|
38
|
+
** MSZIP (DEFLATE/RFC 1951)
|
|
39
|
+
** LZX (advanced with Intel E8 preprocessing)
|
|
40
|
+
** Quantum (adaptive arithmetic coding)
|
|
41
|
+
|
|
42
|
+
* **Advanced features**
|
|
43
|
+
** Multi-part cabinet sets (spanning, merging)
|
|
44
|
+
** Embedded cabinet search
|
|
45
|
+
** Salvage mode for corrupted files
|
|
46
|
+
** Custom I/O handlers
|
|
47
|
+
** Progress callbacks
|
|
48
|
+
** Checksum verification
|
|
49
|
+
** Metadata preservation (timestamps, attributes)
|
|
50
|
+
|
|
51
|
+
* **Pure Ruby** - No compilation needed, works everywhere
|
|
52
|
+
* **Comprehensive testing** - 914 test examples, 0 failures
|
|
53
|
+
* **Complete CLI** - 30+ commands for all operations
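Several of the formats listed above can be told apart by the first bytes of a file. The following is a minimal sketch and not Cabriolet's own detector: `SIGNATURES` and `sniff_format` are hypothetical names, and only widely documented leading signatures are included (HLP and OAB are left out on purpose).

[source,ruby]
----
# Leading byte signatures for some of the supported container formats.
SIGNATURES = {
  cab: "MSCF".b,
  chm: "ITSF".b,
  szdd: "SZDD".b,
  kwaj: "KWAJ".b,
  lit: "ITOLITLS".b,
}.freeze

# Returns a format symbol, or nil when no known signature matches.
def sniff_format(bytes)
  head = bytes.b
  SIGNATURES.each do |name, magic|
    return name if head.start_with?(magic)
  end
  nil
end

puts sniff_format("MSCF\x00\x00\x00\x00").inspect # prints :cab
----

A signature match is only a hint; a real parser would still validate the full header before trusting the result.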
|
|
54
|
+
|
|
55
|
+
=== Architecture
|
|
56
|
+
|
|
57
|
+
.High-level architecture
|
|
58
|
+
[source]
|
|
59
|
+
----
|
|
60
|
+
Application Layer (CLI/API)
|
|
61
|
+
↓
|
|
62
|
+
Format Layer (CAB, CHM, SZDD, KWAJ, HLP, LIT, OAB)
|
|
63
|
+
↓
|
|
64
|
+
Algorithm Layer (None, LZSS, MSZIP, LZX, Quantum)
|
|
65
|
+
↓
|
|
66
|
+
Binary I/O Layer (BinData structures, Bitstreams)
|
|
67
|
+
↓
|
|
68
|
+
System Layer (I/O abstraction, file/memory handles)
|
|
69
|
+
----
|
|
70
|
+
|
|
71
|
+
For complete architecture, see link:ARCHITECTURE.md[Architecture Documentation].
|
|
72
|
+
|
|
73
|
+
== Installation
|
|
74
|
+
|
|
75
|
+
Add to your Gemfile:
|
|
76
|
+
|
|
77
|
+
[source,ruby]
|
|
78
|
+
----
|
|
79
|
+
gem "cabriolet"
|
|
80
|
+
----
|
|
81
|
+
|
|
82
|
+
Or install directly:
|
|
83
|
+
|
|
84
|
+
[source,shell]
|
|
85
|
+
----
|
|
86
|
+
gem install cabriolet
|
|
87
|
+
----
|
|
88
|
+
|
|
89
|
+
For detailed installation instructions, see
|
|
90
|
+
link:docs/getting-started/installation.adoc[Installation Guide].
|
|
91
|
+
|
|
92
|
+
== System requirements
|
|
93
|
+
|
|
94
|
+
* Ruby 2.7 or higher
|
|
95
|
+
* Operating Systems: Linux, macOS, Windows
|
|
96
|
+
* Dependencies: bindata (~> 2.5), thor (~> 1.3)
|
|
97
|
+
|
|
98
|
+
== Usage
|
|
99
|
+
|
|
100
|
+
=== Command line interface
|
|
101
|
+
|
|
102
|
+
==== CAB (Cabinet) operations
|
|
103
|
+
|
|
104
|
+
===== List contents
|
|
105
|
+
|
|
106
|
+
[source,shell]
|
|
107
|
+
----
|
|
108
|
+
cabriolet list example.cab
|
|
109
|
+
----
|
|
110
|
+
|
|
111
|
+
.Example output
|
|
112
|
+
[example]
|
|
113
|
+
====
|
|
114
|
+
[source]
|
|
115
|
+
----
|
|
116
|
+
Cabinet: example.cab (Set ID: 12345, Index: 0)
|
|
117
|
+
Folders: 1, Files: 2
|
|
118
|
+
Files:
|
|
119
|
+
README.txt (1,234 bytes)
|
|
120
|
+
data.bin (45,678 bytes)
|
|
121
|
+
----
|
|
122
|
+
====
|
|
123
|
+
|
|
124
|
+
===== Extract all files
|
|
125
|
+
|
|
126
|
+
[source,shell]
|
|
127
|
+
----
|
|
128
|
+
cabriolet extract example.cab
|
|
129
|
+
----
|
|
130
|
+
|
|
131
|
+
===== Extract to specific directory
|
|
132
|
+
|
|
133
|
+
[source,shell]
|
|
134
|
+
----
|
|
135
|
+
cabriolet extract example.cab --output /path/to/output
|
|
136
|
+
----
|
|
137
|
+
|
|
138
|
+
===== Test cabinet integrity
|
|
139
|
+
|
|
140
|
+
[source,shell]
|
|
141
|
+
----
|
|
142
|
+
cabriolet test example.cab
|
|
143
|
+
----
|
|
144
|
+
|
|
145
|
+
===== Show detailed information
|
|
146
|
+
|
|
147
|
+
[source,shell]
|
|
148
|
+
----
|
|
149
|
+
cabriolet info example.cab
|
|
150
|
+
----
|
|
151
|
+
|
|
152
|
+
.Example output
|
|
153
|
+
[example]
|
|
154
|
+
====
|
|
155
|
+
[source]
|
|
156
|
+
----
|
|
157
|
+
Cabinet Information
|
|
158
|
+
==================================================
|
|
159
|
+
Filename: example.cab
|
|
160
|
+
Set ID: 12345
|
|
161
|
+
Set Index: 0
|
|
162
|
+
Size: 100,000 bytes
|
|
163
|
+
Folders: 2
|
|
164
|
+
Files: 15
|
|
165
|
+
|
|
166
|
+
Folders:
|
|
167
|
+
[0] MSZIP (5 blocks)
|
|
168
|
+
[1] LZX (3 blocks)
|
|
169
|
+
|
|
170
|
+
Files:
|
|
171
|
+
README.txt
|
|
172
|
+
Size: 1,234 bytes
|
|
173
|
+
Modified: 2024-01-15 10:30:00
|
|
174
|
+
Attributes: archive
|
|
175
|
+
...
|
|
176
|
+
----
|
|
177
|
+
====
|
|
178
|
+
|
|
179
|
+
===== Search for embedded CABs
|
|
180
|
+
|
|
181
|
+
[source,shell]
|
|
182
|
+
----
|
|
183
|
+
cabriolet search installer.exe --verbose
|
|
184
|
+
----
|
|
185
|
+
|
|
186
|
+
.Example output
|
|
187
|
+
[example]
|
|
188
|
+
====
|
|
189
|
+
[source]
|
|
190
|
+
----
|
|
191
|
+
Cabinet found at offset 1024
|
|
192
|
+
Files: 50, Folders: 1
|
|
193
|
+
Cabinet found at offset 524288
|
|
194
|
+
Files: 20, Folders: 1
|
|
195
|
+
|
|
196
|
+
Total: 2 cabinet(s) found
|
|
197
|
+
----
|
|
198
|
+
====
|
|
199
|
+
|
|
200
|
+
===== Create CAB file
|
|
201
|
+
|
|
202
|
+
[source,shell]
|
|
203
|
+
----
|
|
204
|
+
cabriolet create output.cab file1.txt file2.txt
|
|
205
|
+
cabriolet create output.cab *.txt --compression mszip
|
|
206
|
+
cabriolet create output.cab files/ --compression lzx
|
|
207
|
+
----
|
|
208
|
+
|
|
209
|
+
**Compression options**:
|
|
210
|
+
|
|
211
|
+
* `none` - Uncompressed storage
|
|
212
|
+
* `lzss` - LZSS compression (default for small files)
|
|
213
|
+
* `mszip` - MSZIP/DEFLATE compression (recommended)
|
|
214
|
+
* `lzx` - LZX compression (best ratio, slower)
|
|
215
|
+
* `quantum` - Quantum compression (experimental)
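To make the `mszip` option more concrete: an MSZIP block is commonly described as a two-byte `CK` signature followed by a raw DEFLATE (RFC 1951) stream. The sketch below round-trips one such block with Ruby's standard `zlib` library. The helpers `mszip_block` and `inflate_mszip_block` are illustrative names, not part of Cabriolet, and CAB framing (block headers, checksums, the per-block size limit) is omitted.

[source,ruby]
----
require "zlib"

MSZIP_SIGNATURE = "CK".b

# Compress data as raw DEFLATE (negative window bits = no zlib header)
# and prefix it with the MSZIP block signature.
def mszip_block(data)
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  MSZIP_SIGNATURE + deflater.deflate(data, Zlib::FINISH)
ensure
  deflater&.close
end

# Check the signature, then inflate the raw DEFLATE payload after it.
def inflate_mszip_block(block)
  bytes = block.b
  raise ArgumentError, "missing CK signature" unless bytes.start_with?(MSZIP_SIGNATURE)

  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  inflater.inflate(bytes[2..])
ensure
  inflater&.close
end

block = mszip_block("hello hello hello")
puts inflate_mszip_block(block) # prints "hello hello hello"
----

Because the payload is plain DEFLATE, this is the one CAB algorithm that standard tooling can help inspect when debugging a damaged block.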
|
|
216
|
+
|
|
217
|
+
==== CHM (HTML Help) operations
|
|
218
|
+
|
|
219
|
+
===== List CHM contents
|
|
220
|
+
|
|
221
|
+
[source,shell]
|
|
222
|
+
----
|
|
223
|
+
cabriolet chm-list help.chm
|
|
224
|
+
----
|
|
225
|
+
|
|
226
|
+
===== Extract CHM files
|
|
227
|
+
|
|
228
|
+
[source,shell]
|
|
229
|
+
----
|
|
230
|
+
cabriolet chm-extract help.chm output/
|
|
231
|
+
----
|
|
232
|
+
|
|
233
|
+
===== Show CHM information
|
|
234
|
+
|
|
235
|
+
[source,shell]
|
|
236
|
+
----
|
|
237
|
+
cabriolet chm-info help.chm
|
|
238
|
+
----
|
|
239
|
+
|
|
240
|
+
===== Create CHM file
|
|
241
|
+
|
|
242
|
+
[source,shell]
|
|
243
|
+
----
|
|
244
|
+
cabriolet chm-create help.chm index.html page1.html page2.html
|
|
245
|
+
cabriolet chm-create help.chm docs/*.html --window-bits 16
|
|
246
|
+
----
|
|
247
|
+
|
|
248
|
+
**Options**:
|
|
249
|
+
|
|
250
|
+
* `--window-bits` - LZX window size (15-21, default: 16)
|
|
251
|
+
* `--verbose` - Enable verbose output
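As a rough guide to what `--window-bits` means, the LZX window size is two raised to the given number of bits. The helper below is a hypothetical illustration of that arithmetic for the documented 15-21 range; it is not a Cabriolet method.

[source,ruby]
----
# Convert an LZX window-bits value into a window size in bytes.
def lzx_window_size(window_bits)
  unless (15..21).cover?(window_bits)
    raise ArgumentError, "window_bits must be between 15 and 21"
  end

  1 << window_bits
end

puts lzx_window_size(16) # prints 65536
----

So the default of 16 corresponds to a 64 KiB window, while 21 corresponds to 2 MiB.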
|
|
252
|
+
|
|
253
|
+
==== SZDD operations
|
|
254
|
+
|
|
255
|
+
===== Expand SZDD file
|
|
256
|
+
|
|
257
|
+
[source,shell]
|
|
258
|
+
----
|
|
259
|
+
cabriolet expand file.tx_
|
|
260
|
+
cabriolet expand file.tx_ output.txt
|
|
261
|
+
----
|
|
262
|
+
|
|
263
|
+
===== Compress to SZDD
|
|
264
|
+
|
|
265
|
+
[source,shell]
|
|
266
|
+
----
|
|
267
|
+
cabriolet compress file.txt
|
|
268
|
+
cabriolet compress file.txt --missing-char t
|
|
269
|
+
cabriolet compress file.txt --format qbasic
|
|
270
|
+
----
|
|
271
|
+
|
|
272
|
+
**Options**:
|
|
273
|
+
|
|
274
|
+
* `--missing-char` - Last character of original filename
|
|
275
|
+
* `--format` - Format type (`normal` or `qbasic`)
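The `--missing-char` option exists because SZDD file names conventionally replace their last character with an underscore (`file.tx_` for `file.txt`). A minimal sketch of restoring the name, assuming a hypothetical helper called `restore_szdd_name` that is not part of Cabriolet:

[source,ruby]
----
# Rebuild the original file name from an SZDD name and the stored character.
# Names that do not end in "_" (or a missing character of nil) are returned as-is.
def restore_szdd_name(compressed_name, missing_char)
  return compressed_name if missing_char.nil? || !compressed_name.end_with?("_")

  compressed_name.chomp("_") + missing_char
end

puts restore_szdd_name("file.tx_", "t") # prints "file.txt"
----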
|
|
276
|
+
|
|
277
|
+
===== Show SZDD information
|
|
278
|
+
|
|
279
|
+
[source,shell]
|
|
280
|
+
----
|
|
281
|
+
cabriolet szdd-info file.tx_
|
|
282
|
+
----
|
|
283
|
+
|
|
284
|
+
==== KWAJ operations
|
|
285
|
+
|
|
286
|
+
===== Extract KWAJ file
|
|
287
|
+
|
|
288
|
+
[source,shell]
|
|
289
|
+
----
|
|
290
|
+
cabriolet kwaj-extract setup.kwj
|
|
291
|
+
cabriolet kwaj-extract setup.kwj output.exe
|
|
292
|
+
----
|
|
293
|
+
|
|
294
|
+
===== Compress to KWAJ
|
|
295
|
+
|
|
296
|
+
[source,shell]
|
|
297
|
+
----
|
|
298
|
+
cabriolet kwaj-compress file.exe
|
|
299
|
+
cabriolet kwaj-compress file.exe --compression szdd --include-length
|
|
300
|
+
cabriolet kwaj-compress file.exe --filename original.exe
|
|
301
|
+
----
|
|
302
|
+
|
|
303
|
+
**Compression options**:
|
|
304
|
+
|
|
305
|
+
* `none` - Uncompressed
|
|
306
|
+
* `xor` - XOR obfuscation (each byte XORed with 0xFF)
|
|
307
|
+
* `szdd` - LZSS compression (default)
|
|
308
|
+
* `mszip` - MSZIP compression
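The `xor` method is simple enough to show in full: every byte is XORed with 0xFF, so applying the same transform twice restores the input. This is a standalone sketch with a hypothetical helper name (`kwaj_xor`), not Cabriolet's implementation, and it covers only the payload transform, not the KWAJ header.

[source,ruby]
----
# XOR every byte with 0xFF; the operation is its own inverse.
def kwaj_xor(data)
  data.b.each_byte.map { |byte| byte ^ 0xFF }.pack("C*")
end

scrambled = kwaj_xor("MZ")
puts kwaj_xor(scrambled) # prints "MZ"
----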
|
|
309
|
+
|
|
310
|
+
**Other options**:
|
|
311
|
+
|
|
312
|
+
* `--include-length` - Include uncompressed length in header
|
|
313
|
+
* `--filename` - Embed original filename
|
|
314
|
+
|
|
315
|
+
===== Show KWAJ information
|
|
316
|
+
|
|
317
|
+
[source,shell]
|
|
318
|
+
----
|
|
319
|
+
cabriolet kwaj-info setup.kwj
|
|
320
|
+
----
|
|
321
|
+
|
|
322
|
+
==== HLP (Windows Help) operations
|
|
323
|
+
|
|
324
|
+
===== Extract HLP file
|
|
325
|
+
|
|
326
|
+
[source,shell]
|
|
327
|
+
----
|
|
328
|
+
cabriolet hlp-extract help.hlp output/
|
|
329
|
+
----
|
|
330
|
+
|
|
331
|
+
===== Create HLP file
|
|
332
|
+
|
|
333
|
+
[source,shell]
|
|
334
|
+
----
|
|
335
|
+
cabriolet hlp-create output.hlp topic1.txt topic2.txt
|
|
336
|
+
----
|
|
337
|
+
|
|
338
|
+
===== Show HLP information
|
|
339
|
+
|
|
340
|
+
[source,shell]
|
|
341
|
+
----
|
|
342
|
+
cabriolet hlp-info help.hlp
|
|
343
|
+
----
|
|
344
|
+
|
|
345
|
+
==== LIT (eBook) operations
|
|
346
|
+
|
|
347
|
+
===== Extract LIT file
|
|
348
|
+
|
|
349
|
+
[source,shell]
|
|
350
|
+
----
|
|
351
|
+
cabriolet lit-extract book.lit output/
|
|
352
|
+
----
|
|
353
|
+
|
|
354
|
+
NOTE: DES-encrypted (DRM-protected) LIT files are not supported. For encrypted
|
|
355
|
+
files, use Microsoft Reader or convert to another format first.
|
|
356
|
+
|
|
357
|
+
===== Create LIT file
|
|
358
|
+
|
|
359
|
+
[source,shell]
|
|
360
|
+
----
|
|
361
|
+
cabriolet lit-create book.lit chapter1.html chapter2.html
|
|
362
|
+
----
|
|
363
|
+
|
|
364
|
+
===== Show LIT information
|
|
365
|
+
|
|
366
|
+
[source,shell]
|
|
367
|
+
----
|
|
368
|
+
cabriolet lit-info book.lit
|
|
369
|
+
----
|
|
370
|
+
|
|
371
|
+
==== OAB (Address Book) operations
|
|
372
|
+
|
|
373
|
+
===== Extract OAB file
|
|
374
|
+
|
|
375
|
+
[source,shell]
|
|
376
|
+
----
|
|
377
|
+
cabriolet oab-extract contacts.lzx output.oab
|
|
378
|
+
cabriolet oab-extract patch.lzx output.oab --base contacts.oab
|
|
379
|
+
----
|
|
380
|
+
|
|
381
|
+
**Options**:
|
|
382
|
+
|
|
383
|
+
* `--base` - Base file for incremental patch application
|
|
384
|
+
|
|
385
|
+
===== Create OAB file
|
|
386
|
+
|
|
387
|
+
[source,shell]
|
|
388
|
+
----
|
|
389
|
+
cabriolet oab-create contacts.oab output.lzx
|
|
390
|
+
cabriolet oab-create new.oab patch.lzx --base old.oab
|
|
391
|
+
----
|
|
392
|
+
|
|
393
|
+
**Options**:
|
|
394
|
+
|
|
395
|
+
* `--base` - Create incremental patch
|
|
396
|
+
* `--block-size` - LZX block size (default: 32768)
|
|
397
|
+
|
|
398
|
+
===== Show OAB information
|
|
399
|
+
|
|
400
|
+
[source,shell]
|
|
401
|
+
----
|
|
402
|
+
cabriolet oab-info contacts.lzx
|
|
403
|
+
----
|
|
404
|
+
|
|
405
|
+
==== Global options
|
|
406
|
+
|
|
407
|
+
All commands support:
|
|
408
|
+
|
|
409
|
+
* `--verbose, -v` - Enable verbose output
|
|
410
|
+
* `--help, -h` - Show command help
|
|
411
|
+
|
|
412
|
+
=== Ruby API
|
|
413
|
+
|
|
414
|
+
==== CAB operations
|
|
415
|
+
|
|
416
|
+
===== Basic extraction
|
|
417
|
+
|
|
418
|
+
[source,ruby]
|
|
419
|
+
----
|
|
420
|
+
require "cabriolet"
|
|
421
|
+
|
|
422
|
+
# Open and extract
|
|
423
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
424
|
+
cabinet = decompressor.open("example.cab")
|
|
425
|
+
|
|
426
|
+
# List files
|
|
427
|
+
cabinet.files.each do |file|
|
|
428
|
+
puts "#{file.filename}: #{file.length} bytes"
|
|
429
|
+
end
|
|
430
|
+
|
|
431
|
+
# Extract single file
|
|
432
|
+
file = cabinet.files.first
|
|
433
|
+
decompressor.extract_file(file, "output.txt")
|
|
434
|
+
|
|
435
|
+
# Extract all files
|
|
436
|
+
decompressor.extract_all(cabinet, "output/")
|
|
437
|
+
----
|
|
438
|
+
|
|
439
|
+
===== Advanced extraction options
|
|
440
|
+
|
|
441
|
+
[source,ruby]
|
|
442
|
+
----
|
|
443
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
444
|
+
decompressor.salvage = true # Enable salvage mode
|
|
445
|
+
decompressor.fix_mszip = true # Enable MSZIP error recovery
|
|
446
|
+
decompressor.buffer_size = 8192 # Set buffer size
|
|
447
|
+
|
|
448
|
+
cabinet = decompressor.open("example.cab")
|
|
449
|
+
decompressor.extract_all(cabinet, "output/")
|
|
450
|
+
----
|
|
451
|
+
|
|
452
|
+
===== Multi-part cabinets
|
|
453
|
+
|
|
454
|
+
[source,ruby]
|
|
455
|
+
----
|
|
456
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
457
|
+
|
|
458
|
+
# Open first cabinet
|
|
459
|
+
cab1 = decompressor.open("disk1.cab")
|
|
460
|
+
|
|
461
|
+
# Open and append subsequent parts
|
|
462
|
+
cab2 = decompressor.open("disk2.cab")
|
|
463
|
+
decompressor.append(cab1, cab2)
|
|
464
|
+
|
|
465
|
+
cab3 = decompressor.open("disk3.cab")
|
|
466
|
+
decompressor.append(cab2, cab3)
|
|
467
|
+
|
|
468
|
+
# Extract from merged cabinet set
|
|
469
|
+
decompressor.extract_all(cab1, "output/")
|
|
470
|
+
----
|
|
471
|
+
|
|
472
|
+
===== Search for embedded cabinets
|
|
473
|
+
|
|
474
|
+
[source,ruby]
|
|
475
|
+
----
|
|
476
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
477
|
+
cabinet = decompressor.search("installer.exe")
|
|
478
|
+
|
|
479
|
+
while cabinet
|
|
480
|
+
puts "Cabinet at offset #{cabinet.base_offset}"
|
|
481
|
+
puts " Files: #{cabinet.file_count}"
|
|
482
|
+
|
|
483
|
+
# Extract this cabinet
|
|
484
|
+
decompressor.extract_all(cabinet, "output_#{cabinet.base_offset}/")
|
|
485
|
+
|
|
486
|
+
# Move to next found cabinet
|
|
487
|
+
cabinet = cabinet.next
|
|
488
|
+
end
|
|
489
|
+
----
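Conceptually, searching for embedded cabinets starts by scanning the host file for the `MSCF` signature that opens every CAB header. The sketch below shows only that first step on an in-memory string; `find_cab_signatures` is a hypothetical helper, and a real search (including the one above) would also need to validate each candidate header.

[source,ruby]
----
# Return every byte offset at which the CAB signature "MSCF" appears.
def find_cab_signatures(bytes)
  data = bytes.b
  signature = "MSCF".b
  offsets = []
  position = 0

  while (index = data.index(signature, position))
    offsets << index
    position = index + 1
  end

  offsets
end

sample = ("\x00" * 4) + "MSCF" + "xx" + "MSCF"
puts find_cab_signatures(sample).inspect # prints [4, 10]
----

Offsets found this way are only candidates, since the four signature bytes can also occur by chance inside unrelated data.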
|
|
490
|
+
|
|
491
|
+
===== Create CAB file
|
|
492
|
+
|
|
493
|
+
[source,ruby]
|
|
494
|
+
----
|
|
495
|
+
compressor = Cabriolet::CAB::Compressor.new
|
|
496
|
+
|
|
497
|
+
# Add files
|
|
498
|
+
compressor.add_file("README.txt")
|
|
499
|
+
compressor.add_file("data.bin", "custom/path.bin")
|
|
500
|
+
|
|
501
|
+
# Generate cabinet
|
|
502
|
+
bytes = compressor.generate("output.cab",
|
|
503
|
+
compression: :mszip,
|
|
504
|
+
set_id: 12345,
|
|
505
|
+
cabinet_index: 0)
|
|
506
|
+
|
|
507
|
+
puts "Created output.cab (#{bytes} bytes)"
|
|
508
|
+
----
|
|
509
|
+
|
|
510
|
+
**Compression options**:
|
|
511
|
+
|
|
512
|
+
* `:none` - No compression
|
|
513
|
+
* `:lzss` - LZSS compression
|
|
514
|
+
* `:mszip` - MSZIP/DEFLATE compression (recommended)
|
|
515
|
+
* `:lzx` - LZX compression (best ratio)
|
|
516
|
+
* `:quantum` - Quantum compression (experimental)
|
|
517
|
+
|
|
518
|
+
==== CHM operations
|
|
519
|
+
|
|
520
|
+
===== Extract CHM files
|
|
521
|
+
|
|
522
|
+
[source,ruby]
|
|
523
|
+
----
|
|
524
|
+
decompressor = Cabriolet::CHM::Decompressor.new
|
|
525
|
+
chm = decompressor.open("help.chm")
|
|
526
|
+
|
|
527
|
+
# List files
|
|
528
|
+
chm.files&.each do |file|
|
|
529
|
+
puts file.filename
|
|
530
|
+
end
|
|
531
|
+
|
|
532
|
+
# Extract single file
|
|
533
|
+
file = chm.files.first
|
|
534
|
+
decompressor.extract(file, "output.html") if file
|
|
535
|
+
|
|
536
|
+
# Extract all files (FileUtils needs `require "fileutils"`)
|
|
537
|
+
chm.files&.each do |file|
|
|
538
|
+
output_path = File.join("output", file.filename)
|
|
539
|
+
FileUtils.mkdir_p(File.dirname(output_path))
|
|
540
|
+
decompressor.extract(file, output_path)
|
|
541
|
+
end
|
|
542
|
+
----
|
|
543
|
+
|
|
544
|
+
===== Fast CHM parsing
|
|
545
|
+
|
|
546
|
+
[source,ruby]
|
|
547
|
+
----
|
|
548
|
+
decompressor = Cabriolet::CHM::Decompressor.new
|
|
549
|
+
|
|
550
|
+
# Quick open (headers only, no file enumeration)
|
|
551
|
+
chm = decompressor.fast_open("help.chm")
|
|
552
|
+
|
|
553
|
+
# Find specific file quickly
|
|
554
|
+
file = Cabriolet::Models::CHMFile.new
|
|
555
|
+
result = decompressor.fast_find(chm, "/index.html", file)
|
|
556
|
+
|
|
557
|
+
if file.length > 0
|
|
558
|
+
decompressor.extract(file, "index.html")
|
|
559
|
+
end
|
|
560
|
+
----
|
|
561
|
+
|
|
562
|
+
===== Create CHM file
|
|
563
|
+
|
|
564
|
+
[source,ruby]
|
|
565
|
+
----
|
|
566
|
+
compressor = Cabriolet::CHM::Compressor.new
|
|
567
|
+
|
|
568
|
+
# Add files
|
|
569
|
+
compressor.add_file("index.html", "/index.html", section: :compressed)
|
|
570
|
+
compressor.add_file("image.png", "/images/image.png", section: :uncompressed)
|
|
571
|
+
|
|
572
|
+
# Generate CHM
|
|
573
|
+
bytes = compressor.generate("help.chm",
|
|
574
|
+
window_bits: 16,
|
|
575
|
+
language_id: 0x0409)
|
|
576
|
+
|
|
577
|
+
puts "Created help.chm (#{bytes} bytes)"
|
|
578
|
+
----
|
|
579
|
+
|
|
580
|
+
**Options**:
|
|
581
|
+
|
|
582
|
+
* `window_bits` - LZX window size (15-21, default: 16)
|
|
583
|
+
* `language_id` - Language identifier (default: 0x0409 for English US)
|
|
584
|
+
* `timestamp` - Custom timestamp (default: current time)
|
|
585
|
+
|
|
586
|
+
==== SZDD operations
|
|
587
|
+
|
|
588
|
+
===== Expand SZDD file
|
|
589
|
+
|
|
590
|
+
[source,ruby]
|
|
591
|
+
----
|
|
592
|
+
decompressor = Cabriolet::SZDD::Decompressor.new
|
|
593
|
+
|
|
594
|
+
# Open and get header
|
|
595
|
+
header = decompressor.open("file.tx_")
|
|
596
|
+
|
|
597
|
+
puts "Format: #{header.format_name}"
|
|
598
|
+
puts "Length: #{header.length} bytes"
|
|
599
|
+
puts "Missing char: #{header.missing_char}" if header.missing_char
|
|
600
|
+
|
|
601
|
+
# Extract
|
|
602
|
+
decompressor.extract(header, "file.txt")
|
|
603
|
+
|
|
604
|
+
# Or one-shot
|
|
605
|
+
decompressor.decompress("file.tx_", "file.txt")
|
|
606
|
+
----
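For orientation, the fields printed above come from a small fixed header. The sketch below assumes the commonly documented layout of the standard (non-QBasic) SZDD header: an 8-byte signature, a one-byte compression mode, the one-byte missing character, and a 32-bit little-endian uncompressed length. `parse_szdd_header` and the returned Hash keys are hypothetical and do not mirror Cabriolet's `SZDDHeader` model.

[source,ruby]
----
SZDD_MAGIC = "SZDD\x88\xF0\x27\x33".b
SZDD_HEADER_SIZE = 14

# Parse the fixed-size header of a standard SZDD file.
# Returns nil when the data is too short or the signature does not match.
def parse_szdd_header(bytes)
  head = bytes.b
  return nil unless head.bytesize >= SZDD_HEADER_SIZE && head.start_with?(SZDD_MAGIC)

  mode, missing, length = head.byteslice(8, 6).unpack("a1a1V")
  {
    mode: mode,
    missing_char: (missing == "\x00".b ? nil : missing),
    length: length,
  }
end

header = SZDD_MAGIC + "A" + "t" + [1234].pack("V")
puts parse_szdd_header(header)[:length] # prints 1234
----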
|
|
607
|
+
|
|
608
|
+
===== Compress to SZDD
|
|
609
|
+
|
|
610
|
+
[source,ruby]
|
|
611
|
+
----
|
|
612
|
+
compressor = Cabriolet::SZDD::Compressor.new
|
|
613
|
+
|
|
614
|
+
# Compress file
|
|
615
|
+
bytes = compressor.compress("file.txt", "file.tx_",
|
|
616
|
+
missing_char: "t",
|
|
617
|
+
format: :normal)
|
|
618
|
+
|
|
619
|
+
# Or compress data from memory
|
|
620
|
+
bytes = compressor.compress_data("Hello, world!", "output.tx_")
|
|
621
|
+
----
|
|
622
|
+
|
|
623
|
+
**Format options**:
|
|
624
|
+
|
|
625
|
+
* `:normal` - Standard SZDD format (MS-DOS compatible)
|
|
626
|
+
* `:qbasic` - QBasic SZDD format
|
|
627
|
+
|
|
628
|
+
==== KWAJ operations
|
|
629
|
+
|
|
630
|
+
===== Extract KWAJ file
|
|
631
|
+
|
|
632
|
+
[source,ruby]
|
|
633
|
+
----
|
|
634
|
+
decompressor = Cabriolet::KWAJ::Decompressor.new
|
|
635
|
+
|
|
636
|
+
# Open and get header
|
|
637
|
+
header = decompressor.open("setup.kwj")
|
|
638
|
+
|
|
639
|
+
puts "Compression: #{header.compression_name}"
|
|
640
|
+
puts "Length: #{header.length} bytes" if header.length
|
|
641
|
+
puts "Filename: #{header.filename}" if header.filename
|
|
642
|
+
|
|
643
|
+
# Extract
|
|
644
|
+
decompressor.extract(header, "setup.kwj", "output.exe")
|
|
645
|
+
|
|
646
|
+
# Or one-shot
|
|
647
|
+
decompressor.decompress("setup.kwj", "setup.exe")
|
|
648
|
+
----
|
|
649
|
+
|
|
650
|
+
===== Compress to KWAJ
|
|
651
|
+
|
|
652
|
+
[source,ruby]
|
|
653
|
+
----
|
|
654
|
+
compressor = Cabriolet::KWAJ::Compressor.new
|
|
655
|
+
|
|
656
|
+
# Compress file
|
|
657
|
+
bytes = compressor.compress("file.exe", "file.kwj",
|
|
658
|
+
compression: :szdd,
|
|
659
|
+
include_length: true,
|
|
660
|
+
filename: "original.exe")
|
|
661
|
+
|
|
662
|
+
# Compression options: :none, :xor, :szdd, :mszip
|
|
663
|
+
----
|
|
664
|
+
|
|
665
|
+
==== HLP (Windows Help) operations
|
|
666
|
+
|
|
667
|
+
===== Extract HLP file
|
|
668
|
+
|
|
669
|
+
[source,ruby]
|
|
670
|
+
----
|
|
671
|
+
decompressor = Cabriolet::HLP::Decompressor.new
|
|
672
|
+
hlp = decompressor.open("help.hlp")
|
|
673
|
+
|
|
674
|
+
# Extract files
|
|
675
|
+
hlp.files.each do |file|
|
|
676
|
+
decompressor.extract_file(file, "output/#{file.filename}")
|
|
677
|
+
end
|
|
678
|
+
----
|
|
679
|
+
|
|
680
|
+
===== Create HLP file
|
|
681
|
+
|
|
682
|
+
[source,ruby]
|
|
683
|
+
----
|
|
684
|
+
compressor = Cabriolet::HLP::Compressor.new
|
|
685
|
+
|
|
686
|
+
# Add files
|
|
687
|
+
compressor.add_file("topic1.txt", "topic1")
|
|
688
|
+
compressor.add_file("topic2.txt", "topic2")
|
|
689
|
+
|
|
690
|
+
# Generate HLP
|
|
691
|
+
bytes = compressor.generate("help.hlp")
|
|
692
|
+
----
|
|
693
|
+
|
|
694
|
+
NOTE: HLP format has no public specification. Implementation is based on
|
|
695
|
+
libmspack source code.
|
|
696
|
+
|
|
697
|
+
==== LIT (eBook) operations
|
|
698
|
+
|
|
699
|
+
===== Extract LIT file
|
|
700
|
+
|
|
701
|
+
[source,ruby]
|
|
702
|
+
----
|
|
703
|
+
decompressor = Cabriolet::LIT::Decompressor.new
|
|
704
|
+
|
|
705
|
+
begin
|
|
706
|
+
lit = decompressor.open("book.lit")
|
|
707
|
+
|
|
708
|
+
if lit.encrypted
|
|
709
|
+
raise NotImplementedError, "LIT file is DRM-encrypted. Decryption not supported."
|
|
710
|
+
end
|
|
711
|
+
|
|
712
|
+
# Extract files
|
|
713
|
+
lit.files.each do |file|
|
|
714
|
+
decompressor.extract_file(file, "output/#{file.filename}")
|
|
715
|
+
end
|
|
716
|
+
rescue NotImplementedError => e
|
|
717
|
+
puts "Error: #{e.message}"
|
|
718
|
+
end
|
|
719
|
+
----
|
|
720
|
+
|
|
721
|
+
===== Create LIT file
|
|
722
|
+
|
|
723
|
+
[source,ruby]
|
|
724
|
+
----
|
|
725
|
+
compressor = Cabriolet::LIT::Compressor.new
|
|
726
|
+
|
|
727
|
+
compressor.add_file("content.html", "/content.html")
|
|
728
|
+
bytes = compressor.generate("book.lit")
|
|
729
|
+
----
|
|
730
|
+
|
|
731
|
+
**Limitations**:
|
|
732
|
+
|
|
733
|
+
* DES encryption (DRM) is intentionally not supported
|
|
734
|
+
* For encrypted LIT files, decrypt with Microsoft Reader first
|
|
735
|
+
|
|
736
|
+
==== OAB (Offline Address Book) operations
|
|
737
|
+
|
|
738
|
+
===== Extract OAB file
|
|
739
|
+
|
|
740
|
+
[source,ruby]
|
|
741
|
+
----
|
|
742
|
+
decompressor = Cabriolet::OAB::Decompressor.new
|
|
743
|
+
|
|
744
|
+
# Extract full file
|
|
745
|
+
decompressor.decompress("contacts.lzx", "contacts.oab")
|
|
746
|
+
|
|
747
|
+
# Apply incremental patch
|
|
748
|
+
decompressor.decompress_incremental("patch.lzx", "base.oab", "new.oab")
|
|
749
|
+
----
|
|
750
|
+
|
|
751
|
+
===== Create OAB file
|
|
752
|
+
|
|
753
|
+
[source,ruby]
|
|
754
|
+
----
|
|
755
|
+
compressor = Cabriolet::OAB::Compressor.new
|
|
756
|
+
|
|
757
|
+
# Compress full file
|
|
758
|
+
compressor.compress("contacts.oab", "contacts.lzx")
|
|
759
|
+
|
|
760
|
+
# Create incremental patch
|
|
761
|
+
compressor.compress_incremental("new.oab", "old.oab", "patch.lzx")
|
|
762
|
+
----
|
|
763
|
+
|
|
764
|
+
=== Custom I/O handlers
|
|
765
|
+
|
|
766
|
+
==== In-memory operations
|
|
767
|
+
|
|
768
|
+
[source,ruby]
|
|
769
|
+
----
|
|
770
|
+
# Create custom I/O system
|
|
771
|
+
memory_io = Cabriolet::System::IOSystem.new
|
|
772
|
+
|
|
773
|
+
# Process entirely in memory
|
|
774
|
+
decompressor = Cabriolet::CAB::Decompressor.new(memory_io)
|
|
775
|
+
|
|
776
|
+
# Load CAB data
|
|
777
|
+
cab_data = File.binread("example.cab")
|
|
778
|
+
input = Cabriolet::System::MemoryHandle.new(cab_data)
|
|
779
|
+
cabinet = decompressor.parser.parse_handle(input, "example.cab")
|
|
780
|
+
|
|
781
|
+
# Extract to memory
|
|
782
|
+
file = cabinet.files.first
|
|
783
|
+
output = Cabriolet::System::MemoryHandle.new("", Cabriolet::Constants::MODE_WRITE)
|
|
784
|
+
# ... extract to memory handle
|
|
785
|
+
----
|
|
786
|
+
|
|
787
|
+
==== Custom I/O system
|
|
788
|
+
|
|
789
|
+
[source,ruby]
|
|
790
|
+
----
|
|
791
|
+
class CustomIOSystem < Cabriolet::System::IOSystem
|
|
792
|
+
def open(filename, mode)
|
|
793
|
+
# Custom open logic
|
|
794
|
+
end
|
|
795
|
+
|
|
796
|
+
def read(handle, bytes)
|
|
797
|
+
# Custom read logic
|
|
798
|
+
end
|
|
799
|
+
|
|
800
|
+
# ... implement other methods
|
|
801
|
+
end
|
|
802
|
+
|
|
803
|
+
# Use custom I/O
|
|
804
|
+
custom_io = CustomIOSystem.new
|
|
805
|
+
decompressor = Cabriolet::CAB::Decompressor.new(custom_io)
|
|
806
|
+
----
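To show what the elided method bodies might look like, here is a standalone in-memory stand-in built on `StringIO`. It deliberately does not subclass `Cabriolet::System::IOSystem` so that it runs on its own. Only `open` and `read` are taken from the example above; the `:read`/`:write` mode values and the `write`, `close`, and `contents` methods are assumptions made for this sketch.

[source,ruby]
----
require "stringio"

# Stand-in I/O system that keeps every "file" in a Hash of binary strings.
class InMemoryIO
  def initialize(files = {})
    @files = files.transform_values(&:b)
  end

  # mode is :read or :write (hypothetical values for this sketch).
  def open(filename, mode)
    case mode
    when :read
      StringIO.new(@files.fetch(filename), "r")
    when :write
      @files[filename] = String.new(encoding: Encoding::BINARY)
      StringIO.new(@files[filename], "w")
    else
      raise ArgumentError, "unknown mode: #{mode.inspect}"
    end
  end

  def read(handle, bytes)
    handle.read(bytes)
  end

  def write(handle, data)
    handle.write(data)
  end

  def close(handle)
    handle.close
  end

  # Returns the bytes stored under filename, or nil if nothing was written.
  def contents(filename)
    @files[filename]
  end
end

io = InMemoryIO.new({ "a.bin" => "hello" })
input = io.open("a.bin", :read)
puts io.read(input, 2) # prints "he"
io.close(input)
----

Keeping the handle operations on the I/O system object, rather than on the handles themselves, is what lets a decompressor stay unaware of whether it is talking to disk or memory.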
|
|
807
|
+
|
|
808
|
+
=== Error handling
|
|
809
|
+
|
|
810
|
+
==== Common errors
|
|
811
|
+
|
|
812
|
+
[source,ruby]
|
|
813
|
+
----
|
|
814
|
+
begin
|
|
815
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
816
|
+
cabinet = decompressor.open("example.cab")
|
|
817
|
+
decompressor.extract_all(cabinet, "output/")
|
|
818
|
+
rescue Cabriolet::IOError => e
|
|
819
|
+
puts "I/O error: #{e.message}"
|
|
820
|
+
rescue Cabriolet::ParseError => e
|
|
821
|
+
puts "Parse error: #{e.message}"
|
|
822
|
+
rescue Cabriolet::ChecksumError => e
|
|
823
|
+
puts "Checksum failed: #{e.message}"
|
|
824
|
+
rescue Cabriolet::DecompressionError => e
|
|
825
|
+
puts "Decompression error: #{e.message}"
|
|
826
|
+
rescue Cabriolet::Error => e
|
|
827
|
+
puts "General error: #{e.message}"
|
|
828
|
+
end
|
|
829
|
+
----
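The order of the `rescue` clauses above matters: Ruby uses the first clause that matches, so the specific classes must come before the general one. The sketch below demonstrates this with a stand-in hierarchy in a `Demo` module. It assumes, as the example above suggests but this README does not state outright, that the specific Cabriolet errors inherit from `Cabriolet::Error`.

[source,ruby]
----
# Stand-in error hierarchy; these are not Cabriolet's classes.
module Demo
  class Error < StandardError; end
  class ParseError < Error; end
  class ChecksumError < Error; end
end

# Runs the block and reports which rescue clause handled the failure.
def classify_failure
  yield
  :ok
rescue Demo::ParseError
  :parse_error
rescue Demo::ChecksumError
  :checksum_error
rescue Demo::Error
  :general_error
end

puts classify_failure { raise Demo::ParseError, "bad header" } # prints parse_error
puts classify_failure { raise Demo::Error, "something else" }  # prints general_error
----

If the general clause were listed first, it would catch every subclass and the specific handlers would never run.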
|
|
830
|
+
|
|
831
|
+
==== Salvage mode for corrupted files
|
|
832
|
+
|
|
833
|
+
[source,ruby]
|
|
834
|
+
----
|
|
835
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
836
|
+
decompressor.salvage = true # Enable error recovery
|
|
837
|
+
|
|
838
|
+
# Will skip bad files and continue
|
|
839
|
+
cabinet = decompressor.open("corrupted.cab")
|
|
840
|
+
decompressor.extract_all(cabinet, "output/")
|
|
841
|
+
----
|
|
842
|
+
|
|
843
|
+
==== Fix MSZIP errors
|
|
844
|
+
|
|
845
|
+
[source,ruby]
|
|
846
|
+
----
|
|
847
|
+
decompressor = Cabriolet::CAB::Decompressor.new
|
|
848
|
+
decompressor.fix_mszip = true # Ignore MSZIP checksums, recover from errors
|
|
849
|
+
|
|
850
|
+
cabinet = decompressor.open("example.cab")
|
|
851
|
+
decompressor.extract_all(cabinet, "output/")
|
|
852
|
+
----
|
|
853
|
+
|
|
854
|
+
=== API reference
|
|
855
|
+
|
|
856
|
+
==== Cabriolet::CAB::Decompressor
|
|
857
|
+
|
|
858
|
+
Main class for CAB file operations.
|
|
859
|
+
|
|
860
|
+
===== Class methods
|
|
861
|
+
|
|
862
|
+
`new(io_system = nil)`::
|
|
863
|
+
Creates a new decompressor instance.
|
|
864
|
+
+
|
|
865
|
+
Parameters:::
|
|
866
|
+
`io_system`::: Optional custom I/O system implementation
|
|
867
|
+
+
|
|
868
|
+
Returns:::
|
|
869
|
+
`Cabriolet::CAB::Decompressor`::: New decompressor instance
|
|
870
|
+
|
|
871
|
+
===== Instance methods
|
|
872
|
+
|
|
873
|
+
`open(filename)`::
|
|
874
|
+
Opens and parses a CAB file.
|
|
875
|
+
+
|
|
876
|
+
Parameters:::
|
|
877
|
+
`filename`::: Path to CAB file
|
|
878
|
+
+
|
|
879
|
+
Returns:::
|
|
880
|
+
`Cabriolet::Models::Cabinet`::: Parsed cabinet object
|
|
881
|
+
+
|
|
882
|
+
Raises:::
|
|
883
|
+
`Cabriolet::ParseError`::: If file is not valid CAB format
|
|
884
|
+
`Cabriolet::IOError`::: If file cannot be opened
|
|
885
|
+
|
|
886
|
+
`extract_file(file, output_path, **options)`::
|
|
887
|
+
Extracts a single file from the cabinet.
|
|
888
|
+
+
|
|
889
|
+
Parameters:::
|
|
890
|
+
`file`::: `Cabriolet::Models::File` object
|
|
891
|
+
`output_path`::: Where to write the file
|
|
892
|
+
`options`::: Optional hash (salvage, overwrite, etc.)
|
|
893
|
+
+
|
|
894
|
+
Returns:::
|
|
895
|
+
`Integer`::: Number of bytes extracted
|
|
896
|
+
|
|
897
|
+
`extract_all(cabinet, output_dir, **options)`::
|
|
898
|
+
Extracts all files from the cabinet.
|
|
899
|
+
+
|
|
900
|
+
Parameters:::
|
|
901
|
+
`cabinet`::: `Cabriolet::Models::Cabinet` object
|
|
902
|
+
`output_dir`::: Directory to extract to
|
|
903
|
+
`options`::: Optional keyword arguments
|
|
904
|
+
+
|
|
905
|
+
Returns:::
|
|
906
|
+
`Integer`::: Number of files extracted
|
|
907
|
+
|
|
908
|
+
`search(filename)`::
|
|
909
|
+
Searches for embedded cabinets in a file.
|
|
910
|
+
+
|
|
911
|
+
Parameters:::
|
|
912
|
+
`filename`::: File to search
|
|
913
|
+
+
|
|
914
|
+
Returns:::
|
|
915
|
+
`Cabriolet::Models::Cabinet`::: First cabinet found (use `.next` to reach the others)
|
|
916
|
+
`nil`::: If no cabinets found
|
|
917
|
+
|
|
918
|
+
`append(cabinet, next_cabinet)`::
|
|
919
|
+
Merges two cabinets in a multi-part set.
|
|
920
|
+
+
|
|
921
|
+
Parameters:::
|
|
922
|
+
`cabinet`::: First cabinet
|
|
923
|
+
`next_cabinet`::: Next cabinet in sequence
|
|
924
|
+
+
|
|
925
|
+
Returns:::
|
|
926
|
+
`void`
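As a rough illustration of what searching for embedded cabinets involves, the sketch below scans a byte string for the `MSCF` signature that begins every cabinet header. It is not Cabriolet's implementation: `candidate_cabinet_offsets` is a hypothetical helper, and a real search must also validate the header that follows each hit, because the same four bytes can occur by chance.

```ruby
# Hypothetical helper, not part of Cabriolet: list every byte offset at
# which the 4-byte CAB signature "MSCF" occurs in a binary string.
def candidate_cabinet_offsets(bytes)
  data = bytes.b          # binary copy, so indexes are byte offsets
  signature = "MSCF".b
  offsets = []
  position = 0
  while (hit = data.index(signature, position))
    offsets << hit
    position = hit + 1    # keep scanning after this hit
  end
  offsets
end

# A fake installer stub with two embedded signatures.
blob = "STUB".b + "MSCF".b + ("\x00".b * 8) + "MSCF".b
p candidate_cabinet_offsets(blob) # prints [4, 16]
```

Each offset is only a candidate; the header fields after the signature still have to be parsed before the hit can be treated as a cabinet.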
|
|
927
|
+
|
|
928
|
+
===== Attributes
|
|
929
|
+
|
|
930
|
+
`buffer_size`::
|
|
931
|
+
I/O buffer size in bytes (default: 4096)
|
|
932
|
+
|
|
933
|
+
`salvage`::
|
|
934
|
+
Enable salvage mode for corrupted files (default: false)
|
|
935
|
+
|
|
936
|
+
`fix_mszip`::
|
|
937
|
+
Enable MSZIP error recovery (default: false)
|
|
938
|
+
|
|
939
|
+
==== Cabriolet::CAB::Compressor
|
|
940
|
+
|
|
941
|
+
Class for creating CAB files.
|
|
942
|
+
|
|
943
|
+
===== Instance methods
|
|
944
|
+
|
|
945
|
+
`add_file(source_path, cab_path = nil)`::
|
|
946
|
+
Adds a file to the cabinet.
|
|
947
|
+
+
|
|
948
|
+
Parameters:::
|
|
949
|
+
`source_path`::: Path to source file
|
|
950
|
+
`cab_path`::: Path within cabinet (optional, defaults to basename)
|
|
951
|
+
|
|
952
|
+
`generate(output_file, **options)`::
|
|
953
|
+
Generates the cabinet file.
|
|
954
|
+
+
|
|
955
|
+
Parameters:::
|
|
956
|
+
`output_file`::: Path to output CAB file
|
|
957
|
+
`options`::: Keyword arguments such as `compression` and `set_id`
|
|
958
|
+
+
|
|
959
|
+
Returns:::
|
|
960
|
+
`Integer`::: Bytes written
|
|
961
|
+
|
|
962
|
+
**Example**:
|
|
963
|
+
[source,ruby]
|
|
964
|
+
----
|
|
965
|
+
compressor = Cabriolet::CAB::Compressor.new
|
|
966
|
+
compressor.add_file("file1.txt")
|
|
967
|
+
compressor.add_file("file2.txt")
|
|
968
|
+
bytes = compressor.generate("output.cab", compression: :mszip)
|
|
969
|
+
----
|
|
970
|
+
|
|
971
|
+
==== Compression Algorithm Status
|
|
972
|
+
|
|
973
|
+
[cols="1,1,1,3"]
|
|
974
|
+
|===
|
|
975
|
+
| Algorithm | Decompression | Compression | Notes
|
|
976
|
+
|
|
977
|
+
| **None**
|
|
978
|
+
| ✅ Working
|
|
979
|
+
| ✅ Working
|
|
980
|
+
| Uncompressed storage
|
|
981
|
+
|
|
982
|
+
| **LZSS**
|
|
983
|
+
| ✅ Working
|
|
984
|
+
| ✅ Working
|
|
985
|
+
| 4KB sliding window, 3 modes (EXPAND, MSHELP, QBASIC)
|
|
986
|
+
|
|
987
|
+
| **MSZIP**
|
|
988
|
+
| ✅ Working
|
|
989
|
+
| ✅ Working
|
|
990
|
+
| DEFLATE/RFC 1951, fixed Huffman
|
|
991
|
+
|
|
992
|
+
| **LZX**
|
|
993
|
+
| ✅ Working
|
|
994
|
+
| ✅ Working
|
|
995
|
+
| UNCOMPRESSED blocks, 32KB-2MB window
|
|
996
|
+
|
|
997
|
+
| **Quantum**
|
|
998
|
+
| ✅ Working
|
|
999
|
+
| ⚠️ Functional
|
|
1000
|
+
| Literals + short matches work. Complex patterns pending.
|
|
1001
|
+
|===
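To make the MSZIP row concrete: an MSZIP block is a two-byte `CK` signature followed by a raw DEFLATE (RFC 1951) stream, so Ruby's standard `zlib` library can decode a single standalone block. The sketch below is illustrative only and does not show how Cabriolet implements MSZIP. The helper names are hypothetical, and it ignores the fact that blocks in a real cabinet may refer back to data from the preceding block.

```ruby
require "zlib"

# Hypothetical helpers for illustration; not Cabriolet's API.

# Wrap data as one MSZIP-style block: "CK" followed by a raw DEFLATE stream.
def mszip_block_pack(data)
  # Negative window bits ask zlib for raw DEFLATE (no zlib header/trailer).
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  "CK".b + deflater.deflate(data, Zlib::FINISH)
ensure
  deflater&.close
end

# Decode one standalone block, checking the signature first.
def mszip_block_unpack(block)
  raise ArgumentError, "missing CK signature" unless block.byteslice(0, 2) == "CK"

  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  inflater.inflate(block.byteslice(2..))
ensure
  inflater&.close
end

original = "hello cabinet " * 100
block = mszip_block_pack(original)
puts mszip_block_unpack(block) == original # prints true
```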
|
|
1002
|
+
|
|
1003
|
+
=== Configuration Options
|
|
1004
|
+
|
|
1005
|
+
==== Buffer Sizes
|
|
1006
|
+
|
|
1007
|
+
[source,ruby]
|
|
1008
|
+
----
|
|
1009
|
+
# Set default buffer size globally
|
|
1010
|
+
Cabriolet.default_buffer_size = 8192
|
|
1011
|
+
|
|
1012
|
+
# Or per decompressor
|
|
1013
|
+
decompressor.buffer_size = 16384
|
|
1014
|
+
----
|
|
1015
|
+
|
|
1016
|
+
==== Verbose Output
|
|
1017
|
+
|
|
1018
|
+
[source,ruby]
|
|
1019
|
+
----
|
|
1020
|
+
# Enable verbose output globally
|
|
1021
|
+
Cabriolet.verbose = true
|
|
1022
|
+
|
|
1023
|
+
# Or use --verbose flag in CLI
|
|
1024
|
+
# cabriolet extract file.cab --verbose
|
|
1025
|
+
----
|
|
1026
|
+
|
|
1027
|
+
=== Compression Algorithm Selection Guide
|
|
1028
|
+
|
|
1029
|
+
[cols="1,1,1,1,3"]
|
|
1030
|
+
|===
|
|
1031
|
+
| Algorithm | Ratio | Speed | Complexity | Use Case
|
|
1032
|
+
|
|
1033
|
+
| **None**
|
|
1034
|
+
| 1:1
|
|
1035
|
+
| Fastest
|
|
1036
|
+
| Trivial
|
|
1037
|
+
| Already compressed data, testing
|
|
1038
|
+
|
|
1039
|
+
| **LZSS**
|
|
1040
|
+
| 2-3:1
|
|
1041
|
+
| Fast
|
|
1042
|
+
| Low
|
|
1043
|
+
| Small files, compatibility
|
|
1044
|
+
|
|
1045
|
+
| **MSZIP**
|
|
1046
|
+
| 3-5:1
|
|
1047
|
+
| Medium
|
|
1048
|
+
| Medium
|
|
1049
|
+
| **Recommended** for most uses
|
|
1050
|
+
|
|
1051
|
+
| **LZX**
|
|
1052
|
+
| 5-10:1
|
|
1053
|
+
| Slow
|
|
1054
|
+
| High
|
|
1055
|
+
| Large files, best compression
|
|
1056
|
+
|
|
1057
|
+
| **Quantum**
|
|
1058
|
+
| 4-8:1
|
|
1059
|
+
| Medium
|
|
1060
|
+
| Very High
|
|
1061
|
+
| Experimental, use with caution
|
|
1062
|
+
|===
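The guidance in the table can be folded into a small decision helper. The sketch below is one possible reading of it, not part of Cabriolet: `suggest_compression` is hypothetical, the extension list and the 10 MiB threshold are arbitrary assumptions, and the `:none` and `:lzx` symbols are assumed by analogy with the `:mszip` option shown earlier.

```ruby
# Hypothetical helper, not part of Cabriolet. The extension list and the
# default threshold are assumptions chosen only for illustration.
def suggest_compression(path, size_bytes, large_threshold: 10 * 1024 * 1024)
  precompressed = %w[.zip .gz .jpg .jpeg .png .mp3 .mp4]
  # Already compressed data gains little from a second pass.
  return :none if precompressed.include?(File.extname(path).downcase)
  # Large inputs are where the better ratio is worth the slower speed.
  return :lzx if size_bytes >= large_threshold

  :mszip # the table's recommended default
end

p suggest_compression("photo.JPG", 5_000)            # prints :none
p suggest_compression("setup.msi", 50 * 1024 * 1024) # prints :lzx
p suggest_compression("readme.txt", 2_048)           # prints :mszip
```

Quantum is left out on purpose, since the table marks it as experimental.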
|
|
1063
|
+
|
|
1064
|
+
=== Return values
|
|
1065
|
+
|
|
1066
|
+
Every method either returns a value or raises an exception:
|
|
1067
|
+
|
|
1068
|
+
* **Decompression methods**: Return the number of bytes (`extract_file`) or files (`extract_all`) extracted, or raise an error
|
|
1069
|
+
* **Compression methods**: Return the number of bytes written or raise an error
|
|
1070
|
+
* **Parse methods**: Return model objects or raise `ParseError`
|
|
1071
|
+
* **File operations**: Return file handles or raise `IOError`
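Because each call either returns a value or raises, callers can wrap calls in a small helper that converts expected errors into a result pair. The sketch below is a minimal illustration of that pattern. `attempt` is a hypothetical helper, and the `Sketch` error classes are stand-ins so the code runs without the gem; with Cabriolet loaded, `Cabriolet::ParseError` and `Cabriolet::IOError` would be passed instead.

```ruby
# Stand-in error classes so the sketch runs without the gem installed.
module Sketch
  class ParseError < StandardError; end
  class IOError < StandardError; end
end

# Hypothetical helper: run the block and turn the listed exceptions
# into a [:failed, message] pair instead of letting them propagate.
def attempt(*error_classes)
  [:ok, yield]
rescue *error_classes => e
  [:failed, e.message]
end

p attempt(Sketch::ParseError, Sketch::IOError) { 1024 }
# prints [:ok, 1024]
p attempt(Sketch::ParseError, Sketch::IOError) { raise Sketch::ParseError, "bad header" }
# prints [:failed, "bad header"]
```

Exceptions that are not listed still propagate, so unrelated bugs are not silently swallowed.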
|
|
1072
|
+
|
|
1073
|
+
== Development
|
|
1074
|
+
|
|
1075
|
+
=== Building from source
|
|
1076
|
+
|
|
1077
|
+
[source,shell]
|
|
1078
|
+
----
|
|
1079
|
+
git clone https://github.com/omnizip/cabriolet.git
|
|
1080
|
+
cd cabriolet
|
|
1081
|
+
bundle install
|
|
1082
|
+
bundle exec rake
|
|
1083
|
+
----
|
|
1084
|
+
|
|
1085
|
+
=== Running tests
|
|
1086
|
+
|
|
1087
|
+
[source,shell]
|
|
1088
|
+
----
|
|
1089
|
+
bundle exec rspec
|
|
1090
|
+
----
|
|
1091
|
+
|
|
1092
|
+
|
|
1093
|
+
=== Running RuboCop
|
|
1094
|
+
|
|
1095
|
+
[source,shell]
|
|
1096
|
+
----
|
|
1097
|
+
bundle exec rubocop
|
|
1098
|
+
bundle exec rubocop -A # Auto-correct
|
|
1099
|
+
----
|
|
1100
|
+
|
|
1101
|
+
|
|
1102
|
+
== Known limitations
|
|
1103
|
+
|
|
1104
|
+
=== Quantum compression
|
|
1105
|
+
|
|
1106
|
+
Quantum compression is **functional but experimental**:
|
|
1107
|
+
|
|
1108
|
+
* ✅ **Decompression**: Fully working, production ready
|
|
1109
|
+
* ✅ **Compression**: Working for:
|
|
1110
|
+
** Simple literals
|
|
1111
|
+
** Short matches (3-4 bytes)
|
|
1112
|
+
** Basic patterns
|
|
1113
|
+
* ⚠️ **Limitations**:
|
|
1114
|
+
** Complex repeated patterns may fail
|
|
1115
|
+
** Very long matches (14+ bytes) have encoding issues
|
|
1116
|
+
** Recommended: Use LZSS, MSZIP, or LZX instead
|
|
1117
|
+
|
|
1118
|
+
=== LIT Format
|
|
1119
|
+
|
|
1120
|
+
* DES encryption (DRM) intentionally not supported
|
|
1121
|
+
* For DRM-protected LIT files, decrypt with Microsoft Reader first
|
|
1122
|
+
|
|
1123
|
+
=== HLP/LIT/OAB Formats
|
|
1124
|
+
|
|
1125
|
+
* No public format specifications available
|
|
1126
|
+
* Implementation based on libmspack source code
|
|
1127
|
+
* Cannot be fully validated without real test files
|
|
1128
|
+
* Basic functionality working, edge cases may exist
|
|
1129
|
+
|
|
1130
|
+
|
|
1131
|
+
== Troubleshooting
|
|
1132
|
+
|
|
1133
|
+
=== Extraction failures
|
|
1134
|
+
|
|
1135
|
+
Problem:: Invalid CAB signature
|
|
1136
|
+
|
|
1137
|
+
Solution:: The file may not be a CAB at all, or it may be corrupted. Try salvage mode:
|
|
1138
|
+
|
|
1139
|
+
[source,shell]
|
|
1140
|
+
----
|
|
1141
|
+
cabriolet extract --salvage corrupted.cab
|
|
1142
|
+
----
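Before reaching for salvage mode, it can help to check whether the file starts with a cabinet signature at all. Cabinet files begin with the four bytes `MSCF`; the sketch below tests for them using only Ruby's standard library. `cab_signature?` is a hypothetical helper, not part of Cabriolet, and a missing signature at offset 0 does not rule out a cabinet embedded later in the file.

```ruby
require "tempfile"

# Hypothetical helper, not part of Cabriolet: true when the file starts
# with the 4-byte CAB signature "MSCF".
def cab_signature?(path)
  File.binread(path, 4) == "MSCF"
end

# Demonstrate with a throwaway file that carries only a fake header.
Tempfile.create("sample") do |file|
  file.binmode
  file.write("MSCF" + "\x00" * 32)
  file.flush
  puts cab_signature?(file.path) # prints true
end
```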
|
|
1143
|
+
|
|
1144
|
+
Problem:: Checksum mismatch
|
|
1145
|
+
|
|
1146
|
+
Solution:: Enable error recovery:
|
|
1147
|
+
|
|
1148
|
+
[source,ruby]
|
|
1149
|
+
----
|
|
1150
|
+
decompressor.fix_mszip = true
|
|
1151
|
+
decompressor.salvage = true
|
|
1152
|
+
----
|
|
1153
|
+
|
|
1154
|
+
=== Performance issues
|
|
1155
|
+
|
|
1156
|
+
Problem:: Slow extraction
|
|
1157
|
+
|
|
1158
|
+
Solution:: Increase buffer size:
|
|
1159
|
+
|
|
1160
|
+
[source,ruby]
|
|
1161
|
+
----
|
|
1162
|
+
decompressor.buffer_size = 16384
|
|
1163
|
+
----
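The reasoning behind a larger buffer can be seen in a generic chunked copy loop: the same data is moved with fewer read calls. The sketch below is not Cabriolet's I/O layer. `copy_in_chunks` is a hypothetical helper, and `StringIO` stands in for real files to keep the example self-contained. Whether a larger buffer speeds up a given extraction depends on the workload.

```ruby
require "stringio"

# Hypothetical helper: copy input to output in buffer_size chunks and
# return how many read calls delivered data.
def copy_in_chunks(input, output, buffer_size)
  reads = 0
  while (chunk = input.read(buffer_size))
    output.write(chunk)
    reads += 1
  end
  reads
end

data = "x" * 65_536
p copy_in_chunks(StringIO.new(data), StringIO.new, 4_096)  # prints 16
p copy_in_chunks(StringIO.new(data), StringIO.new, 16_384) # prints 4
```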
|
|
1164
|
+
|
|
1165
|
+
|
|
1166
|
+
== Specifications
|
|
1167
|
+
|
|
1168
|
+
* https://en.wikipedia.org/wiki/Cabinet_(file_format)[Microsoft Cabinet File Format - Wikipedia]
|
|
1169
|
+
* https://www.rfc-editor.org/info/rfc1951[RFC 1951: DEFLATE Compressed Data Format Specification version 1.3, May 1996]
|
|
1170
|
+
* https://learn.microsoft.com/en-us/openspecs/exchange_server_protocols/ms-patch/cc78752a-b4af-4eee-88cb-01f4d8a4c2bf[[MS-PATCH\]: LZX DELTA Compression and Decompression]
|
|
1171
|
+
|
|
1172
|
+
|
|
1173
|
+
== Acknowledgments
|
|
1174
|
+
|
|
1175
|
+
A special thank you to Stuart Caie (aka Kyzer) who created the original
|
|
1176
|
+
libmspack and cabextract projects, and their contributors for:
|
|
1177
|
+
|
|
1178
|
+
* Comprehensive CAB format implementation
|
|
1179
|
+
* Excellent test coverage and test fixtures
|
|
1180
|
+
* Clear format documentation
|
|
1181
|
+
|
|
1182
|
+
Link to the libmspack/cabextract project:
|
|
1183
|
+
https://www.cabextract.org.uk/libmspack/
|
|
1184
|
+
|
|
1185
|
+
Cabriolet is inspired by and builds upon the foundation laid by these projects.
|
|
1186
|
+
|
|
1187
|
+
If performance is critical, Cabriolet is not the best choice. Consider using
|
|
1188
|
+
https://github.com/davispuh/ruby-libmspack[libmspack via FFI] for optimized
|
|
1189
|
+
speed.
|
|
1190
|
+
|
|
1191
|
+
|
|
1192
|
+
== License
|
|
1193
|
+
|
|
1194
|
+
BSD 3-Clause License. See the link:LICENSE[LICENSE] file for details.
|
|
1195
|
+
|
|
1196
|
+
Some test fixtures are from third-party projects. Test fixtures are **NOT**
|
|
1197
|
+
distributed with the gem and are only used for development and testing purposes.
|
|
1198
|
+
|
|
1199
|
+
These fixtures are sourced from the respective projects and retain their
|
|
1200
|
+
original licenses:
|
|
1201
|
+
|
|
1202
|
+
* Test fixtures in `spec/fixtures/libmspack/` are from the libmspack project
|
|
1203
|
+
(LGPL 2.1).
|
|
1204
|
+
|
|
1205
|
+
* Test fixtures in `spec/fixtures/cabextract/` are from cabextract (GPL 2.0+).
|
|
1206
|
+
|
|
1207
|
+
See fixture directories for individual attribution files.
|