cabriolet 0.1.0 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 4a38d30bad112e61071cb4bdc7bba28f643157ebccb05ddb2c06287c34170390
-   data.tar.gz: 7fd1a46132a0f7819c6d0403017a6dd1168768d6771bf0a5fcdad889b71017be
+   metadata.gz: edd5ee62d319aad301cffc319c13a9ec24c311aff284f77fd5a0612f598c14bd
+   data.tar.gz: b5bee141d6c010f845ff58104f4e99217c5f7c0fbbec5af94911e100fa2864e5
  SHA512:
-   metadata.gz: e94daccfc6d845112245662a0e0fc82a9143c48c44b4e5cd85d4f1366efa06552d4652a9a476489431f7baa6a0440a4f1be7ad0b4729c13e8918cb389ce91107
-   data.tar.gz: 3daf013cdcfa9a113fea8b71b5c105b8472b1abc6efd876651db8045a4abb9cf03914a0c3ed6397790149a22b67390ccc9e84a9f064028d95f2fde67b6139f5d
+   metadata.gz: 14f0bd63386aa18cf45dc0d4524513ee6cca16eb4537084244279c5f61c9b089dc519888083d498f0f0f1bdbd909716278e7787e9df28f594a93f2588ce61341
+   data.tar.gz: 007c90c8cb3aec1c43ff240810d532ed4ab237ec0b95713121536e64d100cced3dd150b1be76d8ff92fb1ed4a3ac236d4effd7e3f4cb4f9f35638c32109e456b
data/lib/cabriolet/platform.rb ADDED
@@ -0,0 +1,27 @@
+ # frozen_string_literal: true
+
+ module Cabriolet
+   # Platform detection for handling OS-specific behavior
+   module Platform
+     # Check if running on Windows
+     #
+     # @return [Boolean] true if on Windows (including MinGW, Cygwin)
+     def self.windows?
+       RUBY_PLATFORM =~ /mswin|mingw|cygwin/
+     end
+
+     # Check if running on Unix-like system
+     #
+     # @return [Boolean] true if on Unix (Linux, macOS, BSD, etc.)
+     def self.unix?
+       !windows?
+     end
+
+     # Check if platform supports Unix file permissions
+     #
+     # @return [Boolean] true if platform supports chmod with Unix permission bits
+     def self.supports_unix_permissions?
+       unix?
+     end
+   end
+ end
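The added `windows?` is documented as returning a Boolean, but `String#=~` yields an Integer match offset or `nil`, so callers that compare against `true` or `false` would not get what the YARD tag promises. A minimal sketch of a strictly Boolean variant follows, assuming a stand-in module named `PlatformSketch` with an injectable `platform` argument and an `apply_mode` helper. All three are hypothetical and are not part of the gem.

```ruby
# frozen_string_literal: true

# Illustrative stand-in only; this is NOT the gem's Cabriolet::Platform.
module PlatformSketch
  WINDOWS_PATTERN = /mswin|mingw|cygwin/

  # Regexp matching via String#match? always returns true or false.
  # The platform argument is an assumption added to make the check testable.
  def self.windows?(platform = RUBY_PLATFORM)
    platform.match?(WINDOWS_PATTERN)
  end

  def self.supports_unix_permissions?(platform = RUBY_PLATFORM)
    !windows?(platform)
  end

  # Hypothetical helper: apply permission bits only where they are meaningful.
  # Returns true when chmod was attempted, false when it was skipped.
  def self.apply_mode(path, mode, platform = RUBY_PLATFORM)
    return false unless supports_unix_permissions?(platform)

    File.chmod(mode, path)
    true
  end
end
```

Passing the platform string explicitly keeps the check easy to exercise on any host, for example `PlatformSketch.windows?("x64-mingw32")`.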
data/lib/cabriolet/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Cabriolet
-   VERSION = "0.1.0"
+   VERSION = "0.1.2"
  end
data/lib/cabriolet.rb CHANGED
@@ -1,7 +1,7 @@
  # frozen_string_literal: true
 
  require_relative "cabriolet/version"
- require_relative "cabriolet/errors"
+ require_relative "cabriolet/platform"
  require_relative "cabriolet/constants"
 
  # Cabriolet is a pure Ruby library for extracting Microsoft Cabinet (.CAB) files,
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: cabriolet
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.1.2
  platform: ruby
  authors:
  - Ribose Inc.
@@ -49,8 +49,6 @@ executables:
  extensions: []
  extra_rdoc_files: []
  files:
- - ARCHITECTURE.md
- - CHANGELOG.md
  - LICENSE
  - README.adoc
  - exe/cabriolet
@@ -115,6 +113,7 @@ files:
  - lib/cabriolet/oab/compressor.rb
  - lib/cabriolet/oab/decompressor.rb
  - lib/cabriolet/parallel.rb
+ - lib/cabriolet/platform.rb
  - lib/cabriolet/repairer.rb
  - lib/cabriolet/streaming.rb
  - lib/cabriolet/system/file_handle.rb
data/ARCHITECTURE.md DELETED
@@ -1,799 +0,0 @@
- # Cabriolet Architecture Plan
-
- ## Overview
-
- **Cabriolet** is a pure Ruby gem for extracting Microsoft compression formats, focusing primarily on CAB (Cabinet) files. This implementation is a Ruby port of libmspack and cabextract.
-
- ## Goals
-
- 1. **Pure Ruby Implementation**: No C extensions, fully portable
- 2. **Full CAB Format Support**: Handle all compression methods (MSZIP, LZX, Quantum)
- 3. **Extensible Design**: Easy to add support for CHM, LIT, HLP formats later
- 4. **Well-Tested**: Comprehensive test coverage using libmspack test files
- 5. **Performance**: Optimized for reasonable performance while maintaining readability
-
- ## Source Material
-
- - **libmspack**: https://github.com/kyz/libmspack (LGPL 2.1)
- - **Location**: `/Users/mulgogi/src/external/libmspack`
- - **Primary Files to Port**:
-   - `mspack/cabd.c` - CAB decompressor
-   - `mspack/lzxd.c` - LZX decompression
-   - `mspack/mszipd.c` - MSZIP decompression
-   - `mspack/qtmd.c` - Quantum decompression
-   - `mspack/lzssd.c` - LZSS decompression
-   - `mspack/system.c` - I/O abstraction
-
- ## Architecture
-
- ### High-Level Structure
-
- ```
- ┌─────────────────────────────────────────────────────────────┐
- │ Cabriolet Gem │
- ├─────────────────────────────────────────────────────────────┤
- │ │
- │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
- │ │ CLI │ │ Cabinet │ │ Models │ │
- │ │ Tool │ │ Extractor │ │ (Lutaml) │ │
- │ └──────────────┘ └──────────────┘ └──────────────┘ │
- │ │ │ │ │
- │ └─────────────────┴──────────────────┘ │
- │ │ │
- │ ┌────────────────────────┴────────────────────────┐ │
- │ │ CAB Decompressor (Core) │ │
- │ ├─────────────────────────────────────────────────┤ │
- │ │ • Cabinet Parser │ │
- │ │ • Folder/File Management │ │
- │ │ • Decompression Strategy Selection │ │
- │ └─────────────────────────────────────────────────┘ │
- │ │ │
- │ ┌────────────────────────┴────────────────────────┐ │
- │ │ Decompression Algorithms │ │
- │ ├─────────────────────────────────────────────────┤ │
- │ │ • MSZIP (Deflate) │ │
- │ │ • LZX │ │
- │ │ • Quantum │ │
- │ │ • LZSS │ │
- │ │ • None (Uncompressed) │ │
- │ └─────────────────────────────────────────────────┘ │
- │ │ │
- │ ┌────────────────────────┴────────────────────────┐ │
- │ │ Foundation Layer │ │
- │ ├─────────────────────────────────────────────────┤ │
- │ │ • System I/O Abstraction │ │
- │ │ • Binary I/O (Endianness handling) │ │
- │ │ • Bitstream Reader │ │
- │ │ • Huffman Tree Decoder │ │
- │ └─────────────────────────────────────────────────┘ │
- │ │
- └─────────────────────────────────────────────────────────────┘
- ```
-
- ### Directory Structure
-
- ```
- cabriolet/
- ├── lib/
- │   └── cabriolet/
- │       ├── version.rb
- │       ├── errors.rb
- │       ├── constants.rb
- │       │
- │       ├── system/  # System abstraction layer
- │       │   ├── io_system.rb  # File I/O abstraction
- │       │   ├── file_handle.rb  # File handle wrapper
- │       │   └── memory_handle.rb  # In-memory I/O
- │       │
- │       ├── binary/  # Binary I/O utilities
- │       │   ├── reader.rb  # Binary data reader
- │       │   ├── bitstream.rb  # Bitstream reader
- │       │   └── endian.rb  # Endianness handling
- │       │
- │       ├── huffman/  # Huffman decoding
- │       │   ├── tree.rb  # Huffman tree structure
- │       │   └── decoder.rb  # Huffman decoder
- │       │
- │       ├── models/  # Data models (Lutaml::Model)
- │       │   ├── cabinet.rb  # Cabinet structure
- │       │   ├── folder.rb  # Folder structure
- │       │   └── file.rb  # File structure
- │       │
- │       ├── decompressors/  # Decompression algorithms
- │       │   ├── base.rb  # Base decompressor
- │       │   ├── none.rb  # No compression
- │       │   ├── lzss.rb  # LZSS algorithm
- │       │   ├── mszip.rb  # MSZIP (deflate)
- │       │   ├── lzx.rb  # LZX algorithm
- │       │   └── quantum.rb  # Quantum algorithm
- │       │
- │       ├── cab/  # CAB format support
- │       │   ├── parser.rb  # CAB file parser
- │       │   ├── decompressor.rb  # Main decompressor
- │       │   └── extractor.rb  # File extraction
- │       │
- │       └── cli.rb  # Command-line interface
-
- ├── spec/
- │   ├── fixtures/  # Test CAB files
- │   ├── system/
- │   ├── binary/
- │   ├── huffman/
- │   ├── models/
- │   ├── decompressors/
- │   └── cab/
-
- ├── exe/
- │   └── cabriolet  # CLI executable
-
- ├── ARCHITECTURE.md
- ├── README.adoc
- ├── CHANGELOG.md
- ├── LICENSE
- ├── Gemfile
- └── cabriolet.gemspec
- ```
-
- ## Core Components
-
- ### 1. System Abstraction Layer
-
- **Purpose**: Abstract file I/O, memory management, and system calls
-
- **Files**:
- - `system/io_system.rb` - Main I/O abstraction
- - `system/file_handle.rb` - File operations wrapper
- - `system/memory_handle.rb` - In-memory operations
-
- **Design**:
- ```ruby
- module Cabriolet
-   module System
-     class IOSystem
-       def open(filename, mode)
-         # Returns FileHandle or MemoryHandle
-       end
-
-       def close(handle)
-         # Closes the handle
-       end
-
-       def read(handle, bytes)
-         # Reads bytes from handle
-       end
-
-       def write(handle, data)
-         # Writes data to handle
-       end
-
-       def seek(handle, offset, whence)
-         # Seeks to position
-       end
-
-       def tell(handle)
-         # Returns current position
-       end
-     end
-   end
- end
- ```
-
- ### 2. Binary I/O Layer
-
- **Purpose**: Handle binary data reading with proper endianness
-
- **Files**:
- - `binary/reader.rb` - Binary data reader
- - `binary/bitstream.rb` - Bitstream operations
- - `binary/endian.rb` - Endian conversion utilities
-
- **Key Features**:
- - Little-endian integer reading (CAB uses little-endian)
- - Bitstream reading for compressed data
- - Buffer management
-
- **Design**:
- ```ruby
- module Cabriolet
-   module Binary
-     class Reader
-       def read_uint16_le
-         # Read 16-bit little-endian unsigned integer
-       end
-
-       def read_uint32_le
-         # Read 32-bit little-endian unsigned integer
-       end
-
-       def read_bytes(count)
-         # Read raw bytes
-       end
-     end
-
-     class Bitstream
-       def initialize(io_system, file_handle, buffer_size)
-         # Initialize bitstream reader
-       end
-
-       def read_bits(num_bits)
-         # Read specified number of bits
-       end
-
-       def byte_align
-         # Align to byte boundary
-       end
-     end
-   end
- end
- ```
-
- ### 3. Huffman Decoding
-
- **Purpose**: Decode Huffman-encoded data streams
-
- **Files**:
- - `huffman/tree.rb` - Huffman tree construction
- - `huffman/decoder.rb` - Decoding logic
-
- **Design**:
- ```ruby
- module Cabriolet
-   module Huffman
-     class Tree
-       def initialize(lengths, num_symbols)
-         # Build Huffman tree from code lengths
-       end
-
-       def build_table(table_bits)
-         # Build fast decode table
-       end
-     end
-
-     class Decoder
-       def decode_symbol(bitstream, table)
-         # Decode one symbol from bitstream
-       end
-     end
-   end
- end
- ```
-
- ### 4. Data Models
-
- **Purpose**: Represent CAB file structures
-
- **Files**:
- - `models/cabinet.rb`
- - `models/folder.rb`
- - `models/file.rb`
-
- **Design** (Plain Ruby classes):
- ```ruby
- module Cabriolet
-   module Models
-     class Cabinet
-       attr_accessor :filename, :length, :set_id, :set_index, :flags
-       attr_accessor :folders, :files, :next_cabinet, :prev_cabinet
-       attr_accessor :base_offset, :header_resv, :prevname, :nextname
-       attr_accessor :previnfo, :nextinfo
-
-       def initialize
-         @folders = []
-         @files = []
-       end
-     end
-
-     class Folder
-       attr_accessor :comp_type, :num_blocks, :data_offset
-       attr_accessor :next, :data_cab, :merge_prev, :merge_next
-
-       def initialize
-         @data_cab = nil
-         @merge_prev = nil
-         @merge_next = nil
-       end
-     end
-
-     class File
-       attr_accessor :filename, :length, :offset, :folder
-       attr_accessor :attribs, :date, :time
-       attr_accessor :time_h, :time_m, :time_s
-       attr_accessor :date_d, :date_m, :date_y
-       attr_accessor :next
-
-       def initialize
-         @next = nil
-       end
-     end
-   end
- end
- ```
-
- ### 5. Decompressors
-
- **Purpose**: Implement compression algorithms
-
- **Base Class**:
- ```ruby
- module Cabriolet
-   module Decompressors
-     class Base
-       def initialize(io_system, input_handle, output_handle, buffer_size)
-         @io_system = io_system
-         @input = input_handle
-         @output = output_handle
-         @buffer_size = buffer_size
-       end
-
-       def decompress(bytes)
-         # Abstract method - implemented by subclasses
-         raise NotImplementedError
-       end
-     end
-   end
- end
- ```
-
- **Subclasses**:
-
- 1. **LZSS** (`decompressors/lzss.rb`):
-    - Window size: 4096 bytes
-    - Used by SZDD, KWAJ formats
-    - Simple sliding window compression
-
- 2. **MSZIP** (`decompressors/mszip.rb`):
-    - Deflate algorithm (RFC 1951)
-    - 32KB sliding window
-    - Huffman coding + LZ77
-
- 3. **LZX** (`decompressors/lzx.rb`):
-    - Window sizes: 32KB to 2MB
-    - Intel E8 preprocessing
-    - Multiple Huffman trees
-
- 4. **Quantum** (`decompressors/quantum.rb`):
-    - Proprietary format
-    - Complex algorithm
-    - Huffman coding + sliding window
-
- 5. **None** (`decompressors/none.rb`):
-    - Simple copy operation
-    - No decompression
-
- ### 6. CAB Format Support
-
- **Parser** (`cab/parser.rb`):
- ```ruby
- module Cabriolet
-   module CAB
-     class Parser
-       def initialize(io_system)
-         @io_system = io_system
-       end
-
-       def parse(filename)
-         # Parse CAB file headers
-         # Returns Cabinet model
-       end
-
-       private
-
-       def read_header(handle)
-         # Read CFHEADER structure
-       end
-
-       def read_folders(handle, count)
-         # Read CFFOLDER structures
-       end
-
-       def read_files(handle, count)
-         # Read CFFILE structures
-       end
-     end
-   end
- end
- ```
-
- **Decompressor** (`cab/decompressor.rb`):
- ```ruby
- module Cabriolet
-   module CAB
-     class Decompressor
-       def initialize(io_system = nil)
-         @io_system = io_system || System::IOSystem.new
-         @parser = Parser.new(@io_system)
-       end
-
-       def open(filename)
-         # Open and parse CAB file
-         @parser.parse(filename)
-       end
-
-       def extract(file, output_filename)
-         # Extract a single file
-       end
-
-       def extract_all(output_directory)
-         # Extract all files
-       end
-
-       private
-
-       def select_decompressor(comp_type)
-         # Select appropriate decompressor
-       end
-     end
-   end
- end
- ```
-
- **Extractor** (`cab/extractor.rb`):
- ```ruby
- module Cabriolet
-   module CAB
-     class Extractor
-       def initialize(cabinet, io_system)
-         @cabinet = cabinet
-         @io_system = io_system
-       end
-
-       def extract_file(file, output_path)
-         # Extract single file from cabinet
-       end
-     end
-   end
- end
- ```
-
- ### 7. CLI Tool
-
- **Design** (`cli.rb`):
- ```ruby
- require 'thor'
-
- module Cabriolet
-   class CLI < Thor
-     desc 'list FILE', 'List contents of CAB file'
-     def list(file)
-       # List all files in cabinet
-     end
-
-     desc 'extract FILE [OUTPUT_DIR]', 'Extract CAB file'
-     option :verbose, type: :boolean, aliases: '-v'
-     def extract(file, output_dir = '.')
-       # Extract files
-     end
-
-     desc 'info FILE', 'Show CAB file information'
-     def info(file)
-       # Show detailed cabinet info
-     end
-
-     desc 'test FILE', 'Test CAB file integrity'
-     def test(file)
-       # Test file integrity
-     end
-   end
- end
- ```
-
- ## Implementation Phases
-
- ### Phase 1: Foundation (Weeks 1-2)
-
- - [x] Project setup (Gemfile, gemspec, RSpec)
- - [ ] System abstraction layer
- - [ ] Binary I/O utilities
- - [ ] Bitstream reader
- - [ ] Basic error handling
-
- ### Phase 2: Format Support (Weeks 3-4)
-
- - [ ] CAB format constants
- - [ ] Data models with Lutaml::Model
- - [ ] CAB parser (headers, folders, files)
- - [ ] Cabinet search functionality
-
- ### Phase 3: Basic Decompression (Weeks 5-6)
-
- - [ ] Base decompressor class
- - [ ] None decompressor (uncompressed)
- - [ ] LZSS decompressor
- - [ ] Basic extraction workflow
-
- ### Phase 4: MSZIP Support (Weeks 7-8)
-
- - [ ] Huffman tree builder
- - [ ] Huffman decoder
- - [ ] MSZIP/Deflate decompressor
- - [ ] Integration with CAB extractor
-
- ### Phase 5: LZX Support (Weeks 9-11)
-
- - [ ] LZX constants and structures
- - [ ] LZX bitstream handling
- - [ ] LZX Huffman trees
- - [ ] Intel E8 transformation
- - [ ] LZX decompressor
-
- ### Phase 6: Quantum Support (Weeks 12-13)
-
- - [ ] Quantum algorithm research
- - [ ] Quantum decompressor
- - [ ] Special case handling
-
- ### Phase 7: Testing & Polish (Weeks 14-15)
-
- - [ ] Comprehensive test suite
- - [ ] Performance optimization
- - [ ] Documentation
- - [ ] CLI refinement
-
- ### Phase 8: Extended Formats (Future)
-
- - [ ] CHM (HTML Help) format support
- - [ ] LIT (eBook) format support
- - [ ] HLP (Help) format support
-
- ## CAB Format Specification
-
- ### File Structure
-
- ```
- ┌──────────────────────────────┐
- │ CFHEADER (36+ bytes) │ Cabinet header
- ├──────────────────────────────┤
- │ Reserved area (optional) │
- ├──────────────────────────────┤
- │ Previous cabinet name │ (if flags & 0x01)
- ├──────────────────────────────┤
- │ Next cabinet name │ (if flags & 0x02)
- ├──────────────────────────────┤
- │ CFFOLDER[1] (8+ bytes) │ Folder entries
- │ CFFOLDER[2] │
- │ ... │
- ├──────────────────────────────┤
- │ CFFILE[1] (16+ bytes) │ File entries
- │ CFFILE[2] │
- │ ... │
- ├──────────────────────────────┤
- │ CFDATA[1] (8+ bytes) │ Data blocks
- │ Compressed data[1] │
- │ CFDATA[2] │
- │ Compressed data[2] │
- │ ... │
- └──────────────────────────────┘
- ```
-
- ### CFHEADER Structure
-
- ```
- Offset  Size  Description
- ------  ----  -----------
- 0       4     Signature (0x4643534D = "MSCF")
- 4       4     Reserved
- 8       4     Cabinet file size
- 12      4     Reserved
- 16      4     Files offset
- 20      4     Reserved
- 24      1     Minor version
- 25      1     Major version
- 26      2     Number of folders
- 28      2     Number of files
- 30      2     Flags
- 32      2     Set ID
- 34      2     Cabinet index
- ```
-
- ### Compression Types
-
- | Type | Value | Description |
- |------|-------|-------------|
- | None | 0 | No compression |
- | MSZIP | 1 | MSZIP (deflate) |
- | Quantum | 2 | Quantum compression |
- | LZX | 3 | LZX compression |
-
- ## Testing Strategy
-
- ### Unit Tests
-
- - Each decompressor tested independently
- - Binary I/O utilities tested with known data
- - Huffman decoder tested with sample trees
- - Parser tested with valid/invalid CAB files
-
- ### Integration Tests
-
- - Full extraction of known CAB files
- - Multi-cabinet spanning tests
- - Error recovery tests
- - Performance benchmarks
-
- ### Test Data
-
- #### libmspack Test Fixtures
-
- Copy test files from libmspack to `spec/fixtures/libmspack/`:
-
- ```bash
- # Directory structure
- spec/fixtures/libmspack/
- ├── README.adoc  # License acknowledgment
- └── cabd/
-     ├── normal_2files_1folder.cab  # Basic CAB
-     ├── mszip_lzx_qtm.cab  # Multiple compression
-     ├── multi_basic_pt1.cab  # Multi-part cabinet
-     ├── multi_basic_pt2.cab
-     ├── cve-2010-2800-mszip-infinite-loop.cab  # Security test
-     └── ...
- ```
-
- #### Test Coverage Strategy
-
- Each RSpec file tests its corresponding class:
- - **Unit Tests**: Test each class in isolation with mocks/stubs
- - **Integration Tests**: Test component interactions
- - **End-to-End Tests**: Full extraction workflow with real CAB files
-
- **Example RSpec structure**:
- ```ruby
- # spec/decompressors/lzx_spec.rb
- RSpec.describe Cabriolet::Decompressors::LZX do
-   describe '#initialize' do
-     # Test initialization
-   end
-
-   describe '#decompress' do
-     context 'with valid LZX data' do
-       # Test decompression
-     end
-
-     context 'with corrupted data' do
-       # Test error handling
-     end
-   end
- end
- ```
-
- ## Error Handling
-
- ### Error Classes
-
- ```ruby
- module Cabriolet
-   class Error < StandardError; end
-
-   class IOError < Error; end
-   class ParseError < Error; end
-   class DecompressionError < Error; end
-   class ChecksumError < Error; end
-   class UnsupportedFormatError < Error; end
- end
- ```
-
- ### Error Strategy
-
- 1. **Graceful degradation**: Attempt partial extraction on errors
- 2. **Clear messages**: Provide actionable error information
- 3. **Salvage mode**: Optional parameter to skip errors and extract what's possible
- 4. **Validation**: Verify checksums and data integrity
-
- ## Performance Considerations
-
- 1. **Buffer Sizes**: Default 4KB buffers, configurable
- 2. **Memory Usage**: Stream-based processing, avoid loading entire files
- 3. **Lookup Tables**: Pre-computed Huffman decode tables
- 4. **Ruby Optimization**:
-    - Use byte arrays instead of strings where appropriate
-    - Minimize object allocation in hot paths
-    - Use bitwise operations efficiently
-
- ## Documentation
-
- ### README.adoc Structure
-
- ```asciidoc
- = Cabriolet
-
- Pure Ruby implementation of Microsoft CAB file extraction.
-
- == Features
-
- * Full CAB format support
- * Multiple compression algorithms
- * No C extensions required
- * CLI tool included
-
- == Installation
-
- == Usage
-
- === Library
-
- === Command Line
-
- == Architecture
-
- == Development
-
- == License
- ```
-
- ### Documentation
-
- See [`DOCUMENTATION_PLAN.md`](DOCUMENTATION_PLAN.md:1) for complete documentation architecture.
-
- **Documentation Structure**:
- - `docs/getting-started/` - Installation, quick start, first extraction
- - `docs/user-guide/` - Basic usage, advanced usage, CLI/API reference
- - `docs/formats/` - CAB format, compression algorithms (MSZIP, LZX, Quantum, LZSS)
- - `docs/technical/` - Architecture, system abstraction, binary I/O, Huffman coding
- - `docs/developer/` - Contributing, code style, testing, extending
- - `docs/appendix/` - Glossary, CAB spec, troubleshooting, FAQ
-
- **Standard Document Format**:
- Every document follows: Purpose → References → Concepts → Body → Bibliography
-
- **Cross-Cutting Documentation**:
- - Common options shared between CLI and API documented once
- - Each compression format gets detailed explanation
- - Progressive disclosure: basic → intermediate → advanced
-
- ## Dependencies
-
- ### Runtime
-
- - `bindata` (~> 2.5) - For binary data structures
- - `thor` (~> 1.3) - For CLI
-
- ### Development
-
- - `rspec` - Testing framework
- - `rake` - Build tool
- - `rubocop` - Code style
- - `yard` - Documentation
-
- ## Licensing
-
- ### Cabriolet License
-
- **BSD 3-Clause License**
-
- The Cabriolet gem itself is released under the BSD 3-Clause License, allowing:
- - Commercial use
- - Modification
- - Distribution
- - Private use
-
- With conditions:
- - License and copyright notice must be included
- - No liability or warranty
-
- ### Test Fixtures License
-
- The test fixtures in `spec/fixtures/libmspack/` are from the libmspack project and remain under the **LGPL 2.1** license. These are used solely for testing and validation purposes and are not distributed as part of the gem's runtime code.
-
- A `spec/fixtures/libmspack/README.adoc` file will acknowledge:
- - Copyright by Stuart Caie and libmspack contributors
- - LGPL 2.1 licensing of test files
- - Gratitude to the libmspack project for excellent test coverage
-
- ### Implementation Notes
-
- This is a clean-room implementation based on:
- 1. Public CAB file format specifications (Microsoft documentation)
- 2. Algorithm specifications (LZX, deflate/RFC 1951, etc.)
- 3. Test-driven development using publicly available test files
-
- The implementation does not copy code from libmspack but reimplements the algorithms in Ruby based on specifications and format documentation.
-
- ## Success Criteria
-
- 1. Successfully extract all test CAB files from libmspack test suite
- 2. Handle all compression methods (MSZIP, LZX, Quantum, LZSS, None)
- 3. Support multi-part cabinet sets
- 4. Achieve reasonable performance (within 3-5x of native C implementation)
- 5. Zero C extension dependencies
- 6. Comprehensive test coverage (>90%)
- 7. Well-documented API and CLI
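The CFHEADER table in the removed ARCHITECTURE.md lists fixed little-endian fields at offsets 0 through 35. As a rough sketch of how that layout could be decoded with Ruby's built-in `String#unpack`, the snippet below uses a helper named `parse_cfheader` and a hash of result keys. Both are hypothetical illustrations and are not taken from the cabriolet API.

```ruby
# frozen_string_literal: true

# Hypothetical helper for illustration; not part of the cabriolet API.
CFHEADER_SIZE = 36

def parse_cfheader(bytes)
  raise ArgumentError, "CFHEADER needs #{CFHEADER_SIZE} bytes" if bytes.bytesize < CFHEADER_SIZE

  # a4 = 4 raw bytes, V = 32-bit little-endian, C = 8-bit, v = 16-bit little-endian
  signature, _reserved1, cabinet_size, _reserved2, files_offset, _reserved3,
    minor, major, folders, files, flags, set_id, cabinet_index =
    bytes.unpack("a4 V5 C2 v5")
  raise ArgumentError, "not a cabinet: bad signature" unless signature == "MSCF"

  {
    cabinet_size: cabinet_size,
    files_offset: files_offset,
    version: "#{major}.#{minor}",
    folders: folders,
    files: files,
    has_prev: (flags & 0x01) != 0, # previous cabinet name follows the header
    has_next: (flags & 0x02) != 0, # next cabinet name follows the header
    set_id: set_id,
    cabinet_index: cabinet_index
  }
end

# Build a synthetic 36-byte header with the same template and decode it.
sample = ["MSCF", 0, 1024, 0, 44, 0, 3, 1, 1, 2, 0x0002, 0x1234, 0].pack("a4 V5 C2 v5")
header = parse_cfheader(sample)
puts header[:version]
```

Using one `unpack` template keeps the field order visibly aligned with the offset table, at the cost of reading the whole fixed header into memory first.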
data/CHANGELOG.md DELETED
@@ -1,44 +0,0 @@
- # Changelog
-
- All notable changes to this project will be documented in this file.
-
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-
- ## [Unreleased]
-
- ### Added
-
- - Initial project structure and architecture
- - System abstraction layer (IOSystem, FileHandle, MemoryHandle)
- - Binary I/O utilities (BinData structures, Bitstream)
- - Domain models (Cabinet, Folder, File)
- - Decompressor base classes and stubs
- - CAB parser and extractor framework
- - CLI tool with Thor
- - Comprehensive documentation plan
- - Test infrastructure with RSpec
-
- ### Changed
-
- - Nothing yet
-
- ### Deprecated
-
- - Nothing yet
-
- ### Removed
-
- - Nothing yet
-
- ### Fixed
-
- - Nothing yet
-
- ### Security
-
- - Nothing yet
-
- ## [0.1.0] - TBD
-
- - Initial release (planned)