bindata 0.11.0 → 0.11.1

data/ChangeLog CHANGED
@@ -1,5 +1,11 @@
1
1
  = BinData Changelog
2
2
 
3
+ == Version 0.11.1 (2009-08-29)
4
+
5
+ * Allow wrapped types to work with struct's :onlyif parameter
6
+ * Use Array#index instead of #find_index for compatibility with ruby 1.8.6
7
+ (patch courtesy of Joe Rozner).
8
+
3
9
  == Version 0.11.0 (2009-06-28)
4
10
 
5
11
  * Sanitizing code was refactored for speed.
data/TODO CHANGED
@@ -1,5 +1,22 @@
1
+ == Documentation
2
+
1
3
  * Write a detailed tutorial (probably as a web page).
2
4
 
3
5
  * Need more examples.
4
6
 
7
+ == Pending refactorings
8
+
9
+ * Refactor registry into registry and numeric registry
10
+ update specs accordingly
11
+
5
12
  * Perhaps refactor snapshot -> _snapshot et al
13
+
14
+ == Bugs
15
+
16
+ class A < BinData::Record
17
+ array :a, :type => :unknown
18
+ end
19
+
20
+ should give correct error message
21
+
22
+
data/TUTORIAL ADDED
@@ -0,0 +1,949 @@
1
+ = BinData
2
+
3
+ A declarative way to read and write structured binary data.
4
+
5
+ == What is it for?
6
+
7
+ Do you ever find yourself writing code like this?
8
+
9
+ io = File.open(...)
10
+ len = io.read(2).unpack("v")[0]
11
+ name = io.read(len)
12
+ width, height = io.read(8).unpack("VV")
13
+ puts "Rectangle #{name} is #{width} x #{height}"
14
+
15
+ It's ugly, violates DRY and feels like you're writing Perl, not Ruby.
16
+
17
+ There is a better way.
18
+
19
+ class Rectangle < BinData::Record
20
+ endian :little
21
+ uint16 :len
22
+ string :name, :read_length => :len
23
+ uint32 :width
24
+ uint32 :height
25
+ end
26
+
27
+ io = File.open(...)
28
+ r = Rectangle.read(io)
29
+ puts "Rectangle #{r.name} is #{r.width} x #{r.height}"
30
+
31
+ BinData makes it easy to specify the structure of the data you are
32
+ manipulating.
33
+
34
+ Read on for the tutorial, or go straight to the
35
+ download[http://rubyforge.org/frs/?group_id=3252] page.
36
+
37
+ == License
38
+
39
+ BinData is released under the same license as Ruby.
40
+
41
+ Copyright (c) 2007 - 2009 Dion Mendel
42
+
43
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
44
+
45
+ == Overview
46
+
47
+ BinData declarations are easy to read. Here's an example.
48
+
49
+ class MyFancyFormat < BinData::Record
50
+ stringz :comment
51
+ uint8 :num_ints, :check_value => lambda { value.even? }
52
+ array :some_ints, :type => :int32be, :initial_length => :num_ints
53
+ end
54
+
55
+ This fancy format describes the following collection of data:
56
+
57
+ 1. A zero terminated string
58
+ 2. An unsigned 8bit integer which must be even
59
+ 3. A sequence of signed 32bit integers in big endian form, the total
60
+ number of which is determined by the value of the 8bit integer.
61
+
62
+ The BinData declaration matches the english description closely. Compare
63
+ the above declaration with the equivalent #unpack code to read such a data
64
+ record.
65
+
66
+ def read_fancy_format(io)
67
+ comment, num_ints, rest = io.read.unpack("Z*Ca*")
68
+ raise ArgumentError, "ints must be even" unless num_ints.even?
69
+ some_ints = rest.unpack("N#{num_ints}")
70
+ {:comment => comment, :num_ints => num_ints, :some_ints => some_ints}
71
+ end
72
+
73
+ The BinData declaration clearly shows the structure of the record. The
74
+ #unpack code makes this structure opaque.
75
+
76
+ The general usage of BinData is to declare a structured collection of data
77
+ as a user defined record. This record can be instantiated, read, written
78
+ and manipulated without the user having to be concerned with the underlying
79
+ binary representation of the data.
80
+
81
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
82
+
83
+ == Primitive Types
84
+
85
+ BinData provides support for the most commonly used primitive types that
86
+ are used when working with binary data. Namely:
87
+
88
+ * fixed size strings
89
+ * zero terminated strings
90
+ * byte based integers - signed or unsigned, big or little endian and of
91
+ any size
92
+ * bit based integers - unsigned big or little endian integers of any size
93
+ * floating point numbers - single or double precision floats in either
94
+ big or little endian
95
+
96
+ Primitives may be manipulated individually, but it is more common to work
97
+ with them as part of a record.
98
+
99
+ Examples of individual usage:
100
+
101
+ int16 = BinData::Int16be.new
102
+ int16.value = 941
103
+ int16.to_binary_s #=> "\003\255"
104
+
105
+ fl = BinData::FloatBe.read("\100\055\370\124") #=> 2.71828174591064
106
+ fl.num_bytes #=> 4
107
+
108
+ fl * int16 #=> 2557.90320057996
109
+
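For comparison, the byte values in the examples above can be reproduced without BinData. The following is a minimal sketch in plain Ruby; the helper names +int16be_bytes+ and +float_be_value+ are hypothetical and only illustrate what the binary representation contains.

```ruby
# Plain Ruby, no BinData required. Both helper names are illustrative only.

# Signed 16 bit integer, big endian ("s>" in Array#pack notation).
def int16be_bytes(value)
  [value].pack("s>")
end

# Single precision float, big endian ("g" in String#unpack notation).
def float_be_value(bytes)
  bytes.unpack1("g")
end

int16be_bytes(941).bytes               # two bytes: 0x03, 0xAD ("\003\255")
float_be_value("\100\055\370\124".b)   # approximately 2.71828
```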
110
+ There are several parameters that are specific to primitives.
111
+
112
+ :initial_value
113
+
114
+ This contains the initial value that the primitive will contain after
115
+ initialization. This is useful for setting default values.
116
+
117
+ obj = BinData::String.new(:initial_value => "hello ")
118
+ obj + "world" #=> "hello world"
119
+
120
+ obj.assign("good-bye " )
121
+ obj + "world" #=> "good-bye world"
122
+
123
+ :value
124
+
125
+ The primitive will always contain this value. Reading or assigning will
126
+ not change the value. This parameter is used to define constants or
127
+ dependent fields.
128
+
129
+ pi = BinData::FloatLe.new(:value => Math::PI)
130
+ pi.assign(3)
131
+ puts pi #=> 3.14159265358979
132
+
133
+ :check_value
134
+
135
+ When reading, will raise a ValidityError if the value read does not match
136
+ the value of this parameter.
137
+
138
+ obj = BinData::String.new(:check_value => lambda { /aaa/ =~ value })
139
+ obj.read("baaa!") #=> "baaa!"
140
+ obj.read("bbb") #=> raises ValidityError
141
+
142
+ obj = BinData::String.new(:check_value => "foo")
143
+ obj.read("foo") #=> "foo"
144
+ obj.read("bar") #=> raises ValidityError
145
+
146
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
147
+
148
+ == Common Operations
149
+
150
+ There are operations common to all BinData types, including user defined
151
+ ones. They are summarised here.
152
+
153
+ Reading and writing
154
+
155
+ ::read(io)
156
+
157
+ Creates a BinData object and reads its value from the given string or IO.
158
+ The newly created object is returned.
159
+
160
+ str = BinData::Stringz::read("string1\0string2")
161
+ str.snapshot #=> "string1"
162
+
163
+ #read(io)
164
+
165
+ Reads and assigns binary data read from io.
166
+
167
+ obj = BinData::Uint16be.new
168
+ obj.read("\022\064")
169
+ obj.value #=> 4660
170
+
171
+ #write(io)
172
+
173
+ Writes the binary representation of the object to io.
174
+
175
+ File.open("...", "wb") do |io|
176
+ obj = BinData::Uint64be.new
177
+ obj.value = 568290145640170
178
+ obj.write(io)
179
+ end
180
+
181
+ #to_binary_s
182
+
183
+ Returns the binary representation of this object as a string.
184
+
185
+ obj = BinData::Uint16be.new
186
+ obj.assign(4660)
187
+ obj.to_binary_s #=> "\022\064"
188
+
189
+ Manipulating
190
+
191
+ #assign(value)
192
+
193
+ Assigns the given value to this object. value can be of the same format
194
+ as produced by #snapshot, or it can be a compatible data object.
195
+
196
+ arr = BinData::Array.new(:type => :uint8)
197
+ arr.assign([1, 2, 3, 4])
198
+ arr.snapshot #=> [1, 2, 3, 4]
199
+
200
+ #clear
201
+
202
+ Resets this object to its initial state.
203
+
204
+ obj = BinData::Int32be.new(:initial_value => 42)
205
+ obj.assign(50)
206
+ obj.clear
207
+ obj.value #=> 42
208
+
209
+ #clear?
210
+
211
+ Returns whether this object is in its initial state.
212
+
213
+ arr = BinData::Array.new(:type => :uint16be, :initial_length => 5)
214
+ arr[3] = 42
215
+ arr.clear? #=> false
216
+
217
+ arr[3].clear
218
+ arr.clear? #=> true
219
+
220
+ Inspecting
221
+
222
+ #num_bytes
223
+
224
+ Returns the number of bytes required for the binary representation of
225
+ this object.
226
+
227
+ arr = BinData::Array.new(:type => :uint16be, :initial_length => 5)
228
+ arr[0].num_bytes #=> 2
229
+ arr.num_bytes #=> 10
230
+
231
+ #snapshot
232
+
233
+ Returns the value of this object as primitive Ruby objects (numerics,
234
+ strings, arrays and hashes). This may be useful for serialization or
235
+ reducing memory usage.
236
+
237
+ obj = BinData::Uint8.new
238
+ obj.assign(3)
239
+ obj + 3 #=> 6
240
+
241
+ obj.snapshot #=> 3
242
+ obj.snapshot.class #=> Fixnum
243
+
244
+ #offset
245
+
246
+ Returns the offset of this object with respect to the parent structure
247
+ it is contained within. This is most likely to be used with arrays and
248
+ records.
249
+
250
+ arr = BinData::Array.new(:type => :uint16le, :initial_length => 5)
251
+ arr[2].offset #=> 4
252
+
253
+ #inspect
254
+
255
+ Returns a human readable representation of this object. This is a
256
+ shortcut to #snapshot.inspect.
257
+
258
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
259
+
260
+ == Records
261
+
262
+ The general format of a BinData record declaration is a class containing
263
+ one or more fields.
264
+
265
+ class MyName < BinData::Record
266
+ type field_name, :param1 => "foo", :param2 => bar, ...
267
+ ...
268
+ end
269
+
270
+ *type* is the name of a supplied type (e.g. +uint32be+, +string+, +array+)
271
+ or a user defined type. For user defined types, the class name is
272
+ converted from CamelCase to lowercase underscore_style.
273
+
274
+ *field_name* is the name by which you can access the data. Use either a
275
+ String or a Symbol.
276
+
277
+ Each field may have optional *parameters* for how to process the data. The
278
+ parameters are passed as a Hash with Symbols for keys. Parameters are
279
+ designed to be lazily evaluated, possibly multiple times. This means that any
280
+ parameter value must not have side effects.
281
+
282
+ Here are some examples of legal values for parameters.
283
+
284
+ * :param => 5
285
+ * :param => lambda { 5 + 2 }
286
+ * :param => lambda { foo + 2 }
287
+ * :param => :foo
288
+
289
+ The simplest case is when the value is a literal value, such as 5.
290
+
291
+ If the value is not a literal, it is expected to be a lambda. The lambda
292
+ will be evaluated in the context of the parent, in this case the parent is
293
+ an instance of +MyName+.
294
+
295
+ If the value is a symbol, it is taken as syntactic sugar for a lambda
296
+ containing the value of the symbol.
297
+ e.g. <tt>:param => :foo</tt> is equivalent to <tt>:param => lambda { foo }</tt>
298
+
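The three kinds of parameter value described above (literal, lambda, symbol) can be pictured with a small plain-Ruby sketch. This is not BinData's implementation; +eval_parameter+ and +ParamDemoParent+ are hypothetical names, and the sketch only assumes that a lambda is evaluated with the parent as +self+.

```ruby
# A stand-in for the parent record; it only needs to respond to #foo.
ParamDemoParent = Struct.new(:foo)

# Hypothetical helper showing one way a parameter could be resolved lazily.
def eval_parameter(param, parent)
  case param
  when Proc   then parent.instance_exec(&param)  # lambda runs with parent as self
  when Symbol then parent.public_send(param)     # :foo behaves like lambda { foo }
  else param                                     # literal values are used as is
  end
end

parent = ParamDemoParent.new(3)
eval_parameter(5, parent)                    # literal
eval_parameter(lambda { foo + 2 }, parent)   # foo is looked up on the parent
eval_parameter(:foo, parent)                 # symbol shorthand
```

Because the value is computed each time it is requested, a lambda with side effects would run those side effects an unpredictable number of times.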
299
+ === Specifying default endian
300
+
301
+ The endianness of numeric types must be explicitly defined so that the code
302
+ produced is independent of architecture. However, explicitly specifying
303
+ the endian for each numeric field can result in a bloated declaration that
304
+ can be difficult to read.
305
+
306
+ class A < BinData::Record
307
+ int16be :a
308
+ int32be :b
309
+ int16le :c # <-- Note little endian!
310
+ int32be :d
311
+ float_be :e
312
+ array :f, :type => :uint32be
313
+ end
314
+
315
+ The endian keyword can be used to set the default endian. This makes the
316
+ declaration easier to read. Any numeric field that doesn't use the default
317
+ endian can explicitly override it.
318
+
319
+ class A < BinData::Record
320
+ endian :big
321
+
322
+ int16 :a
323
+ int32 :b
324
+ int16le :c # <-- Note how this little endian now stands out
325
+ int32 :d
326
+ float :e
327
+ array :f, :type => :uint32
328
+ end
329
+
330
+ The increase in clarity can be seen with the above example. The endian
331
+ keyword will cascade to nested types, as illustrated with the array in the
332
+ above example.
333
+
334
+ === Optional fields
335
+
336
+ A record may contain optional fields. The optional state of a field is decided
337
+ by the :onlyif parameter. If the value of this parameter is false, then the
338
+ field will be as if it didn't exist in the record.
339
+
340
+ class RecordWithOptionalField < BinData::Record
341
+ ...
342
+ uint8 :comment_flag
343
+ string :comment, :length => 20, :onlyif => :has_comment?
344
+
345
+ def has_comment?
346
+ comment_flag.nonzero?
347
+ end
348
+ end
349
+
350
+ In the above example, the comment field is only included in the record if the
351
+ value of the comment_flag field is non zero.
352
+
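To make the effect of :onlyif concrete, here is a hand-written reader for the same layout in plain Ruby. It is a sketch only: +read_optional_comment+ is a hypothetical helper, and it assumes the elided fields before +comment_flag+ are absent.

```ruby
require "stringio"

# Hypothetical helper, not part of BinData: reads a flag byte, then the
# 20 byte comment only when the flag is non zero.
def read_optional_comment(io)
  comment_flag = io.read(1).unpack1("C")
  comment = comment_flag.zero? ? nil : io.read(20)
  { :comment_flag => comment_flag, :comment => comment }
end

read_optional_comment(StringIO.new("\x01" + "a" * 20))  # comment is read
read_optional_comment(StringIO.new("\x00"))             # comment is skipped
```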
353
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
354
+
355
+ == Handling dependencies between fields
356
+
357
+ A common occurrence in binary file formats is one field depending upon the
358
+ value of another, e.g. a string preceded by its length.
359
+
360
+ As an example, let's assume a Pascal style string where the byte preceding
361
+ the string contains the string's length.
362
+
363
+ # reading
364
+ io = File.open(...)
365
+ len = io.getc
366
+ str = io.read(len)
367
+ puts "string is " + str
368
+
369
+ # writing
370
+ io = File.open(...)
371
+ str = "this is a string"
372
+ io.putc(str.length)
373
+ io.write(str)
374
+
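The hand-written version above relies on Ruby 1.8, where IO#getc returns an integer. On Ruby 1.9 and later IO#getc returns a one character String, so a present-day equivalent would likely use IO#readbyte instead. The sketch below uses StringIO in place of a file; both helper names are hypothetical.

```ruby
require "stringio"

# Writes the length byte followed by the string's bytes.
# Assumes the string is at most 255 bytes long.
def write_pascal_string(io, str)
  io.putc(str.bytesize)
  io.write(str)
end

# Reads the length byte, then that many bytes of string data.
def read_pascal_string(io)
  len = io.readbyte
  io.read(len)
end

io = StringIO.new(String.new)
write_pascal_string(io, "this is a string")
io.rewind
read_pascal_string(io)
```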
375
+ Here's how we'd implement the same example with BinData.
376
+
377
+ class PascalString < BinData::Record
378
+ uint8 :len, :value => lambda { data.length }
379
+ string :data, :read_length => :len
380
+ end
381
+
382
+ # reading
383
+ io = File.open(...)
384
+ ps = PascalString.new
385
+ ps.read(io)
386
+ puts "string is " + ps.data
387
+
388
+ # writing
389
+ io = File.open(...)
390
+ ps = PascalString.new
391
+ ps.data = "this is a string"
392
+ ps.write(io)
393
+
394
+ This syntax needs explaining. Let's simplify by examining reading and
395
+ writing separately.
396
+
397
+ class PascalStringReader < BinData::Record
398
+ uint8 :len
399
+ string :data, :read_length => :len
400
+ end
401
+
402
+ This states that when reading the string, the initial length of the string
403
+ (and hence the number of bytes to read) is determined by the value of the
404
+ +len+ field.
405
+
406
+ Note that <tt>:read_length => :len</tt> is syntactic sugar for
407
+ <tt>:read_length => lambda { len }</tt>, as described in the Record section.
408
+
409
+ class PascalStringWriter < BinData::Record
410
+ uint8 :len, :value => lambda { data.length }
411
+ string :data
412
+ end
413
+
414
+ This states that the value of +len+ is always equal to the length of +data+.
415
+ +len+ may not be manually modified.
416
+
417
+ Combining these two definitions gives the definition for +PascalString+ as
418
+ previously defined.
419
+
420
+ It is important to note with dependencies, that a field can only depend on one
421
+ before it. You can't have a string which has the characters first and the
422
+ length afterwards.
423
+
424
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
425
+
426
+ == Numerics
427
+
428
+ There are three kinds of numeric types that are supported by BinData.
429
+
430
+ === Byte based integers
431
+
432
+ These are the common integers that are used in most low level programming
433
+ languages (C, C++, Java etc). These integers can be signed or unsigned.
434
+ The endian must be specified so that the conversion is independent of
435
+ architecture. The bit size of these integers must be a multiple of 8.
436
+ Examples of byte based integers are:
437
+
438
+ * uint16be - unsigned 16 bit big endian integer
439
+ * int8 - signed 8 bit integer
440
+ * int32le - signed 32 bit little endian integer
441
+ * uint40be - unsigned 40 bit big endian integer
442
+
443
+ The be | le suffix may be omitted if the endian keyword is in use.
444
+
445
+ === Bit based integers
446
+
447
+ These unsigned integers are used to define bitfields in records. Bitfields
448
+ are big endian by default but little endian may be specified explicitly.
449
+ Little endian bitfields are rare, but do occur in older file formats
450
+ (e.g. The file allocation table for FAT12 filesystems is stored as an
451
+ array of 12bit little endian integers).
452
+
453
+ An array of bit based integers will be packed according to their endian.
454
+
455
+ In a record, adjacent bitfields will be packed according to their endian.
456
+ All other fields are byte aligned.
457
+
458
+ Examples of bit based integers are:
459
+
460
+ * bit1 - 1 bit big endian integer (may be used as boolean)
461
+ * bit4_le - 4 bit little endian integer
462
+ * bit32 - 32 bit big endian integer
463
+
464
+ The difference between byte and bit based integers of the same number of
465
+ bits (e.g. uint8 vs bit8) is one of alignment.
466
+
467
+ This example is packed as 3 bytes
468
+
469
+ class A < BinData::Record
470
+ bit4 :a
471
+ uint8 :b
472
+ bit4 :c
473
+ end
474
+
475
+ Data is stored as: AAAA0000 BBBBBBBB CCCC0000
476
+
477
+ Whereas this example is packed into only 2 bytes
478
+
479
+ class B < BinData::Record
480
+ bit4 :a
481
+ bit8 :b
482
+ bit4 :c
483
+ end
484
+
485
+ Data is stored as: AAAABBBB BBBBCCCC
486
+
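The two layouts can be reproduced by hand to check the byte counts. The following plain-Ruby sketch covers only the big endian case and assumes every value is non negative and fits in its bit width; +pack_bit_run+, +pack_record_a+ and +pack_record_b+ are hypothetical helpers, not BinData methods.

```ruby
# Packs adjacent [value, bit_width] pairs most significant bit first and
# pads the final byte with zero bits.
def pack_bit_run(fields)
  bits = fields.map { |value, width| value.to_s(2).rjust(width, "0") }.join
  padded_length = (bits.length + 7) / 8 * 8
  [bits.ljust(padded_length, "0")].pack("B*")
end

# bit4, uint8, bit4: the uint8 is byte aligned, so each bit4 ends its own run.
def pack_record_a(a, b, c)
  pack_bit_run([[a, 4]]) + [b].pack("C") + pack_bit_run([[c, 4]])
end

# bit4, bit8, bit4: all three are adjacent bitfields and share bytes.
def pack_record_b(a, b, c)
  pack_bit_run([[a, 4], [b, 8], [c, 4]])
end

pack_record_a(0b1010, 0xFF, 0b0101).bytesize   # 3 bytes: 0xA0 0xFF 0x50
pack_record_b(0b1010, 0xFF, 0b0101).bytesize   # 2 bytes: 0xAF 0xF5
```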
487
+ === Floating point numbers
488
+
489
+ BinData supports 32 and 64 bit floating point numbers, in both big and
490
+ little endian format. These types are:
491
+
492
+ * float_le - single precision 32 bit little endian float
493
+ * float_be - single precision 32 bit big endian float
494
+ * double_le - double precision 64 bit little endian float
495
+ * double_be - double precision 64 bit big endian float
496
+
497
+ The _be | _le suffix may be omitted if the endian keyword is in use.
498
+
499
+ == Example
500
+
501
+ Here is an example declaration for an Internet Protocol network packet.
502
+ Three of the fields have parameters.
503
+
504
+ * The version field always has the value 4, as per the standard.
505
+ * The options field is read as a raw string, but not processed.
506
+ * The data field contains the payload of the packet. Its length is
507
+ calculated as the total length of the packet minus the length of the
508
+ header.
509
+
510
+ class IP_PDU < BinData::Record
511
+ endian :big
512
+
513
+ bit4 :version, :value => 4
514
+ bit4 :header_length
515
+ uint8 :tos
516
+ uint16 :total_length
517
+ uint16 :ident
518
+ bit3 :flags
519
+ bit13 :frag_offset
520
+ uint8 :ttl
521
+ uint8 :protocol
522
+ uint16 :checksum
523
+ uint32 :src_addr
524
+ uint32 :dest_addr
525
+ string :options, :read_length => :options_length_in_bytes
526
+ string :data, :read_length => lambda { total_length - header_length_in_bytes }
527
+
528
+ def header_length_in_bytes
529
+ header_length * 4
530
+ end
531
+
532
+ def options_length_in_bytes
533
+ header_length_in_bytes - 20
534
+ end
535
+ end
536
+
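The length arithmetic in this declaration can be checked by hand. The sketch below, in plain Ruby, decodes only the first four bytes of a header; +ip_lengths+ is a hypothetical helper and the sample bytes are made up for illustration.

```ruby
# Decodes version, header length and total length from the first four bytes
# of an IPv4 header and derives the same lengths as the record above.
def ip_lengths(header_start)
  version_and_ihl, _tos, total_length = header_start.unpack("CCn")
  header_bytes = (version_and_ihl & 0x0f) * 4   # header_length counts 32 bit words
  {
    :version                 => version_and_ihl >> 4,
    :header_length_in_bytes  => header_bytes,
    :options_length_in_bytes => header_bytes - 20,  # fixed part of the header is 20 bytes
    :data_length             => total_length - header_bytes
  }
end

# Made-up sample: version 4, header length 6 words, total length 40 bytes.
ip_lengths([0x46, 0x00, 40].pack("CCn"))
```

With these sample values the header is 24 bytes, of which 4 are options, leaving 16 bytes of payload.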
537
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
538
+
539
+ == Strings
540
+
541
+ BinData supports two types of strings - fixed size and zero terminated.
542
+ Strings are treated as a sequence of 8bit bytes. This is the same as
543
+ strings in Ruby 1.8. The issue of character encoding is ignored by
544
+ BinData.
545
+
546
+ === Fixed Sized Strings
547
+
548
+ Fixed sized strings may have a set length. If an assigned value is shorter
549
+ than this length, it will be padded to this length. If no length is set,
550
+ the length is taken to be the length of the assigned value.
551
+
552
+ There are several parameters that are specific to fixed sized strings.
553
+
554
+ :read_length
555
+
556
+ The length to use when reading a value.
557
+
558
+ obj = BinData::String.new(:read_length => 5)
559
+ obj.read("abcdefghij")
560
+ obj.value #=> "abcde"
561
+
562
+ :length
563
+
564
+ The fixed length of the string. If a shorter string is set, it will be
565
+ padded to this length. Longer strings will be truncated.
566
+
567
+ obj = BinData::String.new(:length => 6)
568
+ obj.read("abcdefghij")
569
+ obj.value #=> "abcdef"
570
+
571
+ obj = BinData::String.new(:length => 6)
572
+ obj.value = "abcd"
573
+ obj.value #=> "abcd\000\000"
574
+
575
+ obj = BinData::String.new(:length => 6)
576
+ obj.value = "abcdefghij"
577
+ obj.value #=> "abcdef"
578
+
579
+ :pad_char
580
+
581
+ The character to use when padding a string to a set length. Valid values
582
+ are Integers and Strings of length 1. "\0" is the default.
583
+
584
+ obj = BinData::String.new(:length => 6, :pad_char => 'A')
585
+ obj.value = "abcd"
586
+ obj.value #=> "abcdAA"
587
+ obj.to_binary_s #=> "abcdAA"
588
+
589
+ :trim_padding
590
+
591
+ Boolean, default false. If set, the value of this string will have all
592
+ pad_chars trimmed from the end of the string. The value will not be
593
+ trimmed when writing.
594
+
595
+ obj = BinData::String.new(:length => 6, :trim_padding => true)
596
+ obj.value = "abcd"
597
+ obj.value #=> "abcd"
598
+ obj.to_binary_s #=> "abcd\000\000"
599
+
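The padding, truncation and trimming rules above amount to a few string operations. Here is a minimal plain-Ruby sketch of those rules; +fixed_length_value+ and +trim_padding+ are hypothetical helpers and assume +pad_char+ is a String of length 1.

```ruby
# Truncates to +length+ and pads shorter strings with +pad_char+.
def fixed_length_value(str, length, pad_char = "\0")
  str[0, length].ljust(length, pad_char)
end

# Removes every trailing +pad_char+ without modifying the original string.
def trim_padding(str, pad_char = "\0")
  str = str[0...-1] while str.end_with?(pad_char)
  str
end

fixed_length_value("abcd", 6)                   # padded with "\0"
fixed_length_value("abcdefghij", 6)             # truncated
fixed_length_value("abcd", 6, "A")              # padded with "A"
trim_padding(fixed_length_value("abcd", 6))     # padding removed again
```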
600
+ === Zero Terminated Strings
601
+
602
+ These strings are modelled on the C style of string - a sequence of
603
+ bytes terminated by a null ("\0") character.
604
+
605
+ obj = BinData::Stringz.new
606
+ obj.read("abcd\000efgh")
607
+ obj.value #=> "abcd"
608
+ obj.num_bytes #=> 5
609
+ obj.to_binary_s #=> "abcd\000"
610
+
611
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
612
+
613
+ == Arrays
614
+
615
+ A BinData array is a list of data objects of the same type. It behaves
616
+ much the same as the standard Ruby array, supporting most of the common
617
+ methods.
618
+
619
+ When instantiating an array, the type of object it contains must be
620
+ specified.
621
+
622
+ arr = BinData::Array.new(:type => :uint8)
623
+ arr[3] = 5
624
+ arr.snapshot #=> [0, 0, 0, 5]
625
+
626
+ Parameters can be passed to the contained type with a slightly clumsy syntax.
627
+
628
+ arr = BinData::Array.new(:type => [:uint8, {:initial_value => :index}])
629
+ arr[3] = 5
630
+ arr.snapshot #=> [0, 1, 2, 5]
631
+
632
+ There are two different parameters that specify the length of the array.
633
+
634
+ :initial_length
635
+
636
+ obj = BinData::Array.new(:type => :int8, :initial_length => 4)
637
+ obj.read("\002\003\004\005\006\007")
638
+ obj.snapshot #=> [2, 3, 4, 5]
639
+
640
+ :read_until
641
+
642
+ While reading, elements are read until this condition is true. This is
643
+ typically used to read an array until a sentinel value is found. The
644
+ variables +index+, +element+ and +array+ are made available to any lambda
645
+ assigned to this parameter. If the value of this parameter is the symbol
646
+ :eof, then the array will read as much data from the stream as possible.
647
+
648
+ obj = BinData::Array.new(:type => :int8,
649
+ :read_until => lambda { index == 1 })
650
+ obj.read("\002\003\004\005\006\007")
651
+ obj.snapshot #=> [2, 3]
652
+
653
+ obj = BinData::Array.new(:type => :int8,
654
+ :read_until => lambda { element >= 3.5 })
655
+ obj.read("\002\003\004\005\006\007")
656
+ obj.snapshot #=> [2, 3, 4]
657
+
658
+ obj = BinData::Array.new(:type => :int8,
659
+ :read_until => lambda { array[index] + array[index - 1] == 9 })
660
+ obj.read("\002\003\004\005\006\007")
661
+ obj.snapshot #=> [2, 3, 4, 5]
662
+
663
+ obj = BinData::Array.new(:type => :int8, :read_until => :eof)
664
+ obj.read("\002\003\004\005\006\007")
665
+ obj.snapshot #=> [2, 3, 4, 5, 6, 7]
666
+
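The :read_until examples can be traced with a hand-written loop. This plain-Ruby sketch reads signed 8 bit integers and yields +index+, +element+ and +array+ to a block after each element; +read_int8_until+ is a hypothetical helper, and unlike a real reader it simply stops at end of input.

```ruby
require "stringio"

# Reads int8 values until the block returns true or the input is exhausted.
def read_int8_until(io)
  array = []
  until io.eof?
    element = io.read(1).unpack1("c")
    array << element
    index = array.length - 1
    break if yield(index, element, array)
  end
  array
end

data = "\002\003\004\005\006\007"
read_int8_until(StringIO.new(data)) { |index, element, array| index == 1 }
read_int8_until(StringIO.new(data)) { |index, element, array| element >= 3.5 }
read_int8_until(StringIO.new(data)) { |index, element, array| array[index] + array[index - 1] == 9 }
```

Note that the element which satisfies the condition is kept in the array, which is why the sentinel value appears in each result.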
667
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
668
+
669
+ == Offset checking / adjustment
670
+
671
+ TODO
672
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
673
+
674
+ == Choices
675
+
676
+ A Choice is a collection of data objects of which only one is active at any
677
+ particular time. Method calls will be delegated to the active choice.
678
+ The possible types of objects that a choice contains is controlled by the
679
+ :choices parameter, while the :selection parameter specifies the active
680
+ choice.
681
+
682
+ :choices
683
+
684
+ Either an array or a hash specifying the possible data objects. The
685
+ format of the array/hash.values is a list of symbols representing the
686
+ data object type. If a choice is to have params passed to it, then it
687
+ should be provided as [type_symbol, hash_params]. An implementation
688
+ constraint is that the hash may not contain symbols as keys.
689
+
690
+ :selection
691
+
692
+ An index/key into the :choices array/hash which specifies the currently
693
+ active choice.
694
+
695
+ :copy_on_change
696
+
697
+ If set to true, copy the value of the previous selection to the current
698
+ selection whenever the selection changes. Default is false.
699
+
700
+ Examples
701
+
702
+ type1 = [:string, {:value => "Type1"}]
703
+ type2 = [:string, {:value => "Type2"}]
704
+
705
+ choices = {5 => type1, 17 => type2}
706
+ obj = BinData::Choice.new(:choices => choices, :selection => 5)
707
+ obj.value # => "Type1"
708
+
709
+ choices = [ type1, type2 ]
710
+ obj = BinData::Choice.new(:choices => choices, :selection => 1)
711
+ obj.value # => "Type2"
712
+
713
+ choices = [ nil, nil, nil, type1, nil, type2 ]
714
+ obj = BinData::Choice.new(:choices => choices, :selection => 3)
715
+ obj.value # => "Type1"
716
+
717
+ class MyNumber < BinData::Record
718
+ int8 :is_big_endian
719
+ choice :data, :choices => { true => :int32be, false => :int32le },
720
+ :selection => lambda { is_big_endian != 0 },
721
+ :copy_on_change => true
722
+ end
723
+
724
+ obj = MyNumber.new
725
+ obj.is_big_endian = 1
726
+ obj.data = 5
727
+ obj.to_binary_s #=> "\001\000\000\000\005"
728
+
729
+ obj.is_big_endian = 0
730
+ obj.to_binary_s #=> "\000\005\000\000\000"
731
+
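The two binary strings in the MyNumber example differ only in the byte order of the 32 bit value. A plain-Ruby sketch of the same selection, using Array#pack directly, is shown below; +my_number_bytes+ is a hypothetical helper.

```ruby
# Packs the flag as a signed byte, then the data as a signed 32 bit integer
# whose byte order is selected by the flag.
def my_number_bytes(is_big_endian, data)
  int32_format = is_big_endian != 0 ? "l>" : "l<"
  [is_big_endian, data].pack("c" + int32_format)
end

my_number_bytes(1, 5)   # flag byte, then 00 00 00 05
my_number_bytes(0, 5)   # flag byte, then 05 00 00 00
```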
732
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
733
+
734
+ == Wrappers
735
+
736
+ Sometimes you wish to create a new type that is simply an existing type
737
+ with some predefined parameters. Examples could be an array with a
738
+ specified type, or an integer with an initial value.
739
+
740
+ This can be achieved with a wrapper. A wrapper creates a new type based on
741
+ an existing type which has predefined parameters. These parameters can of
742
+ course be overridden at initialisation time.
743
+ Here we define an array that contains big endian 16 bit integers. The
744
+ array has a preferred initial length.
745
+
746
+ class IntArray < BinData::Wrapper
747
+ endian :big
748
+ array :type => :uint16, :initial_length => 5
749
+ end
750
+
751
+ arr = IntArray.new
752
+ arr.size #=> 5
753
+
754
+ The initial length can be overridden at initialisation time.
755
+
756
+ arr = IntArray.new(:initial_length => 8)
757
+ arr.size #=> 8
758
+
759
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
760
+
761
+ == Parameterizing User Defined Types
762
+
763
+ All BinData types have parameters that allow the behaviour of an object to
764
+ be specified at initialization time. User defined types may also specify
765
+ parameters. There are two types of parameters - mandatory and default.
766
+
767
+ === Mandatory Parameters
768
+
769
+ Mandatory parameters must be specified when creating an instance of the
770
+ type. The :type parameter of Array is an example of a mandatory parameter.
771
+
772
+ class IntArray < BinData::Wrapper
773
+ mandatory_parameter :half_count
774
+
775
+ array :type => :uint8, :initial_length => lambda { half_count * 2 }
776
+ end
777
+
778
+ arr = IntArray.new #=> raises ArgumentError: parameter 'half_count' must
779
+ be specified in IntArray
780
+
781
+ arr = IntArray.new(:half_count => lambda { 1 + 2 })
782
+ arr.snapshot #=> [0, 0, 0, 0, 0, 0]
783
+
784
+ === Default Parameters
785
+
786
+ Default parameters are optional. These parameters have a default value
787
+ that may be overridden when an instance of the type is created.
788
+
789
+ class Phrase < BinData::Primitive
790
+ default_parameter :number => "three"
791
+ default_parameter :adjective => "blind"
792
+ default_parameter :noun => "mice"
793
+
794
+ stringz :a, :initial_value => :number
795
+ stringz :b, :initial_value => :adjective
796
+ stringz :c, :initial_value => :noun
797
+
798
+ def get; "#{a} #{b} #{c}"; end
799
+ def set(v)
800
+ if /(.*) (.*) (.*)/ =~ v
801
+ self.a, self.b, self.c = $1, $2, $3
802
+ end
803
+ end
804
+ end
805
+
806
+ obj = Phrase.new(:number => "two", :adjective => "deaf")
807
+ obj.to_s #=> "two deaf mice"
808
+
809
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
810
+
811
+ == User Defined Primitive Types
812
+
813
+ Most user defined types will be Records, but occasionally we'd like to
814
+ create a custom type of primitive.
815
+
816
+ Let us revisit the Pascal String example.
817
+
818
+ class PascalString < BinData::Record
819
+ uint8 :len, :value => lambda { data.length }
820
+ string :data, :read_length => :len
821
+ end
822
+
823
+ We'd like to make PascalString a user defined type that behaves like a
824
+ BinData::BasePrimitive object so we can use :initial_value etc. Here's an
825
+ example usage of what we'd like:
826
+
827
+ class Favourites < BinData::Record
828
+ pascal_string :language, :initial_value => "ruby"
829
+ pascal_string :os, :initial_value => "unix"
830
+ end
831
+
832
+ f = Favourites.new
833
+ f.os = "freebsd"
834
+ f.to_binary_s #=> "\004ruby\007freebsd"
835
+
836
+ We create this type of custom string by inheriting from BinData::Primitive
837
+ (instead of BinData::Record) and implementing the #get and #set methods.
838
+
839
+ class PascalString < BinData::Primitive
840
+ uint8 :len, :value => lambda { data.length }
841
+ string :data, :read_length => :len
842
+
843
+ def get; self.data; end
844
+ def set(v) self.data = v; end
845
+ end
846
+
847
+ === Advanced User Defined Primitive Types
848
+
849
+ Sometimes a user defined primitive type can not easily be declaratively
850
+ defined. In this case you should inherit from BinData::BasePrimitive and
851
+ implement the following three methods:
852
+
853
+ * value_to_binary_string(value)
854
+ * read_and_return_value(io)
855
+ * sensible_default()
856
+
857
+ # A custom big integer format. Binary format is:
858
+ # 1 byte : 0 for positive, non zero for negative
859
+ # x bytes : Little endian stream of 7 bit bytes representing the
860
+ # positive form of the integer. The upper bit of each byte
861
+ # is set when there are more bytes in the stream.
862
+ class BigInteger < BinData::BasePrimitive
863
+ def value_to_binary_string(value)
864
+ negative = (value < 0) ? 1 : 0
865
+ value = value.abs
866
+ bytes = [negative]
867
+ loop do
868
+ seven_bit_byte = value & 0x7f
869
+ value >>= 7
870
+ has_more = value.nonzero? ? 0x80 : 0
871
+ byte = has_more | seven_bit_byte
872
+ bytes.push(byte)
873
+
874
+ break if has_more.zero?
875
+ end
876
+
877
+ bytes.inject("") { |str, b| str << b.chr }
878
+ end
879
+
880
+ def read_and_return_value(io)
881
+ negative = read_uint8(io).nonzero?
882
+ value = 0
883
+ bit_shift = 0
884
+ loop do
885
+ byte = read_uint8(io)
886
+ has_more = byte & 0x80
887
+ seven_bit_byte = byte & 0x7f
888
+ value |= seven_bit_byte << bit_shift
889
+ bit_shift += 7
890
+
891
+ break if has_more.zero?
892
+ end
893
+
894
+ negative ? -value : value
895
+ end
896
+
897
+ def sensible_default
898
+ 0
899
+ end
900
+
901
+ def read_uint8(io)
902
+ io.readbytes(1).unpack("C").at(0)
903
+ end
904
+ end
905
+
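As a worked example of this format, the value 300 encodes to a sign byte of 0 followed by 0xAC and 0x02: the low seven bits of 300 are 0x2C with the "more bytes" bit set, and the remaining bits are 2. The standalone sketch below mirrors the encoding and decoding logic outside of BinData; +encode_big_integer+ and +decode_big_integer+ are hypothetical helpers.

```ruby
# Sign byte, then little endian groups of 7 bits; the upper bit of each
# byte is set when more bytes follow.
def encode_big_integer(value)
  bytes = [value < 0 ? 1 : 0]
  value = value.abs
  loop do
    seven_bits = value & 0x7f
    value >>= 7
    bytes << (value.zero? ? seven_bits : seven_bits | 0x80)
    break if value.zero?
  end
  bytes.pack("C*")
end

# Reverses encode_big_integer for a complete binary string.
def decode_big_integer(binary)
  sign, *rest = binary.unpack("C*")
  value = 0
  rest.each_with_index do |byte, i|
    value |= (byte & 0x7f) << (7 * i)
    break if (byte & 0x80).zero?
  end
  sign.zero? ? value : -value
end

encode_big_integer(300).bytes                    # 0, 0xAC, 0x02
decode_big_integer(encode_big_integer(-300))     # back to -300
```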
906
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
907
+
908
+ == Debugging
909
+
910
+ TODO
911
+ === Tracing
912
+
913
+ class A < BinData::Record
914
+ int8 :a
915
+ bit4 :b
916
+ bit2 :c
917
+ array :d, :initial_length => 6, :type => :bit1
918
+ end
919
+
920
+ BinData::trace_reading do
921
+ A.read("\373\225\220")
922
+ end
923
+
924
+ obj.a => -5
925
+ obj.b => 9
926
+ obj.c => 1
927
+ obj.d[0] => 0
928
+ obj.d[1] => 1
929
+ obj.d[2] => 1
930
+ obj.d[3] => 0
931
+ obj.d[4] => 0
932
+ obj.d[5] => 1
933
+
934
+ === Rest
935
+
936
+ === Hidden fields
937
+
938
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
939
+
940
+ == Comparison
941
+
942
+ TODO
943
+ http://github.com/marcandre/packable/tree/master
944
+ http://metafuzz.rubyforge.org/binstruct/
945
+ http://rubyforge.org/projects/bitpack/
946
+ http://binaryparse.rubyforge.org/
947
+ http://redshift.sourceforge.net/bit-struct/
948
+
949
+ -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-