bindata 0.11.0 → 0.11.1
- data/ChangeLog +6 -0
- data/TODO +17 -0
- data/TUTORIAL +949 -0
- data/lib/bindata.rb +1 -1
- data/lib/bindata/array.rb +4 -4
- data/lib/bindata/choice.rb +1 -1
- data/lib/bindata/deprecated.rb +2 -2
- data/lib/bindata/float.rb +5 -5
- data/lib/bindata/int.rb +33 -28
- data/lib/bindata/registry.rb +2 -2
- data/lib/bindata/sanitize.rb +1 -0
- data/lib/bindata/struct.rb +4 -4
- data/spec/array_spec.rb +1 -1
- data/spec/base_primitive_spec.rb +3 -7
- data/spec/base_spec.rb +29 -59
- data/spec/deprecated_spec.rb +4 -8
- data/spec/int_spec.rb +25 -60
- data/spec/primitive_spec.rb +37 -65
- data/spec/record_spec.rb +63 -105
- data/spec/registry_spec.rb +6 -6
- data/spec/system_spec.rb +32 -56
- data/spec/wrapper_spec.rb +28 -29
- metadata +3 -2
data/ChangeLog
CHANGED
@@ -1,5 +1,11 @@
 = BinData Changelog
 
+== Version 0.11.1 (2009-08-29)
+
+* Allow wrapped types to work with struct's :onlyif parameter
+* Use Array#index instead of #find_index for compatibility with ruby 1.8.6
+  (patch courtesy of Joe Rozner).
+
 == Version 0.11.0 (2009-06-28)
 
 * Sanitizing code was refactored for speed.
data/TODO
CHANGED
@@ -1,5 +1,22 @@
+== Documentation
+
 * Write a detailed tutorial (probably as a web page).
 
 * Need more examples.
 
+== Pending refactorings
+
+* Refactor registry into registry and numeric registry
+  update specs accordingly
+
 * Perhaps refactor snapshot -> _snapshot et al
+
+== Bugs
+
+class A < BinData::Record
+  array :a, :type => :unknown
+end
+
+should give correct error message
+
+
data/TUTORIAL
ADDED
@@ -0,0 +1,949 @@
|
|
1
|
+
= BinData
|
2
|
+
|
3
|
+
A declarative way to read and write structured binary data.
|
4
|
+
|
5
|
+
== What is it for?
|
6
|
+
|
7
|
+
Do you ever find yourself writing code like this?
|
8
|
+
|
9
|
+
io = File.open(...)
|
10
|
+
len = io.read(2).unpack("v")[0]
|
11
|
+
name = io.read(len)
|
12
|
+
width, height = io.read(8).unpack("VV")
|
13
|
+
puts "Rectangle #{name} is #{width} x #{height}"
|
14
|
+
|
15
|
+
It's ugly, violates DRY and feels like you're writing Perl, not Ruby.
|
16
|
+
|
17
|
+
There is a better way.
|
18
|
+
|
19
|
+
class Rectangle < BinData::Record
|
20
|
+
endian :little
|
21
|
+
uint16 :len
|
22
|
+
string :name, :read_length => :len
|
23
|
+
uint32 :width
|
24
|
+
uint32 :height
|
25
|
+
end
|
26
|
+
|
27
|
+
io = File.open(...)
|
28
|
+
r = Rectangle.read(io)
|
29
|
+
puts "Rectangle #{r.name} is #{r.width} x #{r.height}"
|
30
|
+
|
31
|
+
BinData makes it easy to specify the structure of the data you are
|
32
|
+
manipulating.
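
To make the byte layout concrete, the following sketch builds the bytes that the Rectangle declaration describes and reads them back using only Ruby's standard library (Array#pack, String#unpack and StringIO). The sample values (the name "box" and the 640 x 480 size) are invented for illustration, and BinData itself is not used here.

```ruby
require 'stringio'

# Hypothetical sample record: name "box", width 640, height 480.
name  = "box"
bytes = [name.bytesize].pack("v") +   # uint16 little endian length
        name +                        # the name itself
        [640, 480].pack("VV")         # two uint32 little endian values

# Read the fields back in the order the declaration lists them.
io = StringIO.new(bytes)
len           = io.read(2).unpack("v")[0]
read_name     = io.read(len)
width, height = io.read(8).unpack("VV")

puts "Rectangle #{read_name} is #{width} x #{height}"
```

The record occupies 2 + 3 + 8 = 13 bytes, which is the structure the four field declarations spell out.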
|
33
|
+
|
34
|
+
Read on for the tutorial, or go straight to the
|
35
|
+
download[http://rubyforge.org/frs/?group_id=3252] page.
|
36
|
+
|
37
|
+
== License
|
38
|
+
|
39
|
+
BinData is released under the same license as Ruby.
|
40
|
+
|
41
|
+
Copyright (c) 2007 - 2009 Dion Mendel
|
42
|
+
|
43
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
44
|
+
|
45
|
+
== Overview
|
46
|
+
|
47
|
+
BinData declarations are easy to read. Here's an example.
|
48
|
+
|
49
|
+
class MyFancyFormat < BinData::Record
|
50
|
+
stringz :comment
|
51
|
+
uint8 :num_ints, :check_value => lambda { value.even? }
|
52
|
+
array :some_ints, :type => :uint32be, :initial_length => :num_ints
|
53
|
+
end
|
54
|
+
|
55
|
+
This fancy format describes the following collection of data:
|
56
|
+
|
57
|
+
1. A zero terminated string
|
58
|
+
2. An unsigned 8bit integer which must be even
|
59
|
+
3. A sequence of unsigned 32bit integers in big endian form, the total
|
60
|
+
number of which is determined by the value of the 8bit integer.
|
61
|
+
|
62
|
+
The BinData declaration matches the English description closely. Compare
|
63
|
+
the above declaration with the equivalent #unpack code to read such a data
|
64
|
+
record.
|
65
|
+
|
66
|
+
def read_fancy_format(io)
|
67
|
+
comment, num_ints, rest = io.read.unpack("Z*Ca*")
|
68
|
+
raise ArgumentError, "ints must be even" unless num_ints.even?
|
69
|
+
some_ints = rest.unpack("N#{num_ints}")
|
70
|
+
{:comment => comment, :num_ints => num_ints, :some_ints => some_ints}
|
71
|
+
end
|
72
|
+
|
73
|
+
The BinData declaration clearly shows the structure of the record. The
|
74
|
+
#unpack code makes this structure opaque.
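
As a quick check of the comparison, here is a runnable sketch of the #unpack approach applied to a small sample record. The sample data is invented, the integers are treated as unsigned ("N"), and only the standard library is used.

```ruby
require 'stringio'

def read_fancy_format(io)
  # "Z*" reads the zero terminated comment, "C" the 8 bit count and
  # "a*" keeps the remaining bytes for a second pass.
  comment, num_ints, rest = io.read.unpack("Z*Ca*")
  raise ArgumentError, "ints must be even" unless num_ints.even?
  # "N" is an unsigned 32 bit big endian integer.
  some_ints = rest.unpack("N#{num_ints}")
  {:comment => comment, :num_ints => num_ints, :some_ints => some_ints}
end

# Hypothetical sample: comment "hi" followed by two big endian integers.
sample = ["hi", 2, 10, 20].pack("Z*CN2")
record = read_fancy_format(StringIO.new(sample))
```

The reader has to know that the count sits between the comment and the integers; the format strings carry that knowledge implicitly, which is the opacity the text refers to.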
|
75
|
+
|
76
|
+
The general usage of BinData is to declare a structured collection of data
|
77
|
+
as a user defined record. This record can be instantiated, read, written
|
78
|
+
and manipulated without the user having to be concerned with the underlying
|
79
|
+
binary representation of the data.
|
80
|
+
|
81
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
82
|
+
|
83
|
+
== Primitive Types
|
84
|
+
|
85
|
+
BinData provides support for the most commonly used primitive types that
|
86
|
+
are used when working with binary data. Namely:
|
87
|
+
|
88
|
+
* fixed size strings
|
89
|
+
* zero terminated strings
|
90
|
+
* byte based integers - signed or unsigned, big or little endian and of
|
91
|
+
any size
|
92
|
+
* bit based integers - unsigned big or little endian integers of any size
|
93
|
+
* floating point numbers - single or double precision floats in either
|
94
|
+
big or little endian
|
95
|
+
|
96
|
+
Primitives may be manipulated individually, but it is more common to work
|
97
|
+
with them as part of a record.
|
98
|
+
|
99
|
+
Examples of individual usage:
|
100
|
+
|
101
|
+
int16 = BinData::Int16be.new
|
102
|
+
int16.value = 941
|
103
|
+
int16.to_binary_s #=> "\003\255"
|
104
|
+
|
105
|
+
fl = BinData::FloatBe.read("\100\055\370\124") #=> 2.71828174591064
|
106
|
+
fl.num_bytes #=> 4
|
107
|
+
|
108
|
+
fl * int16 #=> 2557.90320057996
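
The byte strings in these examples can be cross-checked with Ruby's own pack and unpack. This is only a sketch of the underlying encoding using the standard library; it does not call BinData.

```ruby
# 941 as a 16 bit big endian integer ("n" is pack's 16 bit big endian code).
int16_bytes = [941].pack("n").bytes

# The four bytes written above in octal as "\100\055\370\124", read as a
# big endian single precision float ("g" is pack's code for that format).
float_value = [0x40, 0x2D, 0xF8, 0x54].pack("C4").unpack("g")[0]
```

Here int16_bytes is [3, 173], which is "\003\255" in octal notation, and float_value is the single precision approximation of e shown in the example.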
|
109
|
+
|
110
|
+
There are several parameters that are specific to primitives.
|
111
|
+
|
112
|
+
:initial_value
|
113
|
+
|
114
|
+
This contains the initial value that the primitive will contain after
|
115
|
+
initialization. This is useful for setting default values.
|
116
|
+
|
117
|
+
obj = BinData::String.new(:initial_value => "hello ")
|
118
|
+
obj + "world" #=> "hello world"
|
119
|
+
|
120
|
+
obj.assign("good-bye ")
|
121
|
+
obj + "world" #=> "good-bye world"
|
122
|
+
|
123
|
+
:value
|
124
|
+
|
125
|
+
The primitive will always contain this value. Reading or assigning will
|
126
|
+
not change the value. This parameter is used to define constants or
|
127
|
+
dependent fields.
|
128
|
+
|
129
|
+
pi = BinData::FloatLe.new(:value => Math::PI)
|
130
|
+
pi.assign(3)
|
131
|
+
puts pi #=> 3.14159265358979
|
132
|
+
|
133
|
+
:check_value
|
134
|
+
|
135
|
+
When reading, will raise a ValidityError if the value read does not match
|
136
|
+
the value of this parameter.
|
137
|
+
|
138
|
+
obj = BinData::String.new(:check_value => lambda { /aaa/ =~ value })
|
139
|
+
obj.read("baaa!") #=> "baaa!"
|
140
|
+
obj.read("bbb") #=> raises ValidityError
|
141
|
+
|
142
|
+
obj = BinData::String.new(:check_value => "foo")
|
143
|
+
obj.read("foo") #=> "foo"
|
144
|
+
obj.read("bar") #=> raises ValidityError
|
145
|
+
|
146
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
147
|
+
|
148
|
+
== Common Operations
|
149
|
+
|
150
|
+
There are operations common to all BinData types, including user defined
|
151
|
+
ones. These are summarised here.
|
152
|
+
|
153
|
+
Reading and writing
|
154
|
+
|
155
|
+
::read(io)
|
156
|
+
|
157
|
+
Creates a BinData object and reads its value from the given string or IO.
|
158
|
+
The newly created object is returned.
|
159
|
+
|
160
|
+
str = BinData::Stringz::read("string1\0string2")
|
161
|
+
str.snapshot #=> "string1"
|
162
|
+
|
163
|
+
#read(io)
|
164
|
+
|
165
|
+
Reads and assigns binary data read from io.
|
166
|
+
|
167
|
+
obj = BinData::Uint16be.new
|
168
|
+
obj.read("\022\064")
|
169
|
+
obj.value #=> 4660
|
170
|
+
|
171
|
+
#write(io)
|
172
|
+
|
173
|
+
Writes the binary representation of the object to io.
|
174
|
+
|
175
|
+
File.open("...", "wb") do |io|
|
176
|
+
obj = BinData::Uint64be.new
|
177
|
+
obj.value = 568290145640170
|
178
|
+
obj.write(io)
|
179
|
+
end
|
180
|
+
|
181
|
+
#to_binary_s
|
182
|
+
|
183
|
+
Returns the binary representation of this object as a string.
|
184
|
+
|
185
|
+
obj = BinData::Uint16be.new
|
186
|
+
obj.assign(4660)
|
187
|
+
obj.to_binary_s #=> "\022\064"
|
188
|
+
|
189
|
+
Manipulating
|
190
|
+
|
191
|
+
#assign(value)
|
192
|
+
|
193
|
+
Assigns the given value to this object. value can be of the same format
|
194
|
+
as produced by #snapshot, or it can be a compatible data object.
|
195
|
+
|
196
|
+
arr = BinData::Array.new(:type => :uint8)
|
197
|
+
arr.assign([1, 2, 3, 4])
|
198
|
+
arr.snapshot #=> [1, 2, 3, 4]
|
199
|
+
|
200
|
+
#clear
|
201
|
+
|
202
|
+
Resets this object to its initial state.
|
203
|
+
|
204
|
+
obj = BinData::Int32be.new(:initial_value => 42)
|
205
|
+
obj.assign(50)
|
206
|
+
obj.clear
|
207
|
+
obj.value #=> 42
|
208
|
+
|
209
|
+
#clear?
|
210
|
+
|
211
|
+
Returns whether this object is in its initial state.
|
212
|
+
|
213
|
+
arr = BinData::Array.new(:type => :uint16be, :initial_length => 5)
|
214
|
+
arr[3] = 42
|
215
|
+
arr.clear? #=> false
|
216
|
+
|
217
|
+
arr[3].clear
|
218
|
+
arr.clear? #=> true
|
219
|
+
|
220
|
+
Inspecting
|
221
|
+
|
222
|
+
#num_bytes
|
223
|
+
|
224
|
+
Returns the number of bytes required for the binary representation of
|
225
|
+
this object.
|
226
|
+
|
227
|
+
arr = BinData::Array.new(:type => :uint16be, :initial_length => 5)
|
228
|
+
arr[0].num_bytes #=> 2
|
229
|
+
arr.num_bytes #=> 10
|
230
|
+
|
231
|
+
#snapshot
|
232
|
+
|
233
|
+
Returns the value of this object as primitive Ruby objects (numerics,
|
234
|
+
strings, arrays and hashes). This may be useful for serialization or
|
235
|
+
reducing memory usage.
|
236
|
+
|
237
|
+
obj = BinData::Uint8.new
|
238
|
+
obj.assign(3)
|
239
|
+
obj + 3 #=> 6
|
240
|
+
|
241
|
+
obj.snapshot #=> 3
|
242
|
+
obj.snapshot.class #=> Fixnum
|
243
|
+
|
244
|
+
#offset
|
245
|
+
|
246
|
+
Returns the offset of this object with respect to the parent structure
|
247
|
+
it is contained within. This is most likely to be used with arrays and
|
248
|
+
records.
|
249
|
+
|
250
|
+
arr = BinData::Array.new(:type => :uint16le, :initial_length => 5)
|
251
|
+
arr[2].offset #=> 4
|
252
|
+
|
253
|
+
#inspect
|
254
|
+
|
255
|
+
Returns a human readable representation of this object. This is a
|
256
|
+
shortcut to #snapshot.inspect.
|
257
|
+
|
258
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
259
|
+
|
260
|
+
== Records
|
261
|
+
|
262
|
+
The general format of a BinData record declaration is a class containing
|
263
|
+
one or more fields.
|
264
|
+
|
265
|
+
class MyName < BinData::Record
|
266
|
+
type field_name, :param1 => "foo", :param2 => bar, ...
|
267
|
+
...
|
268
|
+
end
|
269
|
+
|
270
|
+
*type* is the name of a supplied type (e.g. +uint32be+, +string+, +array+)
|
271
|
+
or a user defined type. For user defined types, the class name is
|
272
|
+
converted from CamelCase to lowercase underscore_style.
|
273
|
+
|
274
|
+
*field_name* is the name by which you can access the data. Use either a
|
275
|
+
String or a Symbol.
|
276
|
+
|
277
|
+
Each field may have optional *parameters* for how to process the data. The
|
278
|
+
parameters are passed as a Hash with Symbols for keys. Parameters are
|
279
|
+
designed to be lazily evaluated, possibly multiple times. This means that any
|
280
|
+
parameter value must not have side effects.
|
281
|
+
|
282
|
+
Here are some examples of legal values for parameters.
|
283
|
+
|
284
|
+
* :param => 5
|
285
|
+
* :param => lambda { 5 + 2 }
|
286
|
+
* :param => lambda { foo + 2 }
|
287
|
+
* :param => :foo
|
288
|
+
|
289
|
+
The simplest case is when the value is a literal value, such as 5.
|
290
|
+
|
291
|
+
If the value is not a literal, it is expected to be a lambda. The lambda
|
292
|
+
will be evaluated in the context of the parent; in this case the parent is
|
293
|
+
an instance of +MyName+.
|
294
|
+
|
295
|
+
If the value is a symbol, it is taken as syntactic sugar for a lambda
|
296
|
+
containing the value of the symbol.
|
297
|
+
e.g. <tt>:param => :foo</tt> is equivalent to <tt>:param => lambda { foo }</tt>
|
298
|
+
|
299
|
+
=== Specifying default endian
|
300
|
+
|
301
|
+
The endianness of numeric types must be explicitly defined so that the code
|
302
|
+
produced is independent of architecture. However, explicitly specifying
|
303
|
+
the endian for each numeric field can result in a bloated declaration that
|
304
|
+
can be difficult to read.
|
305
|
+
|
306
|
+
class A < BinData::Record
|
307
|
+
int16be :a
|
308
|
+
int32be :b
|
309
|
+
int16le :c # <-- Note little endian!
|
310
|
+
int32be :d
|
311
|
+
float_be :e
|
312
|
+
array :f, :type => :uint32be
|
313
|
+
end
|
314
|
+
|
315
|
+
The endian keyword can be used to set the default endian. This makes the
|
316
|
+
declaration easier to read. Any numeric field that doesn't use the default
|
317
|
+
endian can explicitly override it.
|
318
|
+
|
319
|
+
class A < BinData::Record
|
320
|
+
endian :big
|
321
|
+
|
322
|
+
int16 :a
|
323
|
+
int32 :b
|
324
|
+
int16le :c # <-- Note how this little endian now stands out
|
325
|
+
int32 :d
|
326
|
+
float :e
|
327
|
+
array :f, :type => :uint32
|
328
|
+
end
|
329
|
+
|
330
|
+
The increase in clarity can be seen with the above example. The endian
|
331
|
+
keyword will cascade to nested types, as illustrated with the array in the
|
332
|
+
above example.
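
For readers unfamiliar with endianness, this small standard library sketch shows what the big and little endian variants of a 16 bit field mean at the byte level. The value 0x1234 is an arbitrary sample.

```ruby
value = 0x1234

big_endian    = [value].pack("n").bytes   # most significant byte first
little_endian = [value].pack("v").bytes   # least significant byte first

# Fields :a (big endian) and :c (little endian) in the example would
# therefore store the same number with their two bytes in opposite order.
```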
|
333
|
+
|
334
|
+
=== Optional fields
|
335
|
+
|
336
|
+
A record may contain optional fields. The optional state of a field is decided
|
337
|
+
by the :onlyif parameter. If the value of this parameter is false, then the
|
338
|
+
field will be as if it didn't exist in the record.
|
339
|
+
|
340
|
+
class RecordWithOptionalField < BinData::Record
|
341
|
+
...
|
342
|
+
uint8 :comment_flag
|
343
|
+
string :comment, :length => 20, :onlyif => :has_comment?
|
344
|
+
|
345
|
+
def has_comment?
|
346
|
+
comment_flag.nonzero?
|
347
|
+
end
|
348
|
+
end
|
349
|
+
|
350
|
+
In the above example, the comment field is only included in the record if the
|
351
|
+
value of the comment_flag field is non zero.
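
The effect of :onlyif on reading can be sketched in plain Ruby. The helper name read_optional_comment and the sample data are hypothetical, and only the two fields shown in the example are handled.

```ruby
require 'stringio'

# Plain Ruby stand-in for the :onlyif behaviour: the 20 byte comment is
# read only when the flag byte is non zero.
def read_optional_comment(io)
  comment_flag = io.read(1).unpack("C")[0]
  comment = comment_flag.nonzero? ? io.read(20) : nil
  {:comment_flag => comment_flag, :comment => comment}
end

with_comment    = read_optional_comment(StringIO.new("\001" + "hello".ljust(20, "\0")))
without_comment = read_optional_comment(StringIO.new("\000"))
```

In the second call no bytes are consumed for the comment at all, which matches the description of the field behaving as if it did not exist.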
|
352
|
+
|
353
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
354
|
+
|
355
|
+
== Handling dependencies between fields
|
356
|
+
|
357
|
+
A common occurrence in binary file formats is one field depending upon the
|
358
|
+
value of another, e.g. a string preceded by its length.
|
359
|
+
|
360
|
+
As an example, let's assume a Pascal style string where the byte preceding
|
361
|
+
the string contains the string's length.
|
362
|
+
|
363
|
+
# reading
|
364
|
+
io = File.open(...)
|
365
|
+
len = io.getc
|
366
|
+
str = io.read(len)
|
367
|
+
puts "string is " + str
|
368
|
+
|
369
|
+
# writing
|
370
|
+
io = File.open(...)
|
371
|
+
str = "this is a string"
|
372
|
+
io.putc(str.length)
|
373
|
+
io.write(str)
|
374
|
+
|
375
|
+
Here's how we'd implement the same example with BinData.
|
376
|
+
|
377
|
+
class PascalString < BinData::Record
|
378
|
+
uint8 :len, :value => lambda { data.length }
|
379
|
+
string :data, :read_length => :len
|
380
|
+
end
|
381
|
+
|
382
|
+
# reading
|
383
|
+
io = File.open(...)
|
384
|
+
ps = PascalString.new
|
385
|
+
ps.read(io)
|
386
|
+
puts "string is " + ps.data
|
387
|
+
|
388
|
+
# writing
|
389
|
+
io = File.open(...)
|
390
|
+
ps = PascalString.new
|
391
|
+
ps.data = "this is a string"
|
392
|
+
ps.write(io)
|
393
|
+
|
394
|
+
This syntax needs explaining. Let's simplify by examining reading and
|
395
|
+
writing separately.
|
396
|
+
|
397
|
+
class PascalStringReader < BinData::Record
|
398
|
+
uint8 :len
|
399
|
+
string :data, :read_length => :len
|
400
|
+
end
|
401
|
+
|
402
|
+
This states that when reading the string, the initial length of the string
|
403
|
+
(and hence the number of bytes to read) is determined by the value of the
|
404
|
+
+len+ field.
|
405
|
+
|
406
|
+
Note that <tt>:read_length => :len</tt> is syntactic sugar for
|
407
|
+
<tt>:read_length => lambda { len }</tt>, as described in the Record section.
|
408
|
+
|
409
|
+
class PascalStringWriter < BinData::Record
|
410
|
+
uint8 :len, :value => lambda { data.length }
|
411
|
+
string :data
|
412
|
+
end
|
413
|
+
|
414
|
+
This states that the value of +len+ is always equal to the length of +data+.
|
415
|
+
+len+ may not be manually modified.
|
416
|
+
|
417
|
+
Combining these two definitions gives the definition for +PascalString+ as
|
418
|
+
previously defined.
|
419
|
+
|
420
|
+
It is important to note that, when reading, a field can only depend on one
|
421
|
+
before it. You can't have a string which has the characters first and the
|
422
|
+
length afterwards.
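
For comparison, here is a runnable round trip of the hand-written approach using an in-memory StringIO. The helper names are hypothetical. On Ruby 1.9 and later IO#getc returns a one character String rather than an Integer, so this sketch uses IO#getbyte to read the length.

```ruby
require 'stringio'

def write_pascal_string(io, str)
  # The length must fit in the single preceding byte.
  raise ArgumentError, "string too long" if str.bytesize > 255
  io.write([str.bytesize].pack("C"))
  io.write(str)
end

def read_pascal_string(io)
  len = io.getbyte   # the length byte comes first ...
  io.read(len)       # ... so it is known before the data is read
end

io = StringIO.new
write_pascal_string(io, "this is a string")
io.rewind
round_trip = read_pascal_string(io)
```

The reader can only work because the length precedes the data, which is the ordering restriction described above.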
|
423
|
+
|
424
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
425
|
+
|
426
|
+
== Numerics
|
427
|
+
|
428
|
+
There are three kinds of numeric types that are supported by BinData.
|
429
|
+
|
430
|
+
=== Byte based integers
|
431
|
+
|
432
|
+
These are the common integers that are used in most low level programming
|
433
|
+
languages (C, C++, Java etc). These integers can be signed or unsigned.
|
434
|
+
The endian must be specified so that the conversion is independent of
|
435
|
+
architecture. The bit size of these integers must be a multiple of 8.
|
436
|
+
Examples of byte based integers are:
|
437
|
+
|
438
|
+
* uint16be - unsigned 16 bit big endian integer
|
439
|
+
* int8 - signed 8 bit integer
|
440
|
+
* int32le - signed 32 bit little endian integer
|
441
|
+
* uint40be - unsigned 40 bit big endian integer
|
442
|
+
|
443
|
+
The be | le suffix may be omitted if the endian keyword is in use.
|
444
|
+
|
445
|
+
=== Bit based integers
|
446
|
+
|
447
|
+
These unsigned integers are used to define bitfields in records. Bitfields
|
448
|
+
are big endian by default but little endian may be specified explicitly.
|
449
|
+
Little endian bitfields are rare, but do occur in older file formats
|
450
|
+
(e.g. the file allocation table for FAT12 filesystems is stored as an
|
451
|
+
array of 12bit little endian integers).
|
452
|
+
|
453
|
+
An array of bit based integers will be packed according to their endian.
|
454
|
+
|
455
|
+
In a record, adjacent bitfields will be packed according to their endian.
|
456
|
+
All other fields are byte aligned.
|
457
|
+
|
458
|
+
Examples of bit based integers are:
|
459
|
+
|
460
|
+
* bit1 - 1 bit big endian integer (may be used as boolean)
|
461
|
+
* bit4_le - 4 bit little endian integer
|
462
|
+
* bit32 - 32 bit big endian integer
|
463
|
+
|
464
|
+
The difference between byte and bit based integers of the same number of
|
465
|
+
bits (e.g. uint8 vs bit8) is one of alignment.
|
466
|
+
|
467
|
+
This example is packed as 3 bytes
|
468
|
+
|
469
|
+
class A < BinData::Record
|
470
|
+
bit4 :a
|
471
|
+
uint8 :b
|
472
|
+
bit4 :c
|
473
|
+
end
|
474
|
+
|
475
|
+
Data is stored as: AAAA0000 BBBBBBBB CCCC0000
|
476
|
+
|
477
|
+
Whereas this example is packed into only 2 bytes
|
478
|
+
|
479
|
+
class B < BinData::Record
|
480
|
+
bit4 :a
|
481
|
+
bit8 :b
|
482
|
+
bit4 :c
|
483
|
+
end
|
484
|
+
|
485
|
+
Data is stored as: AAAABBBB BBBBCCCC
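
The two layouts can be reproduced with shifts and Array#pack. This is a sketch of the big endian bit packing described above, with arbitrary sample values for a, b and c; it does not use BinData.

```ruby
a = 0b1010        # 4 bit value
b = 0b11001100    # 8 bit value
c = 0b0101        # 4 bit value

# Record A: uint8 :b is byte aligned, so each bit4 field gets its own
# byte, padded with zero bits: AAAA0000 BBBBBBBB CCCC0000
record_a = [a << 4, b, c << 4].pack("C3").bytes

# Record B: bit8 :b is not byte aligned, so the three fields share
# 16 bits: AAAABBBB BBBBCCCC
record_b = [(a << 12) | (b << 4) | c].pack("n").bytes
```

With these values record_a is [0xA0, 0xCC, 0x50] and record_b is [0xAC, 0xC5].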
|
486
|
+
|
487
|
+
=== Floating point numbers
|
488
|
+
|
489
|
+
BinData supports 32 and 64 bit floating point numbers, in both big and
|
490
|
+
little endian format. These types are:
|
491
|
+
|
492
|
+
* float_le - single precision 32 bit little endian float
|
493
|
+
* float_be - single precision 32 bit big endian float
|
494
|
+
* double_le - double precision 64 bit little endian float
|
495
|
+
* double_be - double precision 64 bit big endian float
|
496
|
+
|
497
|
+
The _be | _le suffix may be omitted if the endian keyword is in use.
|
498
|
+
|
499
|
+
== Example
|
500
|
+
|
501
|
+
Here is an example declaration for an Internet Protocol network packet.
|
502
|
+
Three of the fields have parameters.
|
503
|
+
|
504
|
+
* The version field always has the value 4, as per the standard.
|
505
|
+
* The options field is read as a raw string, but not processed.
|
506
|
+
* The data field contains the payload of the packet. Its length is
|
507
|
+
calculated as the total length of the packet minus the length of the
|
508
|
+
header.
|
509
|
+
|
510
|
+
class IP_PDU < BinData::Record
|
511
|
+
endian :big
|
512
|
+
|
513
|
+
bit4 :version, :value => 4
|
514
|
+
bit4 :header_length
|
515
|
+
uint8 :tos
|
516
|
+
uint16 :total_length
|
517
|
+
uint16 :ident
|
518
|
+
bit3 :flags
|
519
|
+
bit13 :frag_offset
|
520
|
+
uint8 :ttl
|
521
|
+
uint8 :protocol
|
522
|
+
uint16 :checksum
|
523
|
+
uint32 :src_addr
|
524
|
+
uint32 :dest_addr
|
525
|
+
string :options, :read_length => :options_length_in_bytes
|
526
|
+
string :data, :read_length => lambda { total_length - header_length_in_bytes }
|
527
|
+
|
528
|
+
def header_length_in_bytes
|
529
|
+
header_length * 4
|
530
|
+
end
|
531
|
+
|
532
|
+
def options_length_in_bytes
|
533
|
+
header_length_in_bytes - 20
|
534
|
+
end
|
535
|
+
end
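
To show how the helper methods relate to the raw bytes, the sketch below builds a minimal sample packet with Array#pack and takes it apart again with String#unpack. All sample values (addresses, ident, the "ping" payload) are invented, the checksum is left at zero, and BinData is not used.

```ruby
# Build a 20 byte header with no options, followed by a 4 byte payload.
header = [
  (4 << 4) | 5,    # version (4 bits) and header_length (4 bits, in 32 bit words)
  0,               # tos
  24,              # total_length: 20 byte header + 4 byte payload
  1,               # ident
  (2 << 13) | 0,   # flags (3 bits) and frag_offset (13 bits)
  64,              # ttl
  17,              # protocol
  0,               # checksum (not computed in this sketch)
  0x0A000001,      # src_addr
  0x0A000002       # dest_addr
].pack("CCnnnCCnNN")
packet = header + "ping"

# Take the packet apart again.
ver_ihl, _tos, total_length, _ident, flags_frag = packet.unpack("CCnnn")
version       = ver_ihl >> 4
header_length = ver_ihl & 0x0F
flags         = flags_frag >> 13
frag_offset   = flags_frag & 0x1FFF

header_length_in_bytes  = header_length * 4
options_length_in_bytes = header_length_in_bytes - 20
options = packet[20, options_length_in_bytes]
data    = packet[header_length_in_bytes, total_length - header_length_in_bytes]
```

The last four lines mirror the two helper methods and the two :read_length parameters in the declaration.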
|
536
|
+
|
537
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
538
|
+
|
539
|
+
== Strings
|
540
|
+
|
541
|
+
BinData supports two types of strings - fixed size and zero terminated.
|
542
|
+
Strings are treated as a sequence of 8bit bytes. This is the same as
|
543
|
+
strings in Ruby 1.8. The issue of character encoding is ignored by
|
544
|
+
BinData.
|
545
|
+
|
546
|
+
=== Fixed Sized Strings
|
547
|
+
|
548
|
+
Fixed sized strings may have a set length. If an assigned value is shorter
|
549
|
+
than this length, it will be padded to this length. If no length is set,
|
550
|
+
the length is taken to be the length of the assigned value.
|
551
|
+
|
552
|
+
There are several parameters that are specific to fixed sized strings.
|
553
|
+
|
554
|
+
:read_length
|
555
|
+
|
556
|
+
The length to use when reading a value.
|
557
|
+
|
558
|
+
obj = BinData::String.new(:read_length => 5)
|
559
|
+
obj.read("abcdefghij")
|
560
|
+
obj.value #=> "abcde"
|
561
|
+
|
562
|
+
:length
|
563
|
+
|
564
|
+
The fixed length of the string. If a shorter string is set, it will be
|
565
|
+
padded to this length. Longer strings will be truncated.
|
566
|
+
|
567
|
+
obj = BinData::String.new(:length => 6)
|
568
|
+
obj.read("abcdefghij")
|
569
|
+
obj.value #=> "abcdef"
|
570
|
+
|
571
|
+
obj = BinData::String.new(:length => 6)
|
572
|
+
obj.value = "abcd"
|
573
|
+
obj.value #=> "abcd\000\000"
|
574
|
+
|
575
|
+
obj = BinData::String.new(:length => 6)
|
576
|
+
obj.value = "abcdefghij"
|
577
|
+
obj.value #=> "abcdef"
|
578
|
+
|
579
|
+
:pad_char
|
580
|
+
|
581
|
+
The character to use when padding a string to a set length. Valid values
|
582
|
+
are Integers and Strings of length 1. "\0" is the default.
|
583
|
+
|
584
|
+
obj = BinData::String.new(:length => 6, :pad_char => 'A')
|
585
|
+
obj.value = "abcd"
|
586
|
+
obj.value #=> "abcdAA"
|
587
|
+
obj.to_binary_s #=> "abcdAA"
|
588
|
+
|
589
|
+
:trim_padding
|
590
|
+
|
591
|
+
Boolean, default false. If set, the value of this string will have all
|
592
|
+
pad_chars trimmed from the end of the string. The value will not be
|
593
|
+
trimmed when writing.
|
594
|
+
|
595
|
+
obj = BinData::String.new(:length => 6, :trim_padding => true)
|
596
|
+
obj.value = "abcd"
|
597
|
+
obj.value #=> "abcd"
|
598
|
+
obj.to_binary_s #=> "abcd\000\000"
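
The padding, truncation and trimming rules can be summarised in a few lines of plain Ruby. The helper names fixed_length and trim_nul_padding are hypothetical and only model the behaviour described above; they are not part of BinData.

```ruby
# Pad or truncate str to exactly length bytes, as :length and :pad_char do.
def fixed_length(str, length, pad_char = "\0")
  str[0, length].ljust(length, pad_char)
end

# Remove trailing zero bytes, as :trim_padding does for the default pad_char.
def trim_nul_padding(str)
  str.sub(/\x00+\z/, "")
end

padded    = fixed_length("abcd", 6)
truncated = fixed_length("abcdefghij", 6)
custom    = fixed_length("abcd", 6, "A")
trimmed   = trim_nul_padding(padded)
```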
|
599
|
+
|
600
|
+
=== Zero Terminated Strings
|
601
|
+
|
602
|
+
These strings are modelled on the C style of string - a sequence of
|
603
|
+
bytes terminated by a null ("\0") character.
|
604
|
+
|
605
|
+
obj = BinData::Stringz.new
|
606
|
+
obj.read("abcd\000efgh")
|
607
|
+
obj.value #=> "abcd"
|
608
|
+
obj.num_bytes #=> 5
|
609
|
+
obj.to_binary_s #=> "abcd\000"
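
The same behaviour is available in the standard library through the "Z*" pack code, which gives a quick way to check the numbers in this example without BinData.

```ruby
data = "abcd\000efgh"

value     = data.unpack("Z*")[0]   # bytes up to the first zero byte
num_bytes = value.bytesize + 1     # the terminator counts towards the size
binary    = [value].pack("Z*")     # "Z*" appends the terminating zero byte
```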
|
610
|
+
|
611
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
612
|
+
|
613
|
+
== Arrays
|
614
|
+
|
615
|
+
A BinData array is a list of data objects of the same type. It behaves
|
616
|
+
much the same as the standard Ruby array, supporting most of the common
|
617
|
+
methods.
|
618
|
+
|
619
|
+
When instantiating an array, the type of object it contains must be
|
620
|
+
specified.
|
621
|
+
|
622
|
+
arr = BinData::Array.new(:type => :uint8)
|
623
|
+
arr[3] = 5
|
624
|
+
arr.snapshot #=> [0, 0, 0, 5]
|
625
|
+
|
626
|
+
Parameters can be passed to this object with a slightly clumsy syntax.
|
627
|
+
|
628
|
+
arr = BinData::Array.new(:type => [:uint8, {:initial_value => :index}])
|
629
|
+
arr[3] = 5
|
630
|
+
arr.snapshot #=> [0, 1, 2, 5]
|
631
|
+
|
632
|
+
There are two different parameters that specify the length of the array.
|
633
|
+
|
634
|
+
:initial_length
|
635
|
+
|
636
|
+
obj = BinData::Array.new(:type => :int8, :initial_length => 4)
|
637
|
+
obj.read("\002\003\004\005\006\007")
|
638
|
+
obj.snapshot #=> [2, 3, 4, 5]
|
639
|
+
|
640
|
+
:read_until
|
641
|
+
|
642
|
+
While reading, elements are read until this condition is true. This is
|
643
|
+
typically used to read an array until a sentinel value is found. The
|
644
|
+
variables +index+, +element+ and +array+ are made available to any lambda
|
645
|
+
assigned to this parameter. If the value of this parameter is the symbol
|
646
|
+
:eof, then the array will read as much data from the stream as possible.
|
647
|
+
|
648
|
+
obj = BinData::Array.new(:type => :int8,
|
649
|
+
:read_until => lambda { index == 1 })
|
650
|
+
obj.read("\002\003\004\005\006\007")
|
651
|
+
obj.snapshot #=> [2, 3]
|
652
|
+
|
653
|
+
obj = BinData::Array.new(:type => :int8,
|
654
|
+
:read_until => lambda { element >= 3.5 })
|
655
|
+
obj.read("\002\003\004\005\006\007")
|
656
|
+
obj.snapshot #=> [2, 3, 4]
|
657
|
+
|
658
|
+
obj = BinData::Array.new(:type => :int8,
|
659
|
+
:read_until => lambda { array[index] + array[index - 1] == 9 })
|
660
|
+
obj.read("\002\003\004\005\006\007")
|
661
|
+
obj.snapshot #=> [2, 3, 4, 5]
|
662
|
+
|
663
|
+
obj = BinData::Array.new(:type => :int8, :read_until => :eof)
|
664
|
+
obj.read("\002\003\004\005\006\007")
|
665
|
+
obj.snapshot #=> [2, 3, 4, 5, 6, 7]
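
The :read_until logic amounts to a read loop with a stop condition that is checked after each element. The sketch below models that loop in plain Ruby for int8 elements. The helper name read_int8_until is hypothetical, and as a simplification it stops quietly at end of input, so a block that always returns false plays the role of :eof.

```ruby
require 'stringio'

def read_int8_until(io)
  array = []
  until io.eof?
    element = io.read(1).unpack("c")[0]   # "c" is a signed 8 bit integer
    array << element
    index = array.length - 1
    # The condition sees the element that was just read, so the sentinel
    # itself is included in the result.
    break if yield(index, element, array)
  end
  array
end

data = "\002\003\004\005\006\007"

by_index   = read_int8_until(StringIO.new(data)) { |index, _element, _array| index == 1 }
by_element = read_int8_until(StringIO.new(data)) { |_index, element, _array| element >= 3.5 }
by_pair    = read_int8_until(StringIO.new(data)) { |index, _element, array| array[index] + array[index - 1] == 9 }
to_eof     = read_int8_until(StringIO.new(data)) { false }
```

The four results match the four snapshots shown above.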
|
666
|
+
|
667
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
668
|
+
|
669
|
+
== Offset checking / adjustment
|
670
|
+
|
671
|
+
TODO
|
672
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
673
|
+
|
674
|
+
== Choices
|
675
|
+
|
676
|
+
A Choice is a collection of data objects of which only one is active at any
|
677
|
+
particular time. Method calls will be delegated to the active choice.
|
678
|
+
The possible types of objects that a choice contains are controlled by the
|
679
|
+
:choices parameter, while the :selection parameter specifies the active
|
680
|
+
choice.
|
681
|
+
|
682
|
+
:choices
|
683
|
+
|
684
|
+
Either an array or a hash specifying the possible data objects. The
|
685
|
+
format of the array/hash.values is a list of symbols representing the
|
686
|
+
data object type. If a choice is to have params passed to it, then it
|
687
|
+
should be provided as [type_symbol, hash_params]. An implementation
|
688
|
+
constraint is that the hash may not contain symbols as keys.
|
689
|
+
|
690
|
+
:selection
|
691
|
+
|
692
|
+
An index/key into the :choices array/hash which specifies the currently
|
693
|
+
active choice.
|
694
|
+
|
695
|
+
:copy_on_change
|
696
|
+
|
697
|
+
If set to true, copy the value of the previous selection to the current
|
698
|
+
selection whenever the selection changes. Default is false.
|
699
|
+
|
700
|
+
Examples
|
701
|
+
|
702
|
+
type1 = [:string, {:value => "Type1"}]
|
703
|
+
type2 = [:string, {:value => "Type2"}]
|
704
|
+
|
705
|
+
choices = {5 => type1, 17 => type2}
|
706
|
+
obj = BinData::Choice.new(:choices => choices, :selection => 5)
|
707
|
+
obj.value # => "Type1"
|
708
|
+
|
709
|
+
choices = [ type1, type2 ]
|
710
|
+
obj = BinData::Choice.new(:choices => choices, :selection => 1)
|
711
|
+
obj.value # => "Type2"
|
712
|
+
|
713
|
+
choices = [ nil, nil, nil, type1, nil, type2 ]
|
714
|
+
obj = BinData::Choice.new(:choices => choices, :selection => 3)
|
715
|
+
obj.value # => "Type1"
|
716
|
+
|
717
|
+
class MyNumber < BinData::Record
|
718
|
+
int8 :is_big_endian
|
719
|
+
choice :data, :choices => { true => :int32be, false => :int32le },
|
720
|
+
:selection => lambda { is_big_endian != 0 },
|
721
|
+
:copy_on_change => true
|
722
|
+
end
|
723
|
+
|
724
|
+
obj = MyNumber.new
|
725
|
+
obj.is_big_endian = 1
|
726
|
+
obj.data = 5
|
727
|
+
obj.to_binary_s #=> "\001\000\000\000\005"
|
728
|
+
|
729
|
+
obj.is_big_endian = 0
|
730
|
+
obj.to_binary_s #=> "\000\005\000\000\000"
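
The byte strings in the MyNumber example can be reproduced by selecting a pack format from a hash, which is roughly what the :choices and :selection pair expresses. The helper name my_number_to_binary is hypothetical and the sketch uses only the standard library.

```ruby
def my_number_to_binary(is_big_endian, data)
  # Pack format for each possible selection: "l>" is a signed 32 bit big
  # endian integer and "l<" is the little endian equivalent.
  choices   = { true => "l>", false => "l<" }
  selection = (is_big_endian != 0)
  [is_big_endian].pack("c") + [data].pack(choices[selection])
end

big    = my_number_to_binary(1, 5).bytes
little = my_number_to_binary(0, 5).bytes
```

Because the value 5 is passed in both calls, the sketch also reflects the effect of :copy_on_change, where the value carries over when the selection changes.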
|
731
|
+
|
732
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
733
|
+
|
734
|
+
== Wrappers
|
735
|
+
|
736
|
+
Sometimes you wish to create a new type that is simply an existing type
|
737
|
+
with some predefined parameters. Examples could be an array with a
|
738
|
+
specified type, or an integer with an initial value.
|
739
|
+
|
740
|
+
This can be achieved with a wrapper. A wrapper creates a new type based on
|
741
|
+
an existing type which has predefined parameters. These parameters can of
|
742
|
+
course be overridden at initialisation time.
|
743
|
+
Here we define an array that contains big endian 16 bit integers. The
|
744
|
+
array has a preferred initial length.
|
745
|
+
|
746
|
+
class IntArray < BinData::Wrapper
|
747
|
+
endian :big
|
748
|
+
array :type => :uint16, :initial_length => 5
|
749
|
+
end
|
750
|
+
|
751
|
+
arr = IntArray.new
|
752
|
+
arr.size #=> 5
|
753
|
+
|
754
|
+
The initial length can be overridden at initialisation time.
|
755
|
+
|
756
|
+
arr = IntArray.new(:initial_length => 8)
|
757
|
+
arr.size #=> 8
|
758
|
+
|
759
|
+
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
760
|
+
|
761
|
+
== Parameterizing User Defined Types
|
762
|
+
|
763
|
+
All BinData types have parameters that allow the behaviour of an object to
|
764
|
+
be specified at initialization time. User defined types may also specify
|
765
|
+
parameters. There are two types of parameters - mandatory and default.
|
766
|
+
|
767
|
+
=== Mandatory Parameters
|
768
|
+
|
769
|
+
Mandatory parameters must be specified when creating an instance of the
|
770
|
+
type. The :type parameter of Array is an example of a mandatory parameter.
|
771
|
+
|
772
|
+
class IntArray < BinData::Wrapper
|
773
|
+
mandatory_parameter :half_count
|
774
|
+
|
775
|
+
array :type => :uint8, :initial_length => lambda { half_count * 2 }
|
776
|
+
end
|
777
|
+
|
778
|
+
arr = IntArray.new #=> raises ArgumentError: parameter 'half_count' must
|
779
|
+
be specified in IntArray
|
780
|
+
|
781
|
+
arr = IntArray.new(:half_count => lambda { 1 + 2 })
|
782
|
+
arr.snapshot #=> [0, 0, 0, 0, 0, 0]
|
783
|
+
|
=== Default Parameters

Default parameters are optional. These parameters have a default value
that may be overridden when an instance of the type is created.

    class Phrase < BinData::Primitive
      default_parameter :number => "three"
      default_parameter :adjective => "blind"
      default_parameter :noun => "mice"

      stringz :a, :initial_value => :number
      stringz :b, :initial_value => :adjective
      stringz :c, :initial_value => :noun

      def get; "#{a} #{b} #{c}"; end
      def set(v)
        if /(.*) (.*) (.*)/ =~ v
          self.a, self.b, self.c = $1, $2, $3
        end
      end
    end

    obj = Phrase.new(:number => "two", :adjective => "deaf")
    obj.to_s #=> "two deaf mice"

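To make the difference between the two kinds of parameter concrete, here is
a minimal sketch in plain Ruby of how mandatory and default parameters could
behave. ParamSketch, Widget and #param are hypothetical names invented for
this illustration. This is not how BinData implements parameters; it only
mimics the behaviour described above.

```ruby
# Hypothetical mini-DSL, written only to illustrate the idea.
# BinData's real implementation is different.
class ParamSketch
  # Class-level declarations, stored per subclass.
  def self.mandatory_parameter(*names)
    mandatory.concat(names)
  end

  def self.default_parameter(pairs)
    defaults.merge!(pairs)
  end

  def self.mandatory; @mandatory ||= []; end
  def self.defaults;  @defaults  ||= {}; end

  def initialize(params = {})
    # A mandatory parameter that was not supplied is an error.
    missing = self.class.mandatory - params.keys
    unless missing.empty?
      raise ArgumentError,
            "parameter '#{missing.first}' must be specified in #{self.class}"
    end

    # Supplied values override the declared defaults.
    @params = self.class.defaults.merge(params)
  end

  # Lambdas are evaluated lazily, when the value is asked for.
  def param(name)
    value = @params[name]
    value.respond_to?(:call) ? value.call : value
  end
end

class Widget < ParamSketch
  mandatory_parameter :half_count
  default_parameter :noun => "mice"
end

w = Widget.new(:half_count => lambda { 1 + 2 })
w.param(:half_count) * 2   #=> 6
w.param(:noun)             #=> "mice"

w = Widget.new(:half_count => 1, :noun => "rats")
w.param(:noun)             #=> "rats"
```

Calling Widget.new with no arguments raises ArgumentError in this sketch,
mirroring the IntArray example.
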
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

== User Defined Primitive Types

Most user defined types will be Records, but occasionally we'd like to
create a custom type of primitive.

Let us revisit the Pascal String example.

    class PascalString < BinData::Record
      uint8 :len, :value => lambda { data.length }
      string :data, :read_length => :len
    end

We'd like to make PascalString a user defined type that behaves like a
BinData::BasePrimitive object so we can use :initial_value etc. Here's an
example usage of what we'd like:

    class Favourites < BinData::Record
      pascal_string :language, :initial_value => "ruby"
      pascal_string :os, :initial_value => "unix"
    end

    f = Favourites.new
    f.os = "freebsd"
    f.to_binary_s #=> "\004ruby\007freebsd"

We create this type of custom string by inheriting from BinData::Primitive
(instead of BinData::Record) and implementing the #get and #set methods.

    class PascalString < BinData::Primitive
      uint8 :len, :value => lambda { data.length }
      string :data, :read_length => :len

      def get; self.data; end
      def set(v) self.data = v; end
    end

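The byte layout behind PascalString can be checked without BinData. The
following sketch assumes only what the examples above show: one length byte
followed by that many bytes of data. The helper names write_pascal_string
and read_pascal_string are made up for this illustration.

```ruby
require "stringio"

# Hypothetical helpers, not part of BinData.
def write_pascal_string(str)
  [str.bytesize, str].pack("Ca*")   # one length byte, then the raw bytes
end

def read_pascal_string(io)
  len = io.readbyte                 # the length byte
  io.read(len)                      # exactly that many bytes of data
end

wire = write_pascal_string("ruby") + write_pascal_string("freebsd")
wire == "\004ruby\007freebsd"       #=> true

io = StringIO.new(wire)
read_pascal_string(io)              #=> "ruby"
read_pascal_string(io)              #=> "freebsd"
```

Because the length is stored in a single byte, this layout can only hold
strings of up to 255 bytes.
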
=== Advanced User Defined Primitive Types

Sometimes a user defined primitive type cannot easily be defined
declaratively. In this case you should inherit from BinData::BasePrimitive
and implement the following three methods:

* value_to_binary_string(value)
* read_and_return_value(io)
* sensible_default()

    # A custom big integer format. Binary format is:
    #   1 byte  : 0 for positive, nonzero for negative
    #   x bytes : Little endian stream of 7 bit bytes representing the
    #             positive form of the integer. The upper bit of each byte
    #             is set when there are more bytes in the stream.
    class BigInteger < BinData::BasePrimitive
      # Converts an integer into its binary string representation.
      def value_to_binary_string(value)
        negative = (value < 0) ? 1 : 0
        value = value.abs
        bytes = [negative]
        loop do
          seven_bit_byte = value & 0x7f
          value >>= 7
          has_more = value.nonzero? ? 0x80 : 0
          byte = has_more | seven_bit_byte
          bytes.push(byte)

          break if has_more.zero?
        end

        bytes.inject("") { |str, b| str << b.chr }
      end

      # Reads the sign byte and the 7 bit groups from io and returns
      # the integer they represent.
      def read_and_return_value(io)
        negative = read_uint8(io).nonzero?
        value = 0
        bit_shift = 0
        loop do
          byte = read_uint8(io)
          has_more = byte & 0x80
          seven_bit_byte = byte & 0x7f
          value |= seven_bit_byte << bit_shift
          bit_shift += 7

          break if has_more.zero?
        end

        negative ? -value : value
      end

      # The value used when none has been assigned.
      def sensible_default
        0
      end

      # Helper: reads a single unsigned byte from io.
      def read_uint8(io)
        io.readbytes(1).unpack("C").at(0)
      end
    end

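The format can also be exercised outside BinData, which is a convenient way
to check the encoding by hand. The sketch below repeats the same two
algorithms as module functions. BigIntegerSketch is a hypothetical name, and
StringIO#readbyte stands in for the io.readbytes(1) call used above.

```ruby
require "stringio"

# Hypothetical stand-alone helpers, not part of BinData. They mirror the
# format above: one sign byte, then little endian 7 bit groups whose upper
# bit is set when more groups follow.
module BigIntegerSketch
  def self.encode(value)
    bytes = [value < 0 ? 1 : 0]
    value = value.abs
    loop do
      seven_bits = value & 0x7f
      value >>= 7
      has_more = value.zero? ? 0 : 0x80
      bytes << (has_more | seven_bits)
      break if has_more.zero?
    end
    bytes.pack("C*")
  end

  def self.decode(io)
    negative = io.readbyte.nonzero?
    value = 0
    shift = 0
    loop do
      byte = io.readbyte
      value |= (byte & 0x7f) << shift
      shift += 7
      break if (byte & 0x80).zero?
    end
    negative ? -value : value
  end
end

encoded = BigIntegerSketch.encode(-300)
encoded.bytes                                   #=> [1, 172, 2]
BigIntegerSketch.decode(StringIO.new(encoded))  #=> -300
```

Here 300 is split into the 7 bit groups 44 and 2. The first group has its
upper bit set (44 | 0x80 = 172) because another group follows.
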
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

== Debugging

TODO
=== Tracing

    class A < BinData::Record
      int8 :a
      bit4 :b
      bit2 :c
      array :d, :initial_length => 6, :type => :bit1
    end

    BinData::trace_reading do
      A.read("\373\225\220")
    end

    obj.a => -5
    obj.b => 9
    obj.c => 1
    obj.d[0] => 0
    obj.d[1] => 1
    obj.d[2] => 1
    obj.d[3] => 0
    obj.d[4] => 0
    obj.d[5] => 1

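The traced values can be cross-checked by decoding the same three bytes by
hand. This sketch assumes that bit4, bit2 and bit1 are read most significant
bit first, which is consistent with the values shown above. It uses only
String#unpack and does not need BinData.

```ruby
# The same three bytes that are passed to A.read above: 0xFB 0x95 0x90.
data = "\373\225\220".b

# "c" reads one signed 8 bit integer; "B16" reads the next 16 bits as a
# string of "0" and "1" characters, most significant bit first.
a, bits = data.unpack("cB16")

b = bits[0, 4].to_i(2)            # the next 4 bits
c = bits[4, 2].to_i(2)            # the next 2 bits
d = bits[6, 6].chars.map(&:to_i)  # six 1 bit values

p a  #=> -5
p b  #=> 9
p c  #=> 1
p d  #=> [0, 1, 1, 0, 0, 1]
```

The last four bits of the third byte are not consumed by any field.
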
=== Rest

=== Hidden fields

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

== Comparison

TODO
http://github.com/marcandre/packable/tree/master
http://metafuzz.rubyforge.org/binstruct/
http://rubyforge.org/projects/bitpack/
http://binaryparse.rubyforge.org/
http://redshift.sourceforge.net/bit-struct/

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-