jbangert-bindata 1.5.0

Files changed (71)
  1. data/.gitignore +1 -0
  2. data/BSDL +22 -0
  3. data/COPYING +52 -0
  4. data/ChangeLog.rdoc +204 -0
  5. data/Gemfile +2 -0
  6. data/INSTALL +11 -0
  7. data/NEWS.rdoc +164 -0
  8. data/README.md +54 -0
  9. data/Rakefile +13 -0
  10. data/bindata.gemspec +31 -0
  11. data/doc/manual.haml +407 -0
  12. data/doc/manual.md +1649 -0
  13. data/examples/NBT.txt +149 -0
  14. data/examples/gzip.rb +161 -0
  15. data/examples/ip_address.rb +22 -0
  16. data/examples/list.rb +124 -0
  17. data/examples/nbt.rb +178 -0
  18. data/lib/bindata.rb +33 -0
  19. data/lib/bindata/alignment.rb +83 -0
  20. data/lib/bindata/array.rb +335 -0
  21. data/lib/bindata/base.rb +388 -0
  22. data/lib/bindata/base_primitive.rb +214 -0
  23. data/lib/bindata/bits.rb +87 -0
  24. data/lib/bindata/choice.rb +216 -0
  25. data/lib/bindata/count_bytes_remaining.rb +35 -0
  26. data/lib/bindata/deprecated.rb +50 -0
  27. data/lib/bindata/dsl.rb +312 -0
  28. data/lib/bindata/float.rb +80 -0
  29. data/lib/bindata/int.rb +184 -0
  30. data/lib/bindata/io.rb +274 -0
  31. data/lib/bindata/lazy.rb +105 -0
  32. data/lib/bindata/offset.rb +91 -0
  33. data/lib/bindata/params.rb +135 -0
  34. data/lib/bindata/primitive.rb +135 -0
  35. data/lib/bindata/record.rb +110 -0
  36. data/lib/bindata/registry.rb +92 -0
  37. data/lib/bindata/rest.rb +35 -0
  38. data/lib/bindata/sanitize.rb +290 -0
  39. data/lib/bindata/skip.rb +48 -0
  40. data/lib/bindata/string.rb +145 -0
  41. data/lib/bindata/stringz.rb +96 -0
  42. data/lib/bindata/struct.rb +388 -0
  43. data/lib/bindata/trace.rb +94 -0
  44. data/lib/bindata/version.rb +3 -0
  45. data/setup.rb +1585 -0
  46. data/spec/alignment_spec.rb +61 -0
  47. data/spec/array_spec.rb +331 -0
  48. data/spec/base_primitive_spec.rb +238 -0
  49. data/spec/base_spec.rb +376 -0
  50. data/spec/bits_spec.rb +163 -0
  51. data/spec/choice_spec.rb +263 -0
  52. data/spec/count_bytes_remaining_spec.rb +38 -0
  53. data/spec/deprecated_spec.rb +31 -0
  54. data/spec/example.rb +21 -0
  55. data/spec/float_spec.rb +37 -0
  56. data/spec/int_spec.rb +216 -0
  57. data/spec/io_spec.rb +352 -0
  58. data/spec/lazy_spec.rb +217 -0
  59. data/spec/primitive_spec.rb +202 -0
  60. data/spec/record_spec.rb +530 -0
  61. data/spec/registry_spec.rb +108 -0
  62. data/spec/rest_spec.rb +26 -0
  63. data/spec/skip_spec.rb +27 -0
  64. data/spec/spec_common.rb +58 -0
  65. data/spec/string_spec.rb +300 -0
  66. data/spec/stringz_spec.rb +118 -0
  67. data/spec/struct_spec.rb +350 -0
  68. data/spec/system_spec.rb +380 -0
  69. data/tasks/manual.rake +36 -0
  70. data/tasks/rspec.rake +17 -0
  71. metadata +208 -0
@@ -0,0 +1,149 @@
1
+ Named Binary Tag specification
2
+
3
+ NBT (Named Binary Tag) is a tag-based binary format designed to carry large amounts of binary data with smaller amounts of additional data.
4
+ An NBT file consists of a single GZIPped Named Tag of type TAG_Compound.
5
+
6
+ A Named Tag has the following format:
7
+
8
+ byte tagType
9
+ TAG_String name
10
+ [payload]
11
+
12
+ The tagType is a single byte defining the contents of the payload of the tag.
13
+
14
+ The name is a descriptive name, and can be anything (e.g. "cat", "banana", "Hello World!"). It has nothing to do with the tagType.
15
+ The purpose for this name is to name tags so parsing is easier and can be made to only look for certain recognized tag names.
16
+ Exception: If tagType is TAG_End, the name is skipped and assumed to be "".
17
+
18
+ The [payload] varies by tagType.
19
+
20
+ Note that ONLY Named Tags carry the name and tagType data. Explicitly identified Tags (such as TAG_String above) contain only the payload.
21
+
22
+
23
+ The tag types and respective payloads are:
24
+
25
+ TYPE: 0 NAME: TAG_End
26
+ Payload: None.
27
+ Note: This tag is used to mark the end of a TAG_Compound's list of Named Tags.
28
+ Cannot be named! If type 0 appears where a Named Tag is expected, the name is assumed to be "".
29
+ (In other words, this Tag is always just a single 0 byte when named, and nothing in all other cases)
30
+
31
+ TYPE: 1 NAME: TAG_Byte
32
+ Payload: A single signed byte (8 bits)
33
+
34
+ TYPE: 2 NAME: TAG_Short
35
+ Payload: A signed short (16 bits, big endian)
36
+
37
+ TYPE: 3 NAME: TAG_Int
38
+ Payload: A signed int (32 bits, big endian)
39
+
40
+ TYPE: 4 NAME: TAG_Long
41
+ Payload: A signed long (64 bits, big endian)
42
+
43
+ TYPE: 5 NAME: TAG_Float
44
+ Payload: A floating point value (32 bits, big endian, IEEE 754-2008, binary32)
45
+
46
+ TYPE: 6 NAME: TAG_Double
47
+ Payload: A floating point value (64 bits, big endian, IEEE 754-2008, binary64)
48
+
49
+ TYPE: 7 NAME: TAG_Byte_Array
50
+ Payload: TAG_Int length
51
+ An array of bytes of unspecified format. The length of this array is <length> bytes
52
+
53
+ TYPE: 8 NAME: TAG_String
54
+ Payload: TAG_Short length
55
+ An array of bytes defining a string in UTF-8 format. The length of this array is <length> bytes
56
+
57
+ TYPE: 9 NAME: TAG_List
58
+ Payload: TAG_Byte tagId
59
+ TAG_Int length
60
+ A sequential list of Tags (not Named Tags), of type <tagId>. The length of this array is <length> Tags
61
+ Notes: All tags share the same type.
62
+
63
+ TYPE: 10 NAME: TAG_Compound
64
+ Payload: A sequential list of Named Tags. This array keeps going until a TAG_End is found.
65
+ TAG_End end
66
+ Notes: If there's a nested TAG_Compound within this tag, that one will also have a TAG_End, so simply reading until the next TAG_End will not work.
67
+ The names of the named tags have to be unique within each TAG_Compound
68
+ The order of the tags is not guaranteed.
69
+
70
+
71
+
72
+
73
+
74
+ Decoding example:
75
+ (Use http://www.minecraft.net/docs/test.nbt to test your implementation)
76
+
77
+
78
+ First we start by reading a Named Tag.
79
+ After unzipping the stream, the first byte is a 10. That means the tag is a TAG_Compound (as expected by the specification).
80
+
81
+ The next two bytes are 0 and 11, meaning the name string is 11 bytes of UTF-8 text. In this case, they happen to be "hello world".
82
+ That means our root tag is named "hello world". We can now move on to the payload.
83
+
84
+ From the specification, we see that TAG_Compound consists of a series of Named Tags, so we read another byte to find the tagType.
85
+ It happens to be an 8. The name is 4 letters long, and happens to be "name". Type 8 is TAG_String, meaning we read another two bytes to get the length,
86
+ then read that many bytes to get the contents. In this case, it's "Bananrama".
87
+
88
+ So now we know the TAG_Compound contains a TAG_String named "name" with the content "Bananrama"
89
+
90
+ We move on to reading the next Named Tag, and get a 0. This is TAG_End, which always has an implied name of "". That means that the list of entries
91
+ in the TAG_Compound is over, and indeed all of the NBT file.
92
+
93
+ So we ended up with this:
94
+
95
+ TAG_Compound("hello world"): 1 entries
96
+ {
97
+ TAG_String("name"): Bananrama
98
+ }
99
+
100
+
101
+
102
+ For a slightly longer test, download http://www.minecraft.net/docs/bigtest.nbt
103
+ You should end up with this:
104
+
105
+ TAG_Compound("Level"): 11 entries
106
+ {
107
+ TAG_Short("shortTest"): 32767
108
+ TAG_Long("longTest"): 9223372036854775807
109
+ TAG_Float("floatTest"): 0.49823147
110
+ TAG_String("stringTest"): HELLO WORLD THIS IS A TEST STRING ÅÄÖ!
111
+ TAG_Int("intTest"): 2147483647
112
+ TAG_Compound("nested compound test"): 2 entries
113
+ {
114
+ TAG_Compound("ham"): 2 entries
115
+ {
116
+ TAG_String("name"): Hampus
117
+ TAG_Float("value"): 0.75
118
+ }
119
+ TAG_Compound("egg"): 2 entries
120
+ {
121
+ TAG_String("name"): Eggbert
122
+ TAG_Float("value"): 0.5
123
+ }
124
+ }
125
+ TAG_List("listTest (long)"): 5 entries of type TAG_Long
126
+ {
127
+ TAG_Long: 11
128
+ TAG_Long: 12
129
+ TAG_Long: 13
130
+ TAG_Long: 14
131
+ TAG_Long: 15
132
+ }
133
+ TAG_Byte("byteTest"): 127
134
+ TAG_List("listTest (compound)"): 2 entries of type TAG_Compound
135
+ {
136
+ TAG_Compound: 2 entries
137
+ {
138
+ TAG_String("name"): Compound tag #0
139
+ TAG_Long("created-on"): 1264099775885
140
+ }
141
+ TAG_Compound: 2 entries
142
+ {
143
+ TAG_String("name"): Compound tag #1
144
+ TAG_Long("created-on"): 1264099775885
145
+ }
146
+ }
147
+ TAG_Byte_Array("byteArrayTest (the first 1000 values of (n*n*255+n*7)%100, starting with n=0 (0, 62, 34, 16, 8, ...))"): [1000 bytes]
148
+ TAG_Double("doubleTest"): 0.4931287132182315
149
+ }
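The decoding walk-through in NBT.txt can be sketched in plain Ruby. This is a minimal illustration, not the BinData-based reader from examples/nbt.rb: the helper names (`read_string`, `read_payload`, `read_named_tag`, `nbt_string`) are hypothetical, only a handful of tag types are handled, and the input is assumed to be already unzipped. The test bytes are built by hand from the walk-through rather than downloaded.

```ruby
require 'stringio'

# Subset of the tag types listed in the specification.
TAG_END      = 0
TAG_BYTE     = 1
TAG_SHORT    = 2
TAG_INT      = 3
TAG_STRING   = 8
TAG_COMPOUND = 10

# TAG_String payload: big-endian 16-bit length, then that many UTF-8 bytes.
def read_string(io)
  len = io.read(2).unpack('n').first
  io.read(len).force_encoding('UTF-8')
end

# Reads the payload that belongs to the given tag type.
def read_payload(io, type)
  case type
  when TAG_BYTE   then io.read(1).unpack('c').first
  when TAG_SHORT  then io.read(2).unpack('s>').first
  when TAG_INT    then io.read(4).unpack('l>').first
  when TAG_STRING then read_string(io)
  when TAG_COMPOUND
    entries = {}
    loop do
      name, value = read_named_tag(io)
      break if name.nil? # TAG_End closes this compound only
      entries[name] = value
    end
    entries
  else
    raise ArgumentError, "unsupported tag type #{type}"
  end
end

# Named Tag: tagType byte, TAG_String name, payload.
# Returns [nil, nil] for TAG_End, which carries no name and no payload.
def read_named_tag(io)
  type = io.read(1).unpack('C').first
  return [nil, nil] if type == TAG_END
  name = read_string(io)
  [name, read_payload(io, type)]
end

# Hypothetical helper that encodes a TAG_String payload for the test data.
def nbt_string(str)
  [str.bytesize].pack('n') + str.b
end

# The uncompressed bytes described in the decoding example.
data = [TAG_COMPOUND].pack('C') + nbt_string('hello world') +
       [TAG_STRING].pack('C') + nbt_string('name') + nbt_string('Bananrama') +
       [TAG_END].pack('C')

name, value = read_named_tag(StringIO.new(data))
p name
p value
```

Because each nested TAG_Compound consumes its own TAG_End through the recursive call, the reader avoids the "read until the next TAG_End" trap noted in the specification.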
@@ -0,0 +1,161 @@
1
+ require 'bindata'
2
+ require 'forwardable'
3
+
4
+ # An example of a reader / writer for the GZIP file format as per rfc1952.
5
+ # Note that compression is not implemented to keep the example small.
6
+ class Gzip
7
+ extend Forwardable
8
+
9
+ # Known compression methods
10
+ DEFLATE = 8
11
+
12
+ class Extra < BinData::Record
13
+ endian :little
14
+
15
+ uint16 :len, :value => lambda { data.length }
16
+ string :data, :read_length => :len
17
+ end
18
+
19
+ class Header < BinData::Record
20
+ endian :little
21
+
22
+ uint16 :ident, :value => 0x8b1f, :check_value => 0x8b1f
23
+ uint8 :compression_method, :initial_value => DEFLATE
24
+
25
+ bit3 :freserved, :value => 0, :check_value => 0
26
+ bit1 :fcomment, :value => lambda { comment.length > 0 ? 1 : 0 }
27
+ bit1 :ffile_name, :value => lambda { file_name.length > 0 ? 1 : 0 }
28
+ bit1 :fextra, :value => lambda { extra.len > 0 ? 1 : 0 }
29
+ bit1 :fcrc16, :value => 0 # see comment below
30
+ bit1 :ftext
31
+
32
+ # Never include header crc. This is because the current versions of the
33
+ # command-line version of gzip (up through version 1.3.x) do not
34
+ # support header crc's, and will report that it is a "multi-part gzip
35
+ # file" and give up.
36
+
37
+ uint32 :mtime
38
+ uint8 :extra_flags
39
+ uint8 :os, :initial_value => 255 # unknown OS
40
+
41
+ # These fields are optional depending on the bits in flags
42
+ extra :extra, :onlyif => lambda { fextra.nonzero? }
43
+ stringz :file_name, :onlyif => lambda { ffile_name.nonzero? }
44
+ stringz :comment, :onlyif => lambda { fcomment.nonzero? }
45
+ uint16 :crc16, :onlyif => lambda { fcrc16.nonzero? }
46
+ end
47
+
48
+ class Footer < BinData::Record
49
+ endian :little
50
+
51
+ uint32 :crc32
52
+ uint32 :uncompressed_size
53
+ end
54
+
55
+ def initialize
56
+ @header = Header.new
57
+ @footer = Footer.new
58
+ end
59
+
60
+ attr_accessor :compressed
61
+ def_delegators :@header, :file_name=, :file_name
62
+ def_delegators :@header, :comment=, :comment
63
+ def_delegators :@header, :compression_method
64
+ def_delegators :@footer, :crc32, :uncompressed_size
65
+
66
+ def mtime
67
+ Time.at(@header.mtime.snapshot)
68
+ end
69
+
70
+ def mtime=(tm)
71
+ @header.mtime = tm.to_i
72
+ end
73
+
74
+ def total_size
75
+ @header.num_bytes + @compressed.size + @footer.num_bytes
76
+ end
77
+
78
+ def compressed_data
79
+ @compressed
80
+ end
81
+
82
+ def set_compressed_data(compressed, crc32, uncompressed_size)
83
+ @compressed = compressed
84
+ @footer.crc32 = crc32
85
+ @footer.uncompressed_size = uncompressed_size
86
+ end
87
+
88
+ def read(file_name)
89
+ File.open(file_name, "rb") do |io|
90
+ @header.read(io)
91
+
92
+ # Determine the size of the compressed data. This is needed because
93
+ # we don't actually uncompress the data. Ideally the uncompression
94
+ # method would read the correct number of bytes from the IO and the
95
+ # IO would be positioned ready to read the footer.
96
+
97
+ pos = io.pos
98
+ io.seek(-@footer.num_bytes, IO::SEEK_END)
99
+ compressed_size = io.pos - pos
100
+ io.seek(pos)
101
+
102
+ @compressed = io.read(compressed_size)
103
+ @footer.read(io)
104
+ end
105
+ end
106
+
107
+ def write(file_name)
108
+ File.open(file_name, "wb") do |io|
109
+ @header.write(io)
110
+ io.write(@compressed)
111
+ @footer.write(io)
112
+ end
113
+ end
114
+ end
115
+
116
+ if __FILE__ == $0
117
+ # Write a gzip file.
118
+ print "Creating a gzip file ... "
119
+ g = Gzip.new
120
+ # Uncompressed data is "the cat sat on the mat"
121
+ g.set_compressed_data("+\311HUHN,Q(\006\342\374<\205\022 77\261\004\000",
122
+ 3464689835, 22)
123
+ g.file_name = "poetry"
124
+ g.mtime = Time.now
125
+ g.comment = "A stunning piece of prose"
126
+ g.write("poetry.gz")
127
+ puts "done."
128
+ puts
129
+
130
+ # Read the created gzip file.
131
+ print "Reading newly created gzip file ... "
132
+ g = Gzip.new
133
+ g.read("poetry.gz")
134
+ puts "done."
135
+ puts
136
+
137
+ puts "Printing gzip file details in the format of gzip -l -v"
138
+
139
+ # compression ratio
140
+ ratio = 100.0 * (g.uncompressed_size - g.compressed.size) /
141
+ g.uncompressed_size
142
+
143
+ comp_meth = (g.compression_method == Gzip::DEFLATE) ? "defla" : ""
144
+
145
+ # Output using the same format as gzip -l -v
146
+ puts "method crc date time compressed " +
147
+ "uncompressed ratio uncompressed_name"
148
+ puts "%5s %08x %6s %5s %19s %19s %5.1f%% %s" % [comp_meth,
149
+ g.crc32,
150
+ g.mtime.strftime('%b %d'),
151
+ g.mtime.strftime('%H:%M'),
152
+ g.total_size,
153
+ g.uncompressed_size,
154
+ ratio,
155
+ g.file_name]
156
+ puts "Comment: #{g.comment}" if g.comment != ""
157
+ puts
158
+
159
+ puts "Executing gzip -l -v"
160
+ puts `gzip -l -v poetry.gz`
161
+ end
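The gzip example passes pre-computed compressed bytes, a CRC-32 and an uncompressed size to `set_compressed_data`, because it deliberately leaves compression out. As a rough sketch, values of that kind could be produced with Ruby's standard `zlib` library. The variable names are illustrative, and the compressed bytes are not guaranteed to match the literal used in the example, since deflate output depends on the compressor settings.

```ruby
require 'zlib'

text = "the cat sat on the mat"

# A gzip member wraps a raw deflate stream (no zlib header or trailer);
# a negative window-bits value asks zlib for that raw form.
deflater   = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
compressed = deflater.deflate(text, Zlib::FINISH)
deflater.close

# The footer fields: CRC-32 and byte length of the uncompressed data.
crc32             = Zlib.crc32(text)
uncompressed_size = text.bytesize

# Round-trip check: inflate the raw stream again.
restored = Zlib::Inflate.new(-Zlib::MAX_WBITS).inflate(compressed)

puts "crc32=#{crc32} size=#{uncompressed_size} roundtrip=#{restored == text}"
```

With the `Gzip` class above, the three values would then be handed over as `g.set_compressed_data(compressed, crc32, uncompressed_size)`.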
@@ -0,0 +1,22 @@
1
+ require 'bindata'
2
+
3
+ # A custom type representing an IP address.
4
+ # The underlying binary representation is a sequence of four octets.
5
+ # The human accessible representation is a dotted quad.
6
+ class IPAddr < BinData::Primitive
7
+ array :octets, :type => :uint8, :initial_length => 4
8
+
9
+ def set(val)
10
+ ints = val.split(/\./).collect { |int| int.to_i }
11
+ self.octets = ints
12
+ end
13
+
14
+ def get
15
+ self.octets.collect { |octet| "%d" % octet }.join(".")
16
+ end
17
+ end
18
+
19
+ ip = IPAddr.new("127.0.0.1")
20
+
21
+ puts "human readable value: #{ip}" #=> 127.0.0.1
22
+ puts "binary representation: #{ip.to_binary_s.inspect}" #=> "\x7F\x00\x00\x01"
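For comparison, the same dotted-quad to four-octet mapping can be sketched without BinData, using `Array#pack` and `String#unpack`. The helper names `ip_to_binary` and `binary_to_ip` are hypothetical and only illustrate what the `set` and `get` methods above translate between.

```ruby
# Converts a dotted quad such as "127.0.0.1" into four raw octets.
def ip_to_binary(dotted)
  octets = dotted.split('.').map { |part| Integer(part, 10) }
  valid  = octets.length == 4 && octets.all? { |o| (0..255).cover?(o) }
  raise ArgumentError, 'expected four octets in 0..255' unless valid
  octets.pack('C4') # four unsigned 8-bit integers
end

# Converts four raw octets back into a dotted quad.
def binary_to_ip(bytes)
  bytes.unpack('C4').join('.')
end

raw = ip_to_binary('127.0.0.1')
p raw.bytes          # the four octets as integers
puts binary_to_ip(raw)
```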
@@ -0,0 +1,124 @@
1
+ require 'bindata'
2
+
3
+ # An example of a recursively defined data format.
4
+ #
5
+ # This example format describes atoms and lists.
6
+ # It is recursive because lists can contain other lists.
7
+ #
8
+ # Atoms - contain a single integer
9
+ # Lists - contain a mixture of atoms and lists
10
+ #
11
+ # The binary representation is:
12
+ #
13
+ # Atoms - A single byte 'a' followed by an int32 containing the value.
14
+ # Lists - A single byte 'l' followed by an int32 denoting the number of
15
+ # items in the list. This is followed by all the items in the list.
16
+ #
17
+ # All integers are big endian.
18
+ #
19
+ #
20
+ # A first attempt at a declaration would be:
21
+ #
22
+ # class Atom < BinData::Record
23
+ # string :tag, :length => 1, :check_value => 'a'
24
+ # int32be :val
25
+ # end
26
+ #
27
+ # class List < BinData::Record
28
+ # string :tag, :length => 1, :check_value => 'l'
29
+ # int32be :num, :value => lambda { vals.length }
30
+ # array :vals, :initial_length => :num do
31
+ # choice :selection => ??? do
32
+ # atom
33
+ # list
34
+ # end
35
+ # end
36
+ # end
37
+ #
38
+ # Notice how we get stuck on attempting to write a declaration for
39
+ # the contents of the list. We can't determine if the list item is
40
+ # an atom or list because we haven't read it yet. It appears that
41
+ # we can't proceed.
42
+ #
43
+ # The cause of the problem is that the tag identifying the type is
44
+ # coupled with that type.
45
+ #
46
+ # The solution is to decouple the tag from the type. We introduce a
47
+ # new type 'Term' that is a thin container around the tag plus the
48
+ # type (atom or list).
49
+ #
50
+ # The declaration then becomes:
51
+ #
52
+ # class Term < BinData::Record; end # forward declaration
53
+ #
54
+ # class Atom < BinData::Int32be
55
+ # end
56
+ #
57
+ # class List < BinData::Record
58
+ # int32be :num, :value => lambda { vals.length }
59
+ # array :vals, :type => :term, :initial_length => :num
60
+ # end
61
+ #
62
+ # class Term < BinData::Record
63
+ # string :tag, :length => 1
64
+ # choice :term, :selection => :tag do
65
+ # atom 'a'
66
+ # list 'l'
67
+ # end
68
+ # end
69
+
70
+
71
+ class Term < BinData::Record; end # Forward declaration
72
+
73
+ class Atom < BinData::Int32be
74
+ def decode
75
+ snapshot
76
+ end
77
+
78
+ def self.encode(val)
79
+ Atom.new(val)
80
+ end
81
+ end
82
+
83
+ class List < BinData::Record
84
+ int32be :num, :value => lambda { vals.length }
85
+ array :vals, :initial_length => :num, :type => :term
86
+
87
+ def decode
88
+ vals.collect { |v| v.decode }
89
+ end
90
+
91
+ def self.encode(val)
92
+ List.new(:vals => val.collect { |v| Term.encode(v) })
93
+ end
94
+ end
95
+
96
+ class Term < BinData::Record
97
+ string :tag, :length => 1
98
+ choice :term, :selection => :tag do
99
+ atom 'a'
100
+ list 'l'
101
+ end
102
+
103
+ def decode
104
+ term.decode
105
+ end
106
+
107
+ def self.encode(val)
108
+ if Integer === val
109
+ Term.new(:tag => 'a', :term => Atom.encode(val))
110
+ else
111
+ Term.new(:tag => 'l', :term => List.encode(val))
112
+ end
113
+ end
114
+ end
115
+
116
+
117
+ puts "A single Atom"
118
+ p Term.encode(4)
119
+ p Term.encode(4).decode
120
+ puts
121
+
122
+ puts "A nested List"
123
+ p Term.encode([1, [2, 3], 4])
124
+ p Term.encode([1, [2, 3], 4]).decode
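The wire format described in the comments of list.rb (a one-byte tag, then a big-endian int32, then any list items) can also be sketched by hand with `pack` and `unpack`. This is an illustrative stand-in with hypothetical `encode_term` and `decode_term` helpers, not the BinData declaration itself; it shows why reading the tag first and then dispatching on it resolves the recursion.

```ruby
require 'stringio'

# Encodes an Integer as an atom ('a' + int32be) and an Array as a
# list ('l' + int32be item count + encoded items), recursively.
def encode_term(val)
  if val.is_a?(Integer)
    'a'.b + [val].pack('l>')
  else
    'l'.b + [val.length].pack('l>') + val.map { |v| encode_term(v) }.join
  end
end

# Reads the tag byte first, then decides how to read the rest.
def decode_term(io)
  tag = io.read(1)
  case tag
  when 'a'
    io.read(4).unpack('l>').first
  when 'l'
    count = io.read(4).unpack('l>').first
    Array.new(count) { decode_term(io) }
  else
    raise ArgumentError, "unknown tag #{tag.inspect}"
  end
end

bytes = encode_term([1, [2, 3], 4])
p bytes.bytes
p decode_term(StringIO.new(bytes))
```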