messagepack 1.0.0
- checksums.yaml +7 -0
- data/README.adoc +773 -0
- data/Rakefile +8 -0
- data/docs/Gemfile +7 -0
- data/docs/README.md +85 -0
- data/docs/_config.yml +137 -0
- data/docs/_guides/index.adoc +14 -0
- data/docs/_guides/io-streaming.adoc +226 -0
- data/docs/_guides/migration.adoc +218 -0
- data/docs/_guides/performance.adoc +189 -0
- data/docs/_pages/buffer.adoc +85 -0
- data/docs/_pages/extension-types.adoc +117 -0
- data/docs/_pages/factory-pattern.adoc +115 -0
- data/docs/_pages/index.adoc +20 -0
- data/docs/_pages/serialization.adoc +159 -0
- data/docs/_pages/streaming.adoc +97 -0
- data/docs/_pages/symbol-extension.adoc +69 -0
- data/docs/_pages/timestamp-extension.adoc +88 -0
- data/docs/_references/api.adoc +360 -0
- data/docs/_references/extensions.adoc +198 -0
- data/docs/_references/format.adoc +301 -0
- data/docs/_references/index.adoc +14 -0
- data/docs/_tutorials/extension-types.adoc +170 -0
- data/docs/_tutorials/getting-started.adoc +165 -0
- data/docs/_tutorials/index.adoc +14 -0
- data/docs/_tutorials/thread-safety.adoc +157 -0
- data/docs/index.adoc +77 -0
- data/docs/lychee.toml +42 -0
- data/lib/messagepack/bigint.rb +131 -0
- data/lib/messagepack/buffer.rb +534 -0
- data/lib/messagepack/core_ext.rb +34 -0
- data/lib/messagepack/error.rb +24 -0
- data/lib/messagepack/extensions/base.rb +55 -0
- data/lib/messagepack/extensions/registry.rb +154 -0
- data/lib/messagepack/extensions/symbol.rb +38 -0
- data/lib/messagepack/extensions/timestamp.rb +110 -0
- data/lib/messagepack/extensions/value.rb +38 -0
- data/lib/messagepack/factory.rb +349 -0
- data/lib/messagepack/format.rb +99 -0
- data/lib/messagepack/packer.rb +702 -0
- data/lib/messagepack/symbol.rb +4 -0
- data/lib/messagepack/time.rb +29 -0
- data/lib/messagepack/timestamp.rb +4 -0
- data/lib/messagepack/unpacker.rb +1418 -0
- data/lib/messagepack/version.rb +5 -0
- data/lib/messagepack.rb +81 -0
- metadata +94 -0
data/README.adoc ADDED
@@ -0,0 +1,773 @@
= MessagePack

image:https://img.shields.io/gem/v/messagepack.svg[RubyGems Version]
image:https://img.shields.io/github/license/lutaml/messagepack.svg[License]
image:https://github.com/lutaml/messagepack/actions/workflows/rake.yml/badge.svg["Build", link="https://github.com/lutaml/messagepack/actions/workflows/rake.yml"]

== Purpose

MessagePack is a pure Ruby implementation of the
https://msgpack.org[MessagePack binary serialization format].

MessagePack is an efficient binary serialization format. Like JSON, it lets you
exchange data among multiple languages, but the encoded output is smaller and
faster to parse.

This implementation provides:

* Pure Ruby implementation (no C extension required)
* Full compatibility with the MessagePack specification
* Support for custom extension types
* Thread-safe factory pattern for packer/unpacker reuse
* Streaming unpacker for incremental parsing
* Comprehensive timestamp support with nanosecond precision

== Features

* link:#core-serialization[Core serialization] - Basic pack and unpack operations
* link:#performance-optimizations[Performance optimizations] - Efficient native type handling and buffer management
* link:#factory-pattern[Factory pattern] - Thread-safe packer/unpacker management
* link:#extension-types[Extension types] - Custom type registration system
* link:#timestamp-extension[Timestamp extension] - Nanosecond precision time handling
* link:#symbol-extension[Symbol extension] - Efficient symbol serialization
* link:#streaming-unpacking[Streaming unpacking] - Incremental data parsing
* link:#buffer-management[Buffer management] - Chunked binary data storage
* link:#implementation-details[Implementation details] - Pure Ruby implementation architecture

== Architecture

.MessagePack serialization architecture
[source]
----
┌───────────────────────────────────────────────────────────┐
│                     User Application                      │
└──────────────────────────┬────────────────────────────────┘
                           │
              ┌────────────┴────────────┐
              │                         │
      ┌───────────────┐        ┌──────────────────┐
      │  MessagePack  │        │ Factory Pattern  │
      │ .pack/unpack  │        │  (thread-safe)   │
      └───────┬───────┘        └────────┬─────────┘
              │                         │
      ┌───────┴──────┐          ┌───────┴──────────┐
      │              │          │                  │
  ┌────────┐  ┌──────────┐   ┌──────────┐  ┌─────────────┐
  │ Packer │  │ Unpacker │   │  Packer  │  │  Unpacker   │
  │        │  │          │   │   Pool   │  │    Pool     │
  └──┬─────┘  └─────┬────┘   └────┬─────┘  └────┬────────┘
     │              │             │             │
     └─────┬────────┘             └──────┬──────┘
           │                             │
     ┌─────┴─────────────────────────────┴────────┐
     │    ┌──────────────────────────────────┐    │
     │    │      BinaryBuffer (chunked)      │    │
     │    │                                  │    │
     │    │   ┌────┬────┬────┬────┬─ ─ ─┐    │    │
     │    │   │ 1  │ 2  │ 3  │ 4  │  N  │    │    │
     │    │   └────┴────┴────┴────┴─ ─ ─┘    │    │
     │    └──────────────────────────────────┘    │
     │    ┌──────────────────────────────────┐    │
     │    │        Extension Registry        │    │
     │    │                                  │    │
     │    │  Timestamp (-1)                  │    │
     │    │  Symbol (0)                      │    │
     │    │  Custom Types (1-127, -2 to -128)│    │
     │    └──────────────────────────────────┘    │
     └────────────────────────────────────────────┘
----

.MessagePack format encoding
[source]
----
┌──────────────────────────────────────────────────────────────────┐
│                    MessagePack Binary Format                     │
└──────────────────────────────────────────────────────────────────┘

Positive Fixnum ──────── 0x00-0x7F (1 byte format, value embedded)
Negative Fixnum ──────── 0xE0-0xFF (1 byte format, value embedded)
Nil ──────────────────── 0xC0 (1 byte format, no data)
Boolean ──────────────── 0xC2 false, 0xC3 true (1 byte format, no data)

UInt 8 ───────────────── 0xCC (1 byte format + 1 byte data)
UInt 16 ──────────────── 0xCD (1 byte format + 2 byte data)
UInt 32 ──────────────── 0xCE (1 byte format + 4 byte data)
UInt 64 ──────────────── 0xCF (1 byte format + 8 byte data)

Int 8 ────────────────── 0xD0 (1 byte format + 1 byte data)
Int 16 ───────────────── 0xD1 (1 byte format + 2 byte data)
Int 32 ───────────────── 0xD2 (1 byte format + 4 byte data)
Int 64 ───────────────── 0xD3 (1 byte format + 8 byte data)

Float 32 ─────────────── 0xCA (1 byte format + 4 byte data)
Float 64 ─────────────── 0xCB (1 byte format + 8 byte data)

FixStr ───────────────── 0xA0-0xBF (1 byte format + 0-31 bytes)
Str 8 ────────────────── 0xD9 (1 byte format + 1 byte length)
Str 16 ───────────────── 0xDA (1 byte format + 2 byte length)
Str 32 ───────────────── 0xDB (1 byte format + 4 byte length)

Bin 8 ────────────────── 0xC4 (1 byte format + 1 byte length)
Bin 16 ───────────────── 0xC5 (1 byte format + 2 byte length)
Bin 32 ───────────────── 0xC6 (1 byte format + 4 byte length)

FixArray ─────────────── 0x90-0x9F (1 byte format + 0-15 elements)
Array 16 ─────────────── 0xDC (1 byte format + 2 byte count)
Array 32 ─────────────── 0xDD (1 byte format + 4 byte count)

FixMap ───────────────── 0x80-0x8F (1 byte format + 0-15 entries)
Map 16 ───────────────── 0xDE (1 byte format + 2 byte count)
Map 32 ───────────────── 0xDF (1 byte format + 4 byte count)

FixExt 1 ─────────────── 0xD4 (1 byte format + 1 byte type + 1 byte)
FixExt 2 ─────────────── 0xD5 (1 byte format + 1 byte type + 2 bytes)
FixExt 4 ─────────────── 0xD6 (1 byte format + 1 byte type + 4 bytes)
FixExt 8 ─────────────── 0xD7 (1 byte format + 1 byte type + 8 bytes)
FixExt 16 ────────────── 0xD8 (1 byte format + 1 byte type + 16 bytes)
Ext 8 ────────────────── 0xC7 (1 byte format + 1 byte len + 1 byte type)
Ext 16 ───────────────── 0xC8 (1 byte format + 2 byte len + 1 byte type)
Ext 32 ───────────────── 0xC9 (1 byte format + 4 byte len + 1 byte type)
----

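
The integer rows of the format table can be read as a size-based selection rule. The following is a minimal sketch in plain Ruby, not this gem's packer: `encode_int` is a hypothetical helper that picks the smallest integer format and emits big-endian bytes with `Array#pack`.

[source,ruby]
----
# Hypothetical helper (not part of the gem): picks the smallest MessagePack
# integer format for n and returns the encoded bytes.
def encode_int(n)
  case n
  when 0..0x7F                     then [n].pack("C")           # positive fixint
  when -32..-1                     then [n].pack("c")           # negative fixint (0xE0-0xFF)
  when 0x80..0xFF                  then [0xCC, n].pack("CC")    # uint 8
  when 0x100..0xFFFF               then [0xCD, n].pack("Cn")    # uint 16
  when 0x1_0000..0xFFFF_FFFF       then [0xCE, n].pack("CN")    # uint 32
  when 0x1_0000_0000..(2**64 - 1)  then [0xCF, n].pack("CQ>")   # uint 64
  when -0x80..-33                  then [0xD0, n].pack("Cc")    # int 8
  when -0x8000..-0x81              then [0xD1, n].pack("Cs>")   # int 16
  when -0x8000_0000..-0x8001       then [0xD2, n].pack("Cl>")   # int 32
  when -(2**63)..-0x8000_0001      then [0xD3, n].pack("Cq>")   # int 64
  else raise RangeError, "#{n} does not fit in 64 bits"
  end
end

encode_int(42).unpack1("H*")    # => "2a"
encode_int(-1).unpack1("H*")    # => "ff"
encode_int(300).unpack1("H*")   # => "cd012c"
encode_int(-200).unpack1("H*")  # => "d1ff38"
----

Small values cost a single byte because the value is embedded in the format byte itself; every other format pays one byte for the tag plus a fixed-width payload.
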
== Installation

Add this line to your application's Gemfile:

[source,ruby]
----
gem 'messagepack'
----

And then execute:

[source,shell]
----
bundle install
----

Or install it yourself as:

[source,shell]
----
gem install messagepack
----

== Core serialization

The core MessagePack API provides simple `pack` and `unpack` methods for
serializing and deserializing Ruby objects.

=== Packing objects

Use `Messagepack.pack` to serialize Ruby objects to binary format:

[source,ruby]
----
Messagepack.pack({hello: "world"}) # => "\x81\xA5hello\xA5world"
----

Where,

* `Messagepack.pack` accepts any Ruby object as its argument
* The return value is a binary string containing the serialized data
* Supported types include: nil, boolean, integer, float, string, array,
  hash, and any registered extension types

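
To make the byte string returned by `pack` less opaque, here is a hedged sketch that builds the same bytes by hand in plain Ruby. `fixstr` and `fixmap` are hypothetical helpers (not gem APIs) and cover only the short "fix" formats, where the leading byte combines the format prefix with the length or entry count.

[source,ruby]
----
# Hypothetical helpers covering only the short "fix" formats.
def fixstr(str)
  bytes = str.b
  raise ArgumentError, "fixstr holds at most 31 bytes" if bytes.bytesize > 31
  [0xA0 | bytes.bytesize].pack("C") + bytes   # 0xA0-0xBF: prefix + byte length
end

def fixmap(pairs)
  raise ArgumentError, "fixmap holds at most 15 entries" if pairs.size > 15
  header = [0x80 | pairs.size].pack("C")      # 0x80-0x8F: prefix + entry count
  pairs.reduce(header) { |out, (k, v)| out + fixstr(k.to_s) + fixstr(v.to_s) }
end

encoded = fixmap({ hello: "world" })
encoded == "\x81\xA5hello\xA5world".b  # => true
----

Reading the result left to right: `\x81` is a map with one entry, and each `\xA5` announces a five-byte string.
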
=== Unpacking data

Use `Messagepack.unpack` to deserialize binary data back to Ruby objects:

[source,ruby]
----
data = Messagepack.pack({hello: "world"})
Messagepack.unpack(data) # => {"hello"=>"world"}
----

Where,

* `Messagepack.unpack` accepts a binary string or IO object
* The return value is the deserialized Ruby object; as the example shows,
  symbol keys are packed as strings and come back as strings
* Extra bytes after the deserialized object will raise a
  `Messagepack::MalformedFormatError`

.Using pack and unpack
====
[source,ruby]
----
# Serialize a complex object
data = {
  name: "Alice",
  age: 30,
  skills: ["Ruby", "Python"],
  metadata: {
    active: true,
    score: 95.5
  }
}

binary = Messagepack.pack(data)
# => "\x84\xA4name\xA5Alice\xA3age\x1E\xA6skills\
#     \x92\xA4Ruby\xA6Python\xA8metadata\x82\xA6active\
#     \xC3\xA5score\xCB@W\xE0\x00\x00\x00\x00\x00"

# Deserialize back to a Ruby object
result = Messagepack.unpack(binary)
# => {"name"=>"Alice", "age"=>30, "skills"=>["Ruby", "Python"],
#     "metadata"=>{"active"=>true, "score"=>95.5}}
----
====

== Factory pattern

The `Messagepack::Factory` class provides thread-safe management of packer and
unpacker instances with support for custom type registrations.

=== Creating a factory

[source,ruby]
----
factory = Messagepack::Factory.new
----

Where,

* `Factory.new` creates a new factory instance
* Each factory maintains its own type registry
* Factories can be frozen for thread-safe use

=== Registering custom types

[source,ruby]
----
factory.register_type(0x01, MyClass,
  packer: :to_msgpack_ext,
  unpacker: :from_msgpack_ext
)
----

Where,

* `0x01` is the type identifier (must be -128 to 127)
* `MyClass` is the Ruby class to register
* `packer` specifies how to serialize instances (symbol, method, or proc)
* `unpacker` specifies how to deserialize data (symbol, method, or proc)

=== Using factory pool for thread safety

[source,ruby]
----
pool = factory.pool(5)        # Create pool with 5 packers/unpackers
data = pool.pack(my_object)   # Thread-safe packing
obj = pool.unpack(binary)     # Thread-safe unpacking
----

Where,

* `factory.pool(size)` creates a thread-safe pool
* `size` is the number of packer/unpacker instances in the pool
* The pool automatically manages instance reuse
* Each thread gets its own instance from the pool

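
How the gem's pool is implemented is not shown in this README. As an illustration of the general technique only, the sketch below builds a fixed-size pool on Ruby's core `Queue`; `TinyPool` and its `with` method are hypothetical names, and plain strings stand in for packer/unpacker instances.

[source,ruby]
----
# Hypothetical fixed-size pool: a thread-safe queue hands out one instance at
# a time, and `ensure` returns it even when the block raises.
class TinyPool
  def initialize(size, &factory)
    @slots = Queue.new
    size.times { @slots << factory.call }
  end

  def with
    instance = @slots.pop   # blocks while every instance is checked out
    yield instance
  ensure
    @slots << instance if instance
  end
end

pool = TinyPool.new(2) { String.new }  # stand-in for packer/unpacker instances
results = 4.times.map do |i|
  Thread.new { pool.with { |buf| buf.replace("job-#{i}").dup } }
end.map(&:value)

results.sort  # => ["job-0", "job-1", "job-2", "job-3"]
----

Because an instance is held by exactly one thread between `pop` and the `ensure` clause, its internal buffer needs no locking of its own.
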
.Thread-safe factory usage
====
[source,ruby]
----
# Create a factory with custom types
factory = Messagepack::Factory.new
factory.register_type(0x01, MyCustomClass,
  packer: ->(obj) { obj.serialize },
  unpacker: ->(data) { MyCustomClass.deserialize(data) }
)

# Create a thread-safe pool
pool = factory.pool(10)

# Use from multiple threads safely
threads = 10.times.map do |i|
  Thread.new do
    object = MyCustomClass.new("data-#{i}")
    binary = pool.pack(object)
    result = pool.unpack(binary)
    result.value
  end
end

puts threads.map(&:value).inspect
----
====

== Extension types

MessagePack supports custom extension types for serializing objects that don't
have a native MessagePack representation.

=== Extension type format

[source,ruby]
----
factory.register_type(type_id, klass,
  packer: packer_specification,
  unpacker: unpacker_specification
)
----

Where,

* `type_id` is an integer from -128 to 127
* `klass` is the Ruby class to serialize
* `packer_specification` can be:
** A symbol (method name to call on the object)
** A proc (called with the object)
** A method object
* `unpacker_specification` can be:
** A symbol (class method to call)
** A proc (called with the payload data)
** A method object

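
Whatever the packer returns becomes the payload of an extension value on the wire. As a rough sketch of that framing, based on the ext formats in the encoding table and not on the gem's internals, the hypothetical `frame_ext` helper below wraps a payload in the smallest matching header.

[source,ruby]
----
# Hypothetical helper: wraps an already-serialized payload in the smallest
# extension header. type_id is written as a signed byte (-128..127).
def frame_ext(type_id, payload)
  raise RangeError, "type_id must be -128..127" unless (-128..127).cover?(type_id)

  body = payload.b
  size = body.bytesize
  fix  = { 1 => 0xD4, 2 => 0xD5, 4 => 0xD6, 8 => 0xD7, 16 => 0xD8 }
  header =
    if fix.key?(size)     then [fix[size], type_id].pack("Cc")      # fixext 1/2/4/8/16
    elsif size <= 0xFF    then [0xC7, size, type_id].pack("CCc")    # ext 8
    elsif size <= 0xFFFF  then [0xC8, size, type_id].pack("Cnc")    # ext 16
    else                       [0xC9, size, type_id].pack("CNc")    # ext 32
    end
  header + body
end

frame_ext(0x10, "USD").unpack1("H*")   # => "c70310555344"
frame_ext(-1, "\x00" * 4).bytesize     # => 6
----

Payloads of exactly 1, 2, 4, 8, or 16 bytes save one byte because the fixext formats carry no separate length field.
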
=== Recursive extension types

[source,ruby]
----
factory.register_type(0x02, MyContainer,
  packer: ->(obj, packer) { packer.write(obj.to_h) },
  unpacker: ->(unpacker) { MyContainer.from_hash(unpacker.read) },
  recursive: true
)
----

Where,

* `recursive: true` enables nested serialization
* The `packer` lambda receives the packer instance for recursive calls
* The `unpacker` lambda receives the unpacker instance for recursive reads

.Custom extension type for Money objects
====
[source,ruby]
----
class Money
  attr_reader :amount, :currency

  def initialize(amount, currency)
    @amount = amount
    @currency = currency
  end

  def to_msgpack_ext
    [amount, currency].pack("QA*")
  end

  def self.from_msgpack_ext(data)
    amount, currency = data.unpack("QA*")
    new(amount, currency)
  end
end

factory = Messagepack::Factory.new
factory.register_type(0x10, Money,
  packer: :to_msgpack_ext,
  unpacker: :from_msgpack_ext
)

money = Money.new(1000, "USD")
binary = factory.pack(money)
result = factory.unpack(binary)
# => #<Money:0x... @amount=1000, @currency="USD">
----
====

== Timestamp extension

The timestamp extension (type -1) provides nanosecond precision time handling
for Time objects.

=== Timestamp formats

.MessagePack automatically selects the appropriate format
====
[source]
----
Timestamp32 - 4 bytes  (seconds only, 32-bit unsigned)
              Used when: nanoseconds == 0 and
                         seconds fit in 32 unsigned bits

Timestamp64 - 8 bytes  (30-bit nanoseconds + 34-bit unsigned seconds)
              Used when: seconds fit in 34 unsigned bits and
                         Timestamp32 does not apply

Timestamp96 - 12 bytes (32-bit nanoseconds + 64-bit signed seconds)
              Used when: seconds are negative or need more than 34 bits
----
====

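
The selection rules can be expressed in a few lines of plain Ruby. This is a sketch that follows the timestamp layout in the MessagePack specification; `timestamp_payload` is a hypothetical helper, not the gem's `Messagepack::Time::Packer`, and it returns only the extension payload without the ext header.

[source,ruby]
----
# Hypothetical helper: returns the timestamp extension payload for the given
# seconds since the epoch and nanoseconds (0...1_000_000_000).
def timestamp_payload(sec, nsec)
  if nsec.zero? && (0...(1 << 32)).cover?(sec)
    [sec].pack("N")                   # timestamp 32: 32-bit unsigned seconds
  elsif (0...(1 << 34)).cover?(sec)
    [(nsec << 34) | sec].pack("Q>")   # timestamp 64: 30-bit nsec, 34-bit sec
  else
    [nsec, sec].pack("Nq>")           # timestamp 96: 32-bit nsec, signed 64-bit sec
  end
end

t = Time.utc(2020, 1, 1, 12, 30, 45)
timestamp_payload(t.to_i, t.nsec).bytesize        # => 4
timestamp_payload(t.to_i, 123_456_789).bytesize   # => 8
timestamp_payload(-1, 0).bytesize                 # => 12 (before 1970)
timestamp_payload(1 << 34, 0).bytesize            # => 12 (seconds need more than 34 bits)
----

With the extension header added, these payloads correspond to fixext 4 (6 bytes total), fixext 8 (10 bytes total), and ext 8 (15 bytes total).
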
=== Using timestamp with Time

[source,ruby]
----
factory.register_type(-1, Time,
  packer: Messagepack::Time::Packer,
  unpacker: Messagepack::Time::Unpacker
)
----

Where,

* `-1` is the reserved type ID for timestamps
* `Messagepack::Time::Packer` handles serialization with nanosecond precision
* `Messagepack::Time::Unpacker` handles deserialization

.Timestamp serialization examples
====
[source,ruby]
----
factory = Messagepack::Factory.new
factory.register_type(-1, Time,
  packer: Messagepack::Time::Packer,
  unpacker: Messagepack::Time::Unpacker
)

# Current time with nanosecond precision
now = Time.now
binary = factory.pack(now)
restored = factory.unpack(binary)
puts restored.tv_nsec  # Nanoseconds preserved

# Historical date
time = Time.utc(2020, 1, 1, 12, 30, 45)
binary = factory.pack(time)
puts binary.size  # => 6 (fixext4 format)

# Far-future date with nanoseconds. Seconds past the year 2514 no longer fit
# in 34 bits, so timestamp96 is required. The seventh argument of Time.utc is
# in microseconds, hence the Rational.
future = Time.utc(2600, 6, 15, 0, 0, 0, Rational(123_456_789, 1000))
binary = factory.pack(future)
puts binary.size  # => 15 (ext8 with timestamp96)
----
====

== Symbol extension

The symbol extension (type 0) provides efficient serialization of Ruby symbols.

=== Registering symbol type

[source,ruby]
----
factory.register_type(0, Symbol)
----

Where,

* `0` is the type ID for symbols
* The extension uses `to_s` for packing and `to_sym` for unpacking

=== Symbol serialization

[source,ruby]
----
factory.register_type(0, Symbol)
binary = factory.pack(:hello_symbol)
result = factory.unpack(binary) # => :hello_symbol
----

Where,

* Symbols are serialized as their string representation
* Deserialization converts the string back to a symbol
* Unlike plain string serialization, the value comes back as a Symbol rather
  than a String

.Symbol serialization in data structures
====
[source,ruby]
----
factory = Messagepack::Factory.new
factory.register_type(0, Symbol)

data = {
  status: :active,
  priority: :high,
  tags: [:important, :urgent]
}

binary = factory.pack(data)
result = factory.unpack(binary)
# => {:status=>:active, :priority=>:high, :tags=>[:important, :urgent]}
----
====

== Streaming unpacking

The streaming unpacker allows incremental parsing of MessagePack data as it
becomes available.

=== Feeding data incrementally

[source,ruby]
----
unpacker = Messagepack::Unpacker.new
unpacker.feed("\x81")     # Feed partial data (map header, 1 entry)
unpacker.feed("\xA3")     # Feed more (string header, 3 bytes)
unpacker.feed("foo\xC0")  # Feed final part (key bytes, then a nil value)
obj = unpacker.read       # => {"foo"=>nil}
----

Where,

* `Unpacker.new` creates a new unpacker instance
* `feed(data)` appends data to the buffer
* `read` returns one complete object or `nil` if more data is needed

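
The feed/read contract is easier to follow with a toy decoder. The sketch below is an assumption-laden illustration, not the gem's `Unpacker`: `TinyStreamReader` is a hypothetical class that understands only nil, positive fixint, fixstr, and fixmap, and it consumes buffered bytes only once a whole object has been parsed.

[source,ruby]
----
# Hypothetical incremental reader for a tiny subset of MessagePack.
class TinyStreamReader
  def initialize
    @buf = String.new(encoding: Encoding::BINARY)
  end

  def feed(bytes)
    @buf << bytes.b
    self
  end

  # Returns one complete object, or nil when more bytes are needed.
  def read
    value, used = catch(:need_more) { decode(0) }
    return nil if used.nil?
    @buf = @buf.byteslice(used..-1)   # consume only after a full object parsed
    value
  end

  private

  def byte_at(pos)
    @buf.getbyte(pos) || throw(:need_more)
  end

  # Returns [value, next_position]; throws :need_more on truncated input.
  def decode(pos)
    tag = byte_at(pos)
    case tag
    when 0x00..0x7F then [tag, pos + 1]              # positive fixint
    when 0xC0       then [nil, pos + 1]              # nil
    when 0xA0..0xBF                                  # fixstr
      len = tag & 0x1F
      throw(:need_more) if @buf.bytesize < pos + 1 + len
      [@buf.byteslice(pos + 1, len).force_encoding("UTF-8"), pos + 1 + len]
    when 0x80..0x8F                                  # fixmap
      cursor = pos + 1
      map = {}
      (tag & 0x0F).times do
        key, cursor = decode(cursor)
        val, cursor = decode(cursor)
        map[key] = val
      end
      [map, cursor]
    else raise ArgumentError, format("unsupported format byte 0x%02X", tag)
    end
  end
end

reader = TinyStreamReader.new
reader.feed("\x81").read     # => nil (map header only)
reader.feed("\xA3fo").read   # => nil (key is truncated)
reader.feed("o\xC0").read    # => {"foo"=>nil}
----

Returning `nil` for "need more data" mirrors the contract described in the list above; the drawback is that a top-level `nil` object is indistinguishable from an incomplete read in this sketch.
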
=== Streaming from IO

[source,ruby]
----
unpacker = Messagepack::Unpacker.new(io)
obj = unpacker.read  # Reads from IO as needed
----

Where,

* `Unpacker.new(io)` creates an unpacker attached to an IO
* The unpacker automatically reads from the IO when needed
* Use `full_unpack` to read a single object and reset

.Streaming unpacking from network
====
[source,ruby]
----
require 'socket'

# Simulate receiving data in chunks
unpacker = Messagepack::Unpacker.new

chunks = ["\x81\xA3", "foo", "\xA5", "world"]

chunks.each do |chunk|
  unpacker.feed(chunk)
  obj = unpacker.read
  if obj
    puts "Received: #{obj.inspect}"
  else
    puts "Waiting for more data..."
  end
end

# Output:
# Waiting for more data...
# Waiting for more data...
# Waiting for more data...
# Received: {"foo"=>"world"}
----
====

== Buffer management

The `BinaryBuffer` class provides efficient chunked storage for binary data.

=== Buffer operations

[source,ruby]
----
buffer = Messagepack::BinaryBuffer.new
buffer << "data"
buffer.read(4)  # => "data"
buffer.to_s     # => ""
----

Where,

* `BinaryBuffer.new` creates a new buffer
* `<<` appends data to the buffer
* `read(n)` reads and consumes n bytes
* `to_s` returns remaining data without consuming

=== Skip operations

[source,ruby]
----
buffer = Messagepack::BinaryBuffer.new
buffer << "\x81\xA3foo\xA5world"
buffer.skip      # Skip one object (format byte)
buffer.skip_nil  # Skip nil value if present
----

Where,

* `skip` skips a complete MessagePack object
* `skip_nil` efficiently skips nil values

=== Buffer with IO

[source,ruby]
----
File.open("data.msgpack", "rb") do |io|
  buffer = Messagepack::BinaryBuffer.new(io)
  unpacker = Messagepack::Unpacker.new(buffer)
  obj = unpacker.read
end
----

Where,

* The buffer reads from the IO when needed
* Data is automatically managed in chunks
* Suitable for large files that don't fit in memory

.Reading large MessagePack files efficiently
====
[source,ruby]
----
# Process a large file without loading everything into memory.
# The block form of File.open closes the file when processing ends.
File.open("large.msgpack", "rb") do |io|
  buffer = Messagepack::BinaryBuffer.new(io)
  unpacker = Messagepack::Unpacker.new(buffer)

  while (obj = unpacker.read)
    # Process each object one at a time
    process(obj)
  end
end
----
====

== Performance optimizations

This implementation includes several performance optimizations that make the pure
Ruby implementation efficient for typical use cases.

=== Native type fast-path

Native MessagePack types (nil, boolean, integer, float, string, symbol, array, hash)
bypass the extension registry lookup for optimal performance:

* Native types are identified without O(n) registry search
* Native types with custom extension registrations still use the registry
* Custom types pay the registry lookup cost as expected

This means that even with many registered extension types, packing native objects
remains fast.

=== Buffer chunk coalescing

The buffer uses automatic chunk coalescing to reduce memory allocations and improve
throughput:

* Small writes (< 512 bytes) are merged into larger chunks
* Reduces the number of string objects in memory
* Improves `to_s` performance by reducing chunk count
* Optimized for common patterns like many small integer writes

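
The effect of coalescing can be seen in a small stand-alone model. The class below is a hypothetical sketch, not the gem's `BinaryBuffer`: it assumes a simple policy in which a write under the 512-byte threshold is appended to the last chunk while that chunk is itself still under the threshold.

[source,ruby]
----
# Hypothetical chunked buffer: small writes are appended to the last chunk
# instead of creating a new String object per write.
class CoalescingBuffer
  COALESCE_THRESHOLD = 512   # bytes, the figure quoted in this README

  attr_reader :chunks

  def initialize
    @chunks = []
  end

  def <<(data)
    bytes = data.b           # private binary copy, safe to mutate later
    last = @chunks.last
    if last && bytes.bytesize < COALESCE_THRESHOLD && last.bytesize < COALESCE_THRESHOLD
      last << bytes          # merge the small write into the open chunk
    else
      @chunks << bytes       # large write, or the open chunk is full
    end
    self
  end

  def to_s
    @chunks.join
  end
end

buffer = CoalescingBuffer.new
1000.times { |i| buffer << [i & 0x7F].pack("C") }
buffer.chunks.size     # => 2
buffer.to_s.bytesize   # => 1000
----

Without the merge step the same workload would hold 1000 one-byte strings, which is the allocation and `join` cost that coalescing is meant to avoid.
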
=== Buffer read optimization

The buffer's `to_s` method has a fast-path for when reading from the beginning
(position 0), which is the common case for packers:

* Uses `join` for efficient string concatenation
* Skips offset calculations when position is at 0
* Significantly faster for single-pass operations

.Performance comparison
====
[source,ruby]
----
# Native type performance (unaffected by registry size)
Messagepack.pack(nil)           # ~673k ops/sec
Messagepack.pack(42)            # ~607k ops/sec
Messagepack.pack("hello")       # ~498k ops/sec
Messagepack.pack([1,2,3])       # ~230k ops/sec
Messagepack.pack({a: 1, b: 2})  # ~159k ops/sec

# Buffer operations (1000 small writes per operation)
# With coalescing:    ~4.7k ops/sec (about 28% faster)
# Without coalescing: ~3.7k ops/sec
----
====

== Implementation details

=== Pure Ruby architecture

This implementation is written entirely in Ruby without any C extensions, providing:

* **Portability** - Runs on any Ruby implementation (MRI, JRuby, TruffleRuby, etc.)
* **Safety** - No memory corruption risks from native code
* **Debuggability** - Easy to debug with standard Ruby tools
* **Maintainability** - Pure Ruby code is easier to understand and modify

=== Binary buffer design

The `BinaryBuffer` class uses a chunked storage design:

[source]
----
BinaryBuffer
├── Chunks (array)
│   ├── Chunk 1 (data)
│   ├── Chunk 2 (data)
│   └── Chunk N (data)
├── Position (read cursor)
└── Length (total bytes)
----

Where,

* `Chunks` - Array of binary strings holding data
* `Position` - Current read position across all chunks
* `Length` - Total bytes across all chunks
* Coalescing threshold - Small writes (< 512 bytes) are merged

This design provides:

* **Efficient appends** - New data creates chunks, small writes merge
* **Zero-copy reads** - Data is read without copying when possible
* **Memory efficiency** - Unused chunks can be garbage collected
* **IO integration** - Can read from IO objects on demand

=== Extension registry

The extension registry provides type mapping for custom serialization:

[source]
----
ExtensionRegistry::Packer
├── @registry - Hash of class => [type_id, proc, flags]
└── @cache - Hash of class => [type_id, proc, flags] (ancestor cache)

ExtensionRegistry::Unpacker
└── @array - Array[256] of [class, proc, flags] indexed by type_id
----

Where,

* Packer registry uses O(1) hash lookup for direct class matches
* Ancestor search is O(n) but cached after first lookup
* Unpacker registry uses O(1) array lookup by type ID
* Flags control recursive packing and oversized integer handling

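
The packer-side lookup order described above (direct hit, then a cached ancestor walk) can be sketched as follows. `TinyPackerRegistry` is a hypothetical class written for illustration; it omits flags and is not the gem's `ExtensionRegistry::Packer`.

[source,ruby]
----
# Hypothetical packer-side registry: exact class match first, then a cached
# walk up the ancestor chain.
class TinyPackerRegistry
  def initialize
    @registry = {}   # class => [type_id, packer]
    @cache = {}      # class => entry (or nil) resolved through ancestors
  end

  def register(type_id, klass, packer)
    @registry[klass] = [type_id, packer]
    @cache.clear     # earlier ancestor results may now be stale
  end

  def lookup(klass)
    return @registry[klass] if @registry.key?(klass)           # O(1) direct hit
    return @cache[klass] if @cache.key?(klass)                 # cached result
    ancestor = klass.ancestors.find { |a| @registry.key?(a) }  # O(n) walk, done once
    @cache[klass] = ancestor && @registry[ancestor]
  end
end

class Shape; end
class Circle < Shape; end

registry = TinyPackerRegistry.new
registry.register(0x20, Shape, ->(obj) { obj.class.name })
registry.lookup(Circle).first   # => 32
registry.lookup(String)         # => nil
----

Caching negative results as well (the `nil` stored for `String`) keeps repeated lookups of unregistered classes from walking the ancestor chain again.
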
=== Type dispatch

The packer uses a type dispatch system for efficient serialization:

[source]
----
Packer#write(value)
├── Fast-path check (native type?)
│   ├── Yes → Skip registry, use native serialization
│   └── No → Check registry
│       ├── Found in registry → Use extension packer
│       └── Not found → Check to_msgpack method
└── Case statement dispatch → Type-specific writer
----

This ensures:

* Native types are serialized without overhead
* Registered custom types use their packers
* Unknown types can implement `to_msgpack` for compatibility

== Copyright and license

Copyright Ribose. All rights reserved.

Licensed under the MIT License.