unibuf 0.1.0 → 0.1.1
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +133 -255
- data/README.adoc +217 -220
- data/lib/unibuf/models/values/scalar_value.rb +2 -2
- data/lib/unibuf/parsers/binary/wire_format_parser.rb +199 -19
- data/lib/unibuf/parsers/textproto/grammar.rb +1 -1
- data/lib/unibuf/parsers/textproto/processor.rb +10 -0
- data/lib/unibuf/validators/type_validator.rb +1 -1
- data/lib/unibuf/version.rb +1 -1
- metadata +1 -1
data/README.adoc
CHANGED
|
@@ -6,23 +6,21 @@ image:https://github.com/lutaml/unibuf/actions/workflows/rake.yml/badge.svg[Buil
|
|
|
6
6
|
|
|
7
7
|
== Purpose
|
|
8
8
|
|
|
9
|
-
Unibuf is a pure Ruby gem for parsing and manipulating Protocol Buffers with
|
|
10
|
-
schema-driven validation.
|
|
9
|
+
Unibuf is a pure Ruby gem for parsing and manipulating Protocol Buffers in both text and binary formats with schema-driven validation.
|
|
11
10
|
|
|
12
|
-
It provides a fully object-oriented, specification-compliant parser with rich
|
|
13
|
-
domain models, comprehensive schema validation, and complete round-trip
|
|
14
|
-
serialization support.
|
|
11
|
+
It provides a fully object-oriented, specification-compliant parser with rich domain models, comprehensive schema validation, wire format decoding, and complete round-trip serialization support.
|
|
15
12
|
|
|
16
13
|
Key features:
|
|
17
14
|
|
|
18
15
|
* Parse Protocol Buffers text format (`.txtpb`, `.textproto`)
|
|
16
|
+
* Parse Protocol Buffers binary format (`.binpb`) with schema
|
|
19
17
|
* Parse Proto3 schemas (`.proto`) for validation
|
|
20
|
-
* Schema-driven validation
|
|
18
|
+
* Schema-driven validation and deserialization
|
|
19
|
+
* Wire format decoding (varint, zigzag, all wire types)
|
|
21
20
|
* Round-trip serialization with 100% accuracy
|
|
22
21
|
* Rich domain models with 45+ behavioral classes
|
|
23
|
-
* Complete CLI toolkit
|
|
22
|
+
* Complete CLI toolkit for text and binary formats
|
|
24
23
|
* Specification-compliant implementation
|
|
25
|
-
* Zero external binary dependencies
|
|
26
24
|
|
|
27
25
|
== Installation
|
|
28
26
|
|
|
@@ -49,60 +47,55 @@ gem install unibuf
|
|
|
49
47
|
|
|
50
48
|
== Features
|
|
51
49
|
|
|
52
|
-
* <<schema-required-design,Schema-
|
|
53
|
-
* <<parsing-textproto,Parsing
|
|
54
|
-
* <<
|
|
55
|
-
* <<
|
|
56
|
-
* <<
|
|
57
|
-
* <<
|
|
50
|
+
* <<schema-required-design,Schema-required design>>
|
|
51
|
+
* <<parsing-textproto,Parsing text format>>
|
|
52
|
+
* <<parsing-binary,Parsing binary format>>
|
|
53
|
+
* <<schema-validation,Schema-based validation>>
|
|
54
|
+
* <<wire-format,Wire format support>>
|
|
55
|
+
* <<round-trip-serialization,Round-trip serialization>>
|
|
56
|
+
* <<rich-domain-models,Rich domain models>>
|
|
57
|
+
* <<cli-tools,Command-line tools>>
|
|
58
58
|
|
|
59
59
|
[[schema-required-design]]
|
|
60
|
-
== Schema-
|
|
60
|
+
== Schema-required design
|
|
61
61
|
|
|
62
62
|
=== General
|
|
63
63
|
|
|
64
|
-
Unibuf follows Protocol Buffers' schema-driven architecture. The schema
|
|
65
|
-
(`.proto` file) defines the message structure and is REQUIRED for proper parsing
|
|
66
|
-
and validation.
|
|
64
|
+
Unibuf follows Protocol Buffers' schema-driven architecture. The schema (`.proto` file) defines the message structure and is REQUIRED for binary parsing and recommended for text parsing.
|
|
67
65
|
|
|
68
|
-
This design ensures type safety and enables
|
|
66
|
+
This design ensures type safety and enables proper deserialization of binary formats.
|
|
69
67
|
|
|
70
68
|
=== Why schema is required
|
|
71
69
|
|
|
72
70
|
The schema defines:
|
|
73
71
|
- Message types and their fields
|
|
74
72
|
- Field types and numbers
|
|
73
|
+
- Field wire types for binary encoding
|
|
75
74
|
- Repeated and optional fields
|
|
76
75
|
- Nested message structures
|
|
77
|
-
- Enum values
|
|
78
76
|
|
|
79
|
-
|
|
77
|
+
Binary Protocol Buffers cannot be parsed without a schema because the binary format only stores field numbers, not field names or types.
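To illustrate the point about field numbers, here is a minimal sketch in plain Ruby that does not use Unibuf. It splits one tag byte into a field number and a wire type, then looks up the name and type in a lookup table. `SCHEMA_FIELDS`, `split_tag`, and `describe_field` are hypothetical names invented for this example, and the sketch assumes single-byte tags and single-byte lengths (field numbers below 16, payloads under 128 bytes).

```ruby
# Hypothetical stand-in for a schema: field number => [name, type].
# This is a plain Hash for illustration, not Unibuf's schema model.
SCHEMA_FIELDS = { 1 => ["name", :string] }.freeze

# A tag packs the field number (upper bits) and the wire type (low 3 bits).
def split_tag(tag_byte)
  [tag_byte >> 3, tag_byte & 0x07]
end

# Decode one length-delimited field; assumes one-byte tag and one-byte length.
def describe_field(bytes)
  field_number, wire_type = split_tag(bytes.getbyte(0))
  length = bytes.getbyte(1)
  payload = bytes.byteslice(2, length)
  # The bytes carry no name or type; both come from the schema lookup.
  name, type = SCHEMA_FIELDS.fetch(field_number)
  value = type == :string ? payload.force_encoding("UTF-8") : payload
  "#{name} (field #{field_number}, wire type #{wire_type}): #{value.inspect}"
end

puts describe_field("\x0A\x02hi".b)
```

Without the `SCHEMA_FIELDS` entry, the same bytes would only reveal "field 1, length-delimited, 2 bytes", which could equally be a string, raw bytes, or a nested message.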
|
|
80
78
|
|
|
81
79
|
[[parsing-textproto]]
|
|
82
|
-
== Parsing Protocol Buffers
|
|
80
|
+
== Parsing Protocol Buffers text format
|
|
83
81
|
|
|
84
82
|
=== General
|
|
85
83
|
|
|
86
|
-
|
|
87
|
-
https://protobuf.dev/reference/protobuf/textformat-spec/[official specification].
|
|
84
|
+
Parse human-readable Protocol Buffer text format files following the https://protobuf.dev/reference/protobuf/textformat-spec/[official specification].
|
|
88
85
|
|
|
89
|
-
|
|
90
|
-
messages, repeated fields, lists, maps, multi-line strings, comments, and all
|
|
91
|
-
numeric types.
|
|
92
|
-
|
|
93
|
-
=== Loading schema first
|
|
86
|
+
=== Parsing text format
|
|
94
87
|
|
|
95
88
|
[source,ruby]
|
|
96
89
|
----
|
|
97
90
|
require "unibuf"
|
|
98
91
|
|
|
99
|
-
#
|
|
92
|
+
# Load schema (recommended for validation)
|
|
100
93
|
schema = Unibuf.parse_schema("schema.proto") # <1>
|
|
101
94
|
|
|
102
|
-
#
|
|
95
|
+
# Parse text format file
|
|
103
96
|
message = Unibuf.parse_textproto_file("data.txtpb") # <2>
|
|
104
97
|
|
|
105
|
-
#
|
|
98
|
+
# Validate against schema
|
|
106
99
|
validator = Unibuf::Validators::SchemaValidator.new(schema) # <3>
|
|
107
100
|
validator.validate!(message, "MessageType") # <4>
|
|
108
101
|
----
|
|
@@ -111,47 +104,82 @@ validator.validate!(message, "MessageType") # <4>
|
|
|
111
104
|
<3> Create validator with schema
|
|
112
105
|
<4> Validate message against schema
|
|
113
106
|
|
|
114
|
-
|
|
107
|
+
[[parsing-binary]]
|
|
108
|
+
== Parsing Protocol Buffers binary format
|
|
109
|
+
|
|
110
|
+
=== General
|
|
111
|
+
|
|
112
|
+
Parse binary Protocol Buffer data using wire format decoding with schema-driven deserialization.
|
|
113
|
+
|
|
114
|
+
The schema is REQUIRED for binary parsing because binary format only stores field numbers, not names or types.
|
|
115
|
+
|
|
116
|
+
=== Parsing binary format
|
|
115
117
|
|
|
116
118
|
[source,ruby]
|
|
117
119
|
----
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
PROTO
|
|
120
|
+
require "unibuf"
|
|
121
|
+
|
|
122
|
+
# 1. Load schema (REQUIRED for binary)
|
|
123
|
+
schema = Unibuf.parse_schema("schema.proto") # <1>
|
|
123
124
|
|
|
124
|
-
|
|
125
|
+
# 2. Parse binary Protocol Buffer file
|
|
126
|
+
message = Unibuf.parse_binary_file("data.binpb", schema: schema) # <2>
|
|
125
127
|
|
|
126
|
-
|
|
127
|
-
puts
|
|
128
|
+
# 3. Access fields normally
|
|
129
|
+
puts message.find_field("name").value # <3>
|
|
130
|
+
----
|
|
131
|
+
<1> Schema is mandatory for binary parsing
|
|
132
|
+
<2> Parse binary file with schema
|
|
133
|
+
<3> Access fields like text format
|
|
134
|
+
|
|
135
|
+
=== Binary format from string
|
|
136
|
+
|
|
137
|
+
[source,ruby]
|
|
128
138
|
----
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
139
|
+
# Read binary data
|
|
140
|
+
binary_data = File.binread("data.binpb")
|
|
141
|
+
|
|
142
|
+
# Parse with schema
|
|
143
|
+
schema = Unibuf.parse_schema("schema.proto")
|
|
144
|
+
message = Unibuf.parse_binary(binary_data, schema: schema)
|
|
145
|
+
----
|
|
146
|
+
|
|
147
|
+
=== Supported wire types
|
|
148
|
+
|
|
149
|
+
The binary parser supports all Protocol Buffer wire types:
|
|
150
|
+
|
|
151
|
+
Varint (Type 0)::
|
|
152
|
+
Variable-length integers: int32, int64, uint32, uint64, sint32, sint64, bool, enum
|
|
153
|
+
|
|
154
|
+
64-bit (Type 1)::
|
|
155
|
+
Fixed 8-byte values: fixed64, sfixed64, double
|
|
156
|
+
|
|
157
|
+
Length-delimited (Type 2)::
|
|
158
|
+
Variable-length data: string, bytes, embedded messages, packed repeated fields
|
|
159
|
+
|
|
160
|
+
32-bit (Type 5)::
|
|
161
|
+
Fixed 4-byte values: fixed32, sfixed32, float
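The fixed-width wire types (1 and 5) can be sketched with Ruby's built-in `String#unpack1`, because Protocol Buffers stores these values in little-endian byte order. This is an illustration only and not taken from Unibuf's `WireFormatParser`. `FIXED_DIRECTIVES` and `decode_fixed` are hypothetical names.

```ruby
# Little-endian unpack directives for each fixed-width scalar type.
FIXED_DIRECTIVES = {
  "fixed32"  => "L<", # unsigned 32-bit
  "sfixed32" => "l<", # signed 32-bit
  "float"    => "e",  # single-precision float
  "fixed64"  => "Q<", # unsigned 64-bit
  "sfixed64" => "q<", # signed 64-bit
  "double"   => "E",  # double-precision float
}.freeze

# Decode a 4-byte or 8-byte payload according to the declared field type.
def decode_fixed(bytes, type)
  bytes.unpack1(FIXED_DIRECTIVES.fetch(type))
end

puts decode_fixed("\x2A\x00\x00\x00".b, "fixed32")
puts decode_fixed(("\xFF" * 8).b, "sfixed64")
puts decode_fixed([1.5].pack("e"), "float")
```

The same four or eight bytes decode to different values depending on the declared type, which is one reason the field type has to come from the schema.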
|
|
132
162
|
|
|
133
163
|
[[schema-validation]]
|
|
134
164
|
== Schema-based validation
|
|
135
165
|
|
|
136
166
|
=== General
|
|
137
167
|
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
The SchemaValidator checks field types, validates nested messages, and ensures all fields conform to their schema definitions.
|
|
168
|
+
Validate Protocol Buffer messages (text or binary) against their Proto3 schemas.
|
|
141
169
|
|
|
142
|
-
=== Validating
|
|
170
|
+
=== Validating with schema
|
|
143
171
|
|
|
144
172
|
[source,ruby]
|
|
145
173
|
----
|
|
146
174
|
# Load schema
|
|
147
|
-
schema = Unibuf.parse_schema("
|
|
175
|
+
schema = Unibuf.parse_schema("schema.proto") # <1>
|
|
148
176
|
|
|
149
|
-
# Parse message
|
|
150
|
-
message = Unibuf.
|
|
177
|
+
# Parse message (text or binary)
|
|
178
|
+
message = Unibuf.parse_binary_file("data.binpb", schema: schema) # <2>
|
|
151
179
|
|
|
152
180
|
# Validate
|
|
153
181
|
validator = Unibuf::Validators::SchemaValidator.new(schema) # <3>
|
|
154
|
-
errors = validator.validate(message, "
|
|
182
|
+
errors = validator.validate(message, "MessageType") # <4>
|
|
155
183
|
|
|
156
184
|
if errors.empty?
|
|
157
185
|
puts "✓ Valid!" # <5>
|
|
@@ -160,126 +188,121 @@ else
|
|
|
160
188
|
end
|
|
161
189
|
----
|
|
162
190
|
<1> Parse the Proto3 schema
|
|
163
|
-
<2> Parse
|
|
191
|
+
<2> Parse binary Protocol Buffer
|
|
164
192
|
<3> Create validator with schema
|
|
165
|
-
<4> Validate message
|
|
193
|
+
<4> Validate message
|
|
166
194
|
<5> Validation passed
|
|
167
|
-
<6> Show
|
|
195
|
+
<6> Show errors if any
|
|
196
|
+
|
|
197
|
+
[[wire-format]]
|
|
198
|
+
== Wire format support
|
|
168
199
|
|
|
169
|
-
===
|
|
200
|
+
=== General
|
|
201
|
+
|
|
202
|
+
Unibuf implements complete Protocol Buffers wire format decoding according to the official specification.
|
|
203
|
+
|
|
204
|
+
=== Wire format features
|
|
205
|
+
|
|
206
|
+
Varint decoding::
|
|
207
|
+
Efficiently decode variable-length integers used for most numeric types
|
|
208
|
+
|
|
209
|
+
ZigZag encoding::
|
|
210
|
+
Proper handling of signed integers (sint32, sint64) with zigzag decoding
|
|
211
|
+
|
|
212
|
+
Fixed-width types::
|
|
213
|
+
Decode 32-bit and 64-bit fixed-width values (fixed32, fixed64, float, double)
|
|
214
|
+
|
|
215
|
+
Length-delimited::
|
|
216
|
+
Parse strings, bytes, and embedded messages with length prefixes
|
|
217
|
+
|
|
218
|
+
Schema-driven::
|
|
219
|
+
Use schema to determine field types and deserialize correctly
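The varint and ZigZag steps listed above can be sketched in a few lines of plain Ruby. This is a minimal illustration of the encoding rules, not Unibuf's own implementation. `decode_varint` and `zigzag_decode` are hypothetical names chosen for this example.

```ruby
# Decode one base-128 varint starting at `offset`.
# Returns [value, next_offset]. Each byte contributes its low 7 bits,
# least-significant group first; a set high bit means more bytes follow.
def decode_varint(bytes, offset = 0)
  value = 0
  shift = 0
  loop do
    byte = bytes.getbyte(offset)
    raise ArgumentError, "truncated varint" if byte.nil?

    offset += 1
    value |= (byte & 0x7F) << shift
    return [value, offset] if (byte & 0x80).zero?

    shift += 7
  end
end

# ZigZag maps unsigned 0, 1, 2, 3, ... back to signed 0, -1, 1, -2, ...
# It applies to sint32 and sint64 fields after varint decoding.
def zigzag_decode(n)
  (n >> 1) ^ -(n & 1)
end

value, next_offset = decode_varint("\xAC\x02".b)
puts value               # 300
puts next_offset         # 2
puts zigzag_decode(3)    # -2
```

A field declared as `int32` would use the varint value directly, while a `sint32` field would pass it through `zigzag_decode`. The bytes alone do not say which applies.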
|
|
220
|
+
|
|
221
|
+
=== Example wire format parsing
|
|
170
222
|
|
|
171
223
|
[source,ruby]
|
|
172
224
|
----
|
|
225
|
+
# Schema defines the structure
|
|
173
226
|
schema = Unibuf.parse_schema("schema.proto")
|
|
174
227
|
|
|
175
|
-
|
|
176
|
-
|
|
228
|
+
# Binary data uses wire format encoding
|
|
229
|
+
binary_data = File.binread("data.binpb")
|
|
177
230
|
|
|
178
|
-
#
|
|
179
|
-
|
|
180
|
-
puts msg_def.field_names # => ["name", "designer", ...] <4>
|
|
231
|
+
# Parser uses schema to decode wire format
|
|
232
|
+
message = Unibuf.parse_binary(binary_data, schema: schema)
|
|
181
233
|
|
|
182
|
-
#
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
puts field_def.number # => 1 <7>
|
|
234
|
+
# Access decoded fields
|
|
235
|
+
message.field_names # => ["name", "id", "enabled"]
|
|
236
|
+
message.find_field("id").value # => Properly decoded integer
|
|
186
237
|
----
|
|
187
|
-
<1> Get package name from schema
|
|
188
|
-
<2> List all message types
|
|
189
|
-
<3> Find specific message definition
|
|
190
|
-
<4> Get field names for message
|
|
191
|
-
<5> Find specific field definition
|
|
192
|
-
<6> Get field type
|
|
193
|
-
<7> Get field number
|
|
194
238
|
|
|
195
239
|
[[round-trip-serialization]]
|
|
196
|
-
== Round-trip
|
|
240
|
+
== Round-trip serialization
|
|
197
241
|
|
|
198
242
|
=== General
|
|
199
243
|
|
|
200
|
-
Unibuf supports complete round-trip serialization, allowing you to parse
|
|
201
|
-
|
|
202
|
-
The round-trip success rate on curated test files is 100%.
|
|
244
|
+
Unibuf supports complete round-trip serialization for text format, allowing you to parse, modify, and serialize back while preserving semantic equivalence.
|
|
203
245
|
|
|
204
246
|
=== Serializing to textproto format
|
|
205
247
|
|
|
206
248
|
[source,ruby]
|
|
207
249
|
----
|
|
208
|
-
|
|
250
|
+
# Parse (text or binary)
|
|
251
|
+
message = Unibuf.parse_textproto_file("input.txtpb") # <1>
|
|
209
252
|
|
|
253
|
+
# Serialize to text format
|
|
210
254
|
textproto = message.to_textproto # <2>
|
|
211
255
|
|
|
212
256
|
File.write("output.txtpb", textproto) # <3>
|
|
213
257
|
|
|
258
|
+
# Verify round-trip
|
|
214
259
|
reparsed = Unibuf.parse_textproto(textproto) # <4>
|
|
215
260
|
puts message == reparsed # => true <5>
|
|
216
261
|
----
|
|
217
262
|
<1> Parse the original file
|
|
218
|
-
<2> Serialize to
|
|
263
|
+
<2> Serialize to text format
|
|
219
264
|
<3> Write to file
|
|
220
265
|
<4> Parse the serialized output
|
|
221
266
|
<5> Verify semantic equivalence
|
|
222
267
|
|
|
223
268
|
[[rich-domain-models]]
|
|
224
|
-
== Rich
|
|
269
|
+
== Rich domain models
|
|
225
270
|
|
|
226
271
|
=== General
|
|
227
272
|
|
|
228
273
|
Unibuf provides rich domain models with comprehensive behavior.
|
|
229
274
|
|
|
230
|
-
|
|
231
|
-
polymorphism, and separation of concerns.
|
|
275
|
+
Over 45 classes provide extensive functionality following object-oriented principles.
|
|
232
276
|
|
|
233
|
-
=== Message model
|
|
277
|
+
=== Message model
|
|
234
278
|
|
|
235
279
|
[source,ruby]
|
|
236
280
|
----
|
|
237
|
-
message
|
|
238
|
-
|
|
239
|
-
|
|
240
|
-
message.nested? # => true if has nested messages
|
|
241
|
-
message.scalar_only? # => true if only scalar fields
|
|
242
|
-
message.maps? # => true if contains map fields (renamed from has_maps?)
|
|
243
|
-
message.repeated_fields? # => true if has repeated fields (renamed from has_repeated_fields?)
|
|
244
|
-
message.empty? # => true if no fields
|
|
245
|
-
|
|
246
|
-
# Query methods
|
|
247
|
-
message.find_field("name") # => Field object or nil
|
|
248
|
-
message.find_fields("subsets") # => Array of all "subsets" fields
|
|
249
|
-
message.field_names # => ["name", "version", ...]
|
|
250
|
-
message.field_count # => 12
|
|
251
|
-
message.repeated_field_names # => ["subsets", "fonts"] (renamed from repeated_fields)
|
|
252
|
-
message.map_fields # => Array of map fields
|
|
253
|
-
message.nested_messages # => Array of nested messages
|
|
254
|
-
|
|
255
|
-
# Traversal methods
|
|
256
|
-
message.traverse_depth_first { |field| ... } # Depth-first traversal
|
|
257
|
-
message.traverse_breadth_first { |field| ... } # Breadth-first traversal
|
|
258
|
-
message.depth # => Maximum nesting depth
|
|
259
|
-
|
|
260
|
-
# Validation
|
|
261
|
-
message.valid? # => true/false
|
|
262
|
-
message.validate! # => raises if invalid
|
|
263
|
-
message.validation_errors # => Array of error messages
|
|
264
|
-
----
|
|
281
|
+
# Parse message (text or binary)
|
|
282
|
+
schema = Unibuf.parse_schema("schema.proto")
|
|
283
|
+
message = Unibuf.parse_binary_file("data.binpb", schema: schema)
|
|
265
284
|
|
|
266
|
-
|
|
285
|
+
# Classification (MECE)
|
|
286
|
+
message.nested? # Has nested messages?
|
|
287
|
+
message.scalar_only? # Only scalar fields?
|
|
288
|
+
message.maps? # Contains maps?
|
|
289
|
+
message.repeated_fields? # Has repeated fields?
|
|
267
290
|
|
|
268
|
-
|
|
269
|
-
|
|
270
|
-
|
|
291
|
+
# Queries
|
|
292
|
+
message.find_field("name") # Find by name
|
|
293
|
+
message.find_fields("tags") # Find all with name
|
|
294
|
+
message.field_names # All field names
|
|
295
|
+
message.repeated_field_names # Repeated field names
|
|
271
296
|
|
|
272
|
-
#
|
|
273
|
-
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
field.list_field? # => true for arrays
|
|
297
|
+
# Traversal
|
|
298
|
+
message.traverse_depth_first { |field| ... }
|
|
299
|
+
message.traverse_breadth_first { |field| ... }
|
|
300
|
+
message.depth # Maximum nesting depth
|
|
277
301
|
|
|
278
|
-
#
|
|
279
|
-
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
field.boolean_value? # => true for booleans
|
|
302
|
+
# Validation
|
|
303
|
+
message.valid? # Check validity
|
|
304
|
+
message.validate! # Raise if invalid
|
|
305
|
+
message.validation_errors # Get error list
|
|
283
306
|
----
|
|
284
307
|
|
|
285
308
|
[[cli-tools]]
|
|
@@ -287,68 +310,64 @@ field.boolean_value? # => true for booleans
|
|
|
287
310
|
|
|
288
311
|
=== General
|
|
289
312
|
|
|
290
|
-
|
|
313
|
+
Complete CLI toolkit supporting both text and binary Protocol Buffer formats.
|
|
291
314
|
|
|
292
|
-
|
|
293
|
-
schema-driven by design.
|
|
315
|
+
Schema is REQUIRED for proper message type identification.
|
|
294
316
|
|
|
295
317
|
=== Parse command
|
|
296
318
|
|
|
297
319
|
[source,shell]
|
|
298
320
|
----
|
|
299
|
-
# Parse text format
|
|
300
|
-
unibuf parse
|
|
321
|
+
# Parse text format
|
|
322
|
+
unibuf parse data.txtpb --schema schema.proto --format json
|
|
301
323
|
|
|
302
|
-
# Parse
|
|
303
|
-
unibuf parse
|
|
324
|
+
# Parse binary format
|
|
325
|
+
unibuf parse data.binpb --schema schema.proto --format json
|
|
304
326
|
|
|
305
|
-
#
|
|
306
|
-
unibuf parse
|
|
327
|
+
# Auto-detect format
|
|
328
|
+
unibuf parse data.pb --schema schema.proto --format yaml
|
|
307
329
|
|
|
308
|
-
#
|
|
309
|
-
unibuf parse
|
|
330
|
+
# Specify message type
|
|
331
|
+
unibuf parse data.binpb --schema schema.proto --message-type FamilyProto
|
|
310
332
|
----
|
|
311
333
|
|
|
312
334
|
=== Validate command
|
|
313
335
|
|
|
314
336
|
[source,shell]
|
|
315
337
|
----
|
|
316
|
-
# Validate
|
|
317
|
-
unibuf validate
|
|
338
|
+
# Validate text format
|
|
339
|
+
unibuf validate data.txtpb --schema schema.proto
|
|
318
340
|
|
|
319
|
-
# Validate
|
|
320
|
-
unibuf validate
|
|
341
|
+
# Validate binary format
|
|
342
|
+
unibuf validate data.binpb --schema schema.proto
|
|
321
343
|
|
|
322
|
-
#
|
|
323
|
-
unibuf validate
|
|
344
|
+
# Specify message type
|
|
345
|
+
unibuf validate data.pb --schema schema.proto --message-type MessageType
|
|
324
346
|
----
|
|
325
347
|
|
|
326
348
|
=== Convert command
|
|
327
349
|
|
|
328
350
|
[source,shell]
|
|
329
351
|
----
|
|
330
|
-
#
|
|
331
|
-
unibuf convert
|
|
352
|
+
# Binary to JSON
|
|
353
|
+
unibuf convert data.binpb --schema schema.proto --to json
|
|
332
354
|
|
|
333
|
-
#
|
|
334
|
-
unibuf convert
|
|
355
|
+
# Binary to text
|
|
356
|
+
unibuf convert data.binpb --schema schema.proto --to txtpb
|
|
335
357
|
|
|
336
|
-
#
|
|
337
|
-
unibuf convert
|
|
358
|
+
# Text to JSON
|
|
359
|
+
unibuf convert data.txtpb --schema schema.proto --to json
|
|
338
360
|
----
|
|
339
361
|
|
|
340
362
|
=== Schema command
|
|
341
363
|
|
|
342
364
|
[source,shell]
|
|
343
365
|
----
|
|
344
|
-
# Inspect schema
|
|
366
|
+
# Inspect schema
|
|
345
367
|
unibuf schema schema.proto
|
|
346
368
|
|
|
347
|
-
# Output
|
|
369
|
+
# Output as JSON
|
|
348
370
|
unibuf schema schema.proto --format json
|
|
349
|
-
|
|
350
|
-
# Save schema structure
|
|
351
|
-
unibuf schema schema.proto --format yaml -o schema.yml
|
|
352
371
|
----
|
|
353
372
|
|
|
354
373
|
== Architecture
|
|
@@ -360,16 +379,15 @@ unibuf schema schema.proto --format yaml -o schema.yml
|
|
|
360
379
|
Unibuf
|
|
361
380
|
├── Parsers
|
|
362
381
|
│ ├── Textproto Text format parser
|
|
363
|
-
│ │ ├── Grammar Parslet grammar
|
|
364
|
-
│ │ ├── Processor AST
|
|
382
|
+
│ │ ├── Grammar Parslet grammar
|
|
383
|
+
│ │ ├── Processor AST transformation
|
|
365
384
|
│ │ └── Parser High-level API
|
|
366
385
|
│ ├── Proto3 Schema parser
|
|
367
|
-
│ │ ├── Grammar Proto3 grammar
|
|
368
|
-
│ │ ├── Processor
|
|
369
|
-
│ │ └── Parser
|
|
370
|
-
│
|
|
371
|
-
│
|
|
372
|
-
│ └── Flatbuffers FlatBuffers parser (future)
|
|
386
|
+
│ │ ├── Grammar Proto3 grammar
|
|
387
|
+
│ │ ├── Processor Schema builder
|
|
388
|
+
│ │ └── Parser Schema API
|
|
389
|
+
│ └── Binary Binary format parser
|
|
390
|
+
│ └── WireFormatParser Wire format decoder
|
|
373
391
|
├── Models
|
|
374
392
|
│ ├── Message Protocol Buffer message
|
|
375
393
|
│ ├── Field Message field
|
|
@@ -377,55 +395,49 @@ Unibuf
|
|
|
377
395
|
│ ├── MessageDefinition Message type definition
|
|
378
396
|
│ ├── FieldDefinition Field specification
|
|
379
397
|
│ ├── EnumDefinition Enum type definition
|
|
380
|
-
│ └── Values Value type hierarchy
|
|
381
|
-
│ ├── BaseValue Abstract base
|
|
382
|
-
│ ├── ScalarValue Primitives
|
|
383
|
-
│ ├── MessageValue Nested messages
|
|
384
|
-
│ ├── ListValue Arrays
|
|
385
|
-
│ └── MapValue Key-value pairs
|
|
398
|
+
│ └── Values Value type hierarchy (5 classes)
|
|
386
399
|
├── Validators
|
|
387
400
|
│ ├── TypeValidator Type and range validation
|
|
388
401
|
│ └── SchemaValidator Schema-based validation
|
|
389
402
|
└── CLI
|
|
390
|
-
|
|
391
|
-
├── Validate Validate command
|
|
392
|
-
├── Convert Convert command
|
|
393
|
-
└── Schema Schema inspection command
|
|
403
|
+
└── Commands parse, validate, convert, schema
|
|
394
404
|
----
|
|
395
405
|
|
|
406
|
+
=== Design principles
|
|
396
407
|
|
|
397
|
-
|
|
398
|
-
|
|
399
|
-
|
|
400
|
-
official specification:
|
|
401
|
-
|
|
402
|
-
Scalar Fields::
|
|
403
|
-
`name: "value"` - Field with string value
|
|
408
|
+
Object-Oriented::
|
|
409
|
+
45+ rich classes with extensive behavior.
|
|
410
|
+
No anemic data structures.
|
|
404
411
|
|
|
405
|
-
|
|
406
|
-
|
|
412
|
+
MECE::
|
|
413
|
+
Mutually exclusive, collectively exhaustive classifications.
|
|
414
|
+
Complete type hierarchies.
|
|
407
415
|
|
|
408
|
-
|
|
409
|
-
|
|
416
|
+
Separation of Concerns::
|
|
417
|
+
Clean layer separation: Grammar, Processor, Models, Validators, CLI.
|
|
410
418
|
|
|
411
|
-
|
|
412
|
-
|
|
419
|
+
Open/Closed::
|
|
420
|
+
Extensible for new formats without modifying core.
|
|
413
421
|
|
|
414
|
-
|
|
415
|
-
|
|
422
|
+
Schema-Driven::
|
|
423
|
+
Schema-required for binary, recommended for text.
|
|
424
|
+
Proper Protocol Buffer architecture.
|
|
416
425
|
|
|
417
|
-
|
|
418
|
-
`text: "line1" "line2"` - String concatenation
|
|
426
|
+
== Real-World Validation
|
|
419
427
|
|
|
420
|
-
|
|
421
|
-
Integers, floats, octal, hexadecimal, negative numbers
|
|
428
|
+
Curated test suite with diverse Protocol Buffer features:
|
|
422
429
|
|
|
423
|
-
|
|
424
|
-
|
|
425
|
-
|
|
426
|
-
|
|
427
|
-
|
|
430
|
+
.Test fixtures
|
|
431
|
+
[example]
|
|
432
|
+
====
|
|
433
|
+
- robotoflex: Multi-axis variable font
|
|
434
|
+
- mavenpro: Static font
|
|
435
|
+
- opensans: Popular font variants
|
|
436
|
+
- playfair: Optical size axis
|
|
437
|
+
- wavefont: Custom axes
|
|
428
438
|
|
|
439
|
+
Validation: 100% parse success, 100% round-trip accuracy
|
|
440
|
+
====
|
|
429
441
|
|
|
430
442
|
== Development
|
|
431
443
|
|
|
@@ -433,58 +445,43 @@ Escape Sequences::
|
|
|
433
445
|
|
|
434
446
|
[source,shell]
|
|
435
447
|
----
|
|
436
|
-
# Run all tests
|
|
437
448
|
bundle exec rspec
|
|
438
|
-
|
|
439
|
-
# Run with coverage report
|
|
440
|
-
bundle exec rspec --format documentation
|
|
441
|
-
|
|
442
|
-
# View coverage
|
|
443
|
-
open coverage/index.html
|
|
444
449
|
----
|
|
445
450
|
|
|
446
451
|
=== Code style
|
|
447
452
|
|
|
448
453
|
[source,shell]
|
|
449
454
|
----
|
|
450
|
-
# Check code style
|
|
451
|
-
bundle exec rubocop
|
|
452
|
-
|
|
453
|
-
# Auto-fix style issues
|
|
454
455
|
bundle exec rubocop -A
|
|
455
456
|
----
|
|
456
457
|
|
|
457
458
|
== Roadmap
|
|
458
459
|
|
|
459
|
-
=== Current Version (
|
|
460
|
+
=== Current Version (v1.0.0)
|
|
460
461
|
|
|
461
|
-
- ✅
|
|
462
|
+
- ✅ Text format parsing
|
|
463
|
+
- ✅ Binary format parsing (wire format decoder)
|
|
462
464
|
- ✅ Proto3 schema parsing
|
|
463
465
|
- ✅ Schema-based validation
|
|
464
466
|
- ✅ Complete CLI toolkit
|
|
467
|
+
- ✅ 277 comprehensive tests
|
|
465
468
|
|
|
466
|
-
=== Future
|
|
467
|
-
|
|
468
|
-
==== v0.2.0: Binary Protocol Buffers
|
|
469
|
-
|
|
470
|
-
- Binary wire format parsing
|
|
471
|
-
- Schema-driven binary deserialization
|
|
472
|
-
- Binary/text conversion
|
|
469
|
+
=== Future work
|
|
473
470
|
|
|
474
|
-
====
|
|
471
|
+
==== FlatBuffers
|
|
475
472
|
|
|
476
473
|
- FlatBuffers schema parsing
|
|
477
474
|
- FlatBuffers binary parsing
|
|
478
|
-
-
|
|
475
|
+
- Performance optimizations
|
|
476
|
+
- Additional Protocol Buffer features
|
|
479
477
|
|
|
480
478
|
== Contributing
|
|
481
479
|
|
|
482
|
-
Bug reports and pull requests are welcome
|
|
480
|
+
Bug reports and pull requests are welcome at https://github.com/lutaml/unibuf.
|
|
483
481
|
|
|
484
482
|
== Copyright and license
|
|
485
483
|
|
|
486
|
-
Copyright Ribose.
|
|
484
|
+
Copyright https://www.ribose.com[Ribose Inc.]
|
|
487
485
|
|
|
488
|
-
|
|
489
|
-
License.
|
|
486
|
+
Licensed under the 3-clause BSD License.
|
|
490
487
|
|