@zio.dev/zio-blocks 0.0.1

@@ -0,0 +1,640 @@
1
+ ---
2
+ id: formats
3
+ title: "Serialization Formats"
4
+ ---
5
+
6
+ ZIO Blocks Schema provides automatic codec derivation for multiple serialization formats. Once you have a `Schema[A]` for your data type, you can derive a codec for any supported format with the unified `Schema[A].derive(format)` pattern.
7
+
8
+ ## Overview: Codec Derivation System
9
+
10
+ With the exception of BSON, serialization formats in ZIO Blocks follow the same pattern: given a `Schema[A]`, you derive a codec by calling `derive` with a format object:
11
+
12
+ ```scala mdoc:compile-only
13
+ import zio.blocks.schema._
14
+ import zio.blocks.schema.toon._
15
+
16
+ case class Person(name: String, age: Int)
17
+
18
+ object Person {
19
+   implicit val schema: Schema[Person] = Schema.derived
20
+ }
21
+
22
+ // Derive codec for any format (using TOON as an example)
23
+ val codec = Schema[Person].derive(ToonFormat)
24
+
25
+ // Encode to bytes
26
+ val bytes: Array[Byte] = codec.encode(Person("Alice", 30))
27
+
28
+ // Decode from bytes
29
+ val result: Either[SchemaError, Person] = codec.decode(bytes)
30
+ ```
31
+
32
+ Each of these formats provides a format object that can be passed to `derive`. BSON is the exception: its codecs are created through `BsonSchemaCodec` rather than `derive`:
33
+
34
+ | Format | Object | MIME Type | Platform Support |
35
+ |--------|--------|-----------|------------------|
36
+ | JSON | `JsonFormat` | `application/json` | JVM, JS |
37
+ | TOON | `ToonFormat` | `text/toon` | JVM, JS |
38
+ | MessagePack | `MessagePackFormat` | `application/msgpack` | JVM, JS |
39
+ | Avro | `AvroFormat` | `application/avro` | JVM only |
40
+ | Thrift | `ThriftFormat` | `application/thrift` | JVM only |
41
+ | BSON | `BsonSchemaCodec` | Binary | JVM only |
42
+
43
+ ## JSON Format
44
+
45
+ JSON is the most widely used text-based serialization format. See the dedicated [JSON documentation](json.md) for comprehensive coverage of the `Json` ADT, navigation, and transformation features.
46
+
47
+ ### Installation
48
+
49
+ JSON support is included in the core schema module:
50
+
51
+ ```scala
52
+ libraryDependencies += "dev.zio" %% "zio-blocks-schema" % "<version>"
53
+ ```
54
+
55
+ ### Basic Usage
56
+
57
+ ```scala mdoc:compile-only
58
+ import zio.blocks.schema._
59
+ import zio.blocks.schema.json._
60
+
61
+ case class Person(name: String, age: Int)
62
+
63
+ object Person {
64
+   implicit val schema: Schema[Person] = Schema.derived
65
+ }
66
+
67
+ // Using JsonEncoder/JsonDecoder
68
+ val jsonEncoder = JsonEncoder[Person]
69
+ val jsonDecoder = JsonDecoder[Person]
70
+
71
+ val person = Person("Alice", 30)
72
+ val json: Json = jsonEncoder.encode(person)
73
+ // {"name":"Alice","age":30}
74
+
75
+ val decoded: Either[SchemaError, Person] = jsonDecoder.decode(json)
76
+ ```
77
+
78
+ ## Avro Format
79
+
80
+ Apache Avro is a compact binary format with schema evolution support, commonly used in big data systems like Kafka and Spark.
81
+
82
+ ### Installation
83
+
84
+ ```scala
85
+ libraryDependencies += "dev.zio" %% "zio-blocks-schema-avro" % "<version>"
86
+ ```
87
+
88
+ Requires the Apache Avro library (1.12.x).
89
+
90
+ ### Basic Usage
91
+
92
+ ```scala mdoc:compile-only
93
+ import zio.blocks.schema._
94
+ import zio.blocks.schema.avro._
95
+
96
+ case class Person(name: String, age: Int)
97
+
98
+ object Person {
99
+   implicit val schema: Schema[Person] = Schema.derived
100
+ }
101
+
102
+ // Derive Avro codec
103
+ val codec = Schema[Person].derive(AvroFormat)
104
+
105
+ // Encode to Avro binary format
106
+ val person = Person("Alice", 30)
107
+ val bytes: Array[Byte] = codec.encode(person)
108
+
109
+ // Decode from Avro binary format
110
+ val decoded: Either[SchemaError, Person] = codec.decode(bytes)
111
+ ```
112
+
113
+ ### Avro Schema Generation
114
+
115
+ Each `AvroBinaryCodec` exposes an `avroSchema` property containing the Apache Avro schema:
116
+
117
+ ```scala mdoc:compile-only
118
+ import zio.blocks.schema._
119
+ import zio.blocks.schema.avro._
120
+ import org.apache.avro.{Schema => AvroSchema}
121
+
122
+ case class Person(name: String, age: Int)
123
+
124
+ object Person {
125
+   implicit val schema: Schema[Person] = Schema.derived
126
+ }
127
+
128
+ val codec = Schema[Person].derive(AvroFormat)
129
+ val avroSchema: AvroSchema = codec.avroSchema
130
+ println(avroSchema.toString(true))
131
+ // {
132
+ //   "type": "record",
133
+ //   "name": "Person",
134
+ //   "fields": [
135
+ //     {"name": "name", "type": "string"},
136
+ //     {"name": "age", "type": "int"}
137
+ //   ]
138
+ // }
139
+ ```
140
+
141
+ ### Avro Type Mappings
142
+
143
+ | Scala Type | Avro Type |
144
+ |------------|-----------|
145
+ | `Boolean` | `boolean` |
146
+ | `Byte`, `Short`, `Int` | `int` |
147
+ | `Long` | `long` |
148
+ | `Float` | `float` |
149
+ | `Double` | `double` |
150
+ | `String`, `Char` | `string` |
151
+ | `BigInt` | `bytes` |
152
+ | `BigDecimal` | Record (mantissa, scale, precision, roundingMode) |
153
+ | `UUID` | 16-byte fixed |
154
+ | `Currency` | 3-byte fixed |
155
+ | `java.time.*` | Records or primitives |
156
+ | Case classes | `record` |
157
+ | Sealed traits | `union` |
158
+ | `List[A]`, `Set[A]` | `array` |
159
+ | `Map[String, V]` | `map` |
160
+
161
+ ### ADT Encoding
162
+
163
+ Sealed traits are encoded as Avro unions with an integer index prefix:
164
+
165
+ ```scala mdoc:compile-only
166
+ import zio.blocks.schema._
167
+ import zio.blocks.schema.avro._
168
+
169
+ sealed trait Shape
170
+ case class Circle(radius: Double) extends Shape
171
+ case class Rectangle(width: Double, height: Double) extends Shape
172
+
173
+ object Shape {
174
+   implicit val schema: Schema[Shape] = Schema.derived
175
+ }
176
+
177
+ val codec = Schema[Shape].derive(AvroFormat)
178
+
179
+ // The variant index (0 for Circle, 1 for Rectangle) is written first,
180
+ // followed by the record data
181
+ val circle: Shape = Circle(5.0)
182
+ val bytes = codec.encode(circle)
183
+ ```
184
+
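The integer index mentioned above can be illustrated with the variable-length zig-zag encoding that the Avro specification defines for `int` values. The following is a minimal standalone sketch using only the Scala standard library; `AvroVarintSketch` and its methods are hypothetical names for illustration and are not part of ZIO Blocks or the Apache Avro library.

```scala
import java.io.ByteArrayOutputStream

// Hypothetical helper (not a ZIO Blocks or Avro API): Avro-style int encoding.
object AvroVarintSketch {
  // Zig-zag maps signed ints to unsigned ones so that small magnitudes
  // stay small: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
  def zigZag(n: Int): Int = (n << 1) ^ (n >> 31)

  // Writes the zig-zag value 7 bits at a time, least significant group first.
  // The high bit of each byte signals that more bytes follow.
  def encodeInt(n: Int): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    var v   = zigZag(n)
    while ((v & ~0x7F) != 0) {
      out.write((v & 0x7F) | 0x80)
      v >>>= 7
    }
    out.write(v)
    out.toByteArray
  }
}
```

Under this encoding, index 0 (`Circle`) occupies the single byte `0x00` and index 1 (`Rectangle`) the single byte `0x02`, assuming the codec writes the variant index as a standard Avro `int`.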
185
+ ## TOON Format (LLM-Optimized)
186
+
187
+ TOON (Token-Oriented Object Notation) is a line-oriented, indentation-based text format that encodes the JSON data model with explicit structure and minimal quoting. It typically uses 30-60% fewer tokens than equivalent JSON, which makes it particularly efficient for LLM prompts and responses.
188
+
189
+ ### Why TOON?
190
+
191
+ - **Token efficient**: 30-60% fewer tokens than equivalent JSON
192
+ - **Human readable**: Clean, YAML-like syntax without YAML's complexity
193
+ - **LLM optimized**: Designed for AI/ML use cases where token count matters
194
+ - **Explicit lengths**: Arrays declare their size upfront for reliable parsing
195
+ - **Cross-platform**: Works on JVM and Scala.js
196
+
197
+ ### Installation
198
+
199
+ ```scala
200
+ libraryDependencies += "dev.zio" %% "zio-blocks-schema-toon" % "<version>"
201
+ ```
202
+
203
+ ### Basic Usage
204
+
205
+ ```scala mdoc:compile-only
206
+ import zio.blocks.schema._
207
+ import zio.blocks.schema.toon._
208
+
209
+ case class Person(name: String, age: Int)
210
+
211
+ object Person {
212
+   implicit val schema: Schema[Person] = Schema.derived
213
+ }
214
+
215
+ // Derive TOON codec
216
+ val codec = Schema[Person].derive(ToonFormat)
217
+
218
+ // Encode to TOON
219
+ val person = Person("Alice", 30)
220
+ val bytes: Array[Byte] = codec.encode(person)
221
+ // name: Alice
222
+ // age: 30
223
+
224
+ // Decode from TOON
225
+ val decoded: Either[SchemaError, Person] = codec.decode(bytes)
226
+ ```
227
+
228
+ ### TOON Format Examples
229
+
230
+ TOON uses indentation and explicit array lengths:
231
+
232
+ ```
233
+ # Simple object
234
+ name: Alice
235
+ age: 30
236
+ email: alice@example.com
237
+
238
+ # Inline primitive arrays (comma-separated)
239
+ tags[3]: scala,zio,functional
240
+
241
+ # Nested object
242
+ address:
243
+   street: 123 Main St
244
+   city: Springfield
245
+
246
+ # Object arrays use list format
247
+ orders[2]:
248
+   - id: 1
249
+     total: 99.99
250
+   - id: 2
251
+     total: 149.5
252
+
253
+ # Or tabular format (more compact)
254
+ orders[2]{id,total}:
255
+   1,99.99
256
+   2,149.5
257
+ ```
258
+
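To make the tabular layout concrete, here is a small standalone sketch that renders uniform rows in that shape. It uses only the Scala standard library; `ToonTabularSketch.render` is a hypothetical helper, not the ZIO Blocks encoder, and it assumes two-space row indentation and values that need no quoting or escaping.

```scala
// Hypothetical helper (not a ZIO Blocks API): renders rows in a TOON-style
// tabular array. Assumes every row has one value per declared field and that
// no value contains the delimiter, a newline, or other characters needing quotes.
object ToonTabularSketch {
  def render(name: String, fields: List[String], rows: List[List[String]]): String = {
    val columns = fields.mkString(",")
    // Header declares the array length and the field names once.
    val header = s"$name[${rows.length}]{$columns}:"
    // Each row is indented and lists values in field order.
    val body = rows.map(row => "  " + row.mkString(","))
    (header :: body).mkString("\n")
  }
}
```

Calling `ToonTabularSketch.render("orders", List("id", "total"), List(List("1", "99.99"), List("2", "149.5")))` produces the header `orders[2]{id,total}:` followed by one indented line per row. Because field names appear only once, the saving over the list format grows with the number of rows.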
259
+ ### Configuration Options
260
+
261
+ The `ToonBinaryCodecDeriver` provides extensive configuration:
262
+
263
+ ```scala mdoc:compile-only
264
+ import zio.blocks.schema._
265
+ import zio.blocks.schema.toon._
266
+
267
+ case class Person(firstName: String, lastName: String)
268
+ object Person {
269
+   implicit val schema: Schema[Person] = Schema.derived
270
+ }
271
+
272
+ // Custom deriver with snake_case field names
273
+ val customDeriver = ToonBinaryCodecDeriver
274
+   .withFieldNameMapper(NameMapper.SnakeCase)
275
+   .withArrayFormat(ArrayFormat.Tabular)
276
+   .withDiscriminatorKind(DiscriminatorKind.Field("type"))
277
+
278
+ val codec = Schema[Person].derive(customDeriver)
279
+ // first_name: Alice
280
+ // last_name: Smith
281
+ ```
282
+
283
+ | Option | Description | Default |
284
+ |--------|-------------|---------|
285
+ | `withFieldNameMapper` | Transform field names (Identity, SnakeCase, KebabCase) | `Identity` |
286
+ | `withCaseNameMapper` | Transform variant/case names | `Identity` |
287
+ | `withDiscriminatorKind` | ADT discriminator style (Key, Field, None) | `Key` |
288
+ | `withArrayFormat` | Array encoding (Auto, Tabular, Inline, List) | `Auto` |
289
+ | `withDelimiter` | Inline array delimiter (Comma, Tab, Pipe) | `Comma` |
290
+ | `withRejectExtraFields` | Error on unknown fields during decoding | `false` |
291
+ | `withEnumValuesAsStrings` | Encode enum values as strings | `true` |
292
+ | `withTransientNone` | Omit None values from output | `true` |
293
+ | `withTransientEmptyCollection` | Omit empty collections | `true` |
294
+ | `withTransientDefaultValue` | Omit fields with default values | `true` |
295
+
296
+ ### ADT Encoding Styles
297
+
298
+ ```scala mdoc:compile-only
299
+ import zio.blocks.schema._
300
+ import zio.blocks.schema.toon._
301
+
302
+ sealed trait Shape
303
+ case class Circle(radius: Double) extends Shape
304
+
305
+ object Shape {
306
+   implicit val schema: Schema[Shape] = Schema.derived
307
+ }
308
+
309
+ // Key discriminator (default)
310
+ val keyCodec = Schema[Shape].derive(ToonFormat)
311
+ // Circle:
312
+ //   radius: 5
313
+
314
+ // Field discriminator
315
+ val fieldDeriver = ToonBinaryCodecDeriver
316
+   .withDiscriminatorKind(DiscriminatorKind.Field("type"))
317
+ val fieldCodec = Schema[Shape].derive(fieldDeriver)
318
+ // type: Circle
319
+ // radius: 5
320
+ ```
321
+
322
+ ## MessagePack Format
323
+
324
+ MessagePack is an efficient binary serialization format that is more compact than JSON while remaining schema-less and cross-language compatible.
325
+
326
+ ### Installation
327
+
328
+ ```scala
329
+ libraryDependencies += "dev.zio" %% "zio-blocks-schema-messagepack" % "<version>"
330
+ ```
331
+
332
+ ### Basic Usage
333
+
334
+ ```scala mdoc:compile-only
335
+ import zio.blocks.schema._
336
+ import zio.blocks.schema.msgpack._
337
+
338
+ case class Person(name: String, age: Int)
339
+
340
+ object Person {
341
+   implicit val schema: Schema[Person] = Schema.derived
342
+ }
343
+
344
+ // Derive MessagePack codec
345
+ val codec = Schema[Person].derive(MessagePackFormat)
346
+
347
+ // Encode to MessagePack
348
+ val person = Person("Alice", 30)
349
+ val bytes: Array[Byte] = codec.encode(person)
350
+
351
+ // Decode from MessagePack
352
+ val decoded: Either[SchemaError, Person] = codec.decode(bytes)
353
+ ```
354
+
355
+ ### Binary Efficiency
356
+
357
+ MessagePack provides significant space savings compared to JSON:
358
+
359
+ - Encoded output is typically 50-80% of the size of the equivalent JSON
360
+ - Uses variable-width integer encoding
361
+ - No string escaping overhead
362
+ - No key quoting or colons/commas
363
+
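The variable-width integer encoding listed above follows the MessagePack specification's int format family, in which the smallest representation that fits the value is chosen. The sketch below illustrates that rule for 32-bit values using only the Scala standard library; `MsgPackIntSketch` is a hypothetical name and is not the ZIO Blocks implementation.

```scala
import java.nio.ByteBuffer
import java.util.Arrays

// Hypothetical helper (not a ZIO Blocks API): picks the smallest MessagePack
// integer representation for a 32-bit value, as described by the MessagePack spec.
object MsgPackIntSketch {
  def encode(n: Int): Array[Byte] = {
    val buf = ByteBuffer.allocate(5) // big-endian by default, as MessagePack requires
    if (n >= -32 && n <= 127) buf.put(n.toByte)                              // fixint: value is the byte itself
    else if (n >= 0 && n <= 0xFF) buf.put(0xcc.toByte).put(n.toByte)         // uint 8
    else if (n >= 0 && n <= 0xFFFF) buf.put(0xcd.toByte).putShort(n.toShort) // uint 16
    else if (n >= 0) buf.put(0xce.toByte).putInt(n)                          // uint 32
    else if (n >= Byte.MinValue) buf.put(0xd0.toByte).put(n.toByte)          // int 8
    else if (n >= Short.MinValue) buf.put(0xd1.toByte).putShort(n.toShort)   // int 16
    else buf.put(0xd2.toByte).putInt(n)                                      // int 32
    Arrays.copyOf(buf.array(), buf.position())
  }
}
```

With this rule an age of `30` takes one byte, whereas the JSON text `30` takes two, and a value such as `70000` takes five bytes including the type tag.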
364
+ ### MessagePack Type Mappings
365
+
366
+ | Scala Type | MessagePack Type |
367
+ |------------|------------------|
368
+ | `Unit` | nil |
369
+ | `Boolean` | bool |
370
+ | `Byte`, `Short`, `Int`, `Long` | int (variable width) |
371
+ | `Float` | float32 |
372
+ | `Double` | float64 |
373
+ | `String`, `Char` | str |
374
+ | `Array[Byte]` | bin |
375
+ | `List[A]`, `Vector[A]`, `Set[A]` | array |
376
+ | `Map[K, V]` | map |
377
+ | `Option[A]` | array (0 or 1 element) |
378
+ | `Either[A, B]` | map with "left" or "right" key |
379
+ | Case classes | map with field names as keys |
380
+ | Sealed traits | int index followed by value |
381
+
382
+ ### ADT Encoding
383
+
384
+ Sealed traits encode a variant index followed by the case value:
385
+
386
+ ```scala mdoc:compile-only
387
+ import zio.blocks.schema._
388
+ import zio.blocks.schema.msgpack._
389
+
390
+ sealed trait Shape
391
+ case class Circle(radius: Double) extends Shape
392
+ case class Rectangle(width: Double, height: Double) extends Shape
393
+
394
+ object Shape {
395
+   implicit val schema: Schema[Shape] = Schema.derived
396
+ }
397
+
398
+ val codec = Schema[Shape].derive(MessagePackFormat)
399
+
400
+ // Circle is encoded as: 0 followed by {radius: 5.0}
401
+ val circle: Shape = Circle(5.0)
402
+ val bytes = codec.encode(circle)
403
+ ```
404
+
405
+ ## BSON Format
406
+
407
+ BSON (Binary JSON) is the binary format used by MongoDB. The ZIO Blocks BSON module provides integration with the MongoDB BSON library.
408
+
409
+ ### Installation
410
+
411
+ ```scala
412
+ libraryDependencies += "dev.zio" %% "zio-blocks-schema-bson" % "<version>"
413
+ ```
414
+
415
+ Requires the MongoDB BSON library (5.x).
416
+
417
+ ### Basic Usage
418
+
419
+ ```scala mdoc:compile-only
420
+ import zio.blocks.schema._
421
+ import zio.blocks.schema.bson._
422
+
423
+ case class Person(name: String, age: Int)
424
+
425
+ object Person {
426
+   implicit val schema: Schema[Person] = Schema.derived
427
+ }
428
+
429
+ // Derive BSON encoder/decoder
430
+ val encoder: BsonEncoder[Person] = BsonSchemaCodec.bsonEncoder(Schema[Person])
431
+ val decoder: BsonDecoder[Person] = BsonSchemaCodec.bsonDecoder(Schema[Person])
432
+
433
+ // Or get both as a codec
434
+ val codec: BsonCodec[Person] = BsonSchemaCodec.bsonCodec(Schema[Person])
435
+ ```
436
+
437
+ ### MongoDB ObjectId Support
438
+
439
+ BSON provides native support for MongoDB ObjectIds:
440
+
441
+ ```scala mdoc:compile-only
442
+ import zio.blocks.schema._
443
+ import zio.blocks.schema.bson._
444
+ import org.bson.types.ObjectId
445
+
446
+ // Import ObjectId schema
447
+ import ObjectIdSupport.objectIdSchema
448
+
449
+ case class Document(_id: ObjectId, title: String)
450
+
451
+ object Document {
452
+   implicit val schema: Schema[Document] = Schema.derived
453
+ }
454
+
455
+ // ObjectId is encoded using BSON's native OBJECT_ID type
456
+ val codec = BsonSchemaCodec.bsonCodec(Schema[Document])
457
+ ```
458
+
459
+ ### Configuration Options
460
+
461
+ ```scala mdoc:compile-only
462
+ import zio.blocks.schema._
463
+ import zio.blocks.schema.bson._
464
+ import BsonSchemaCodec._
465
+
466
+ case class Person(name: String, age: Int)
467
+ object Person {
468
+   implicit val schema: Schema[Person] = Schema.derived
469
+ }
470
+
471
+ // Custom configuration
472
+ val config = Config
473
+   .withSumTypeHandling(SumTypeHandling.DiscriminatorField("_type"))
474
+   .withIgnoreExtraFields(true)
475
+   .withNativeObjectId(true)
476
+
477
+ val codec = BsonSchemaCodec.bsonCodec(Schema[Person], config)
478
+ ```
479
+
480
+ | Option | Description | Default |
481
+ |--------|-------------|---------|
482
+ | `withSumTypeHandling` | ADT discrimination strategy | `WrapperWithClassNameField` |
483
+ | `withClassNameMapping` | Transform class names | `identity` |
484
+ | `withIgnoreExtraFields` | Ignore unknown fields on decode | `true` |
485
+ | `withNativeObjectId` | Use native BSON ObjectId type | `false` |
486
+
487
+ ### Sum Type Handling
488
+
489
+ ```scala mdoc:compile-only
490
+ import zio.blocks.schema.bson.BsonSchemaCodec.SumTypeHandling
491
+
492
+ // Option 1: Wrapper with class name as field key (default)
493
+ SumTypeHandling.WrapperWithClassNameField
494
+ // {"Circle": {"radius": 5.0}}
495
+
496
+ // Option 2: Discriminator field
497
+ SumTypeHandling.DiscriminatorField("_type")
498
+ // {"_type": "Circle", "radius": 5.0}
499
+
500
+ // Option 3: No discriminator (tries each case)
501
+ SumTypeHandling.NoDiscriminator
502
+ ```
503
+
504
+ ## Thrift Format
505
+
506
+ Apache Thrift is a binary protocol format with field ID-based encoding, supporting forward-compatible schema evolution.
507
+
508
+ ### Installation
509
+
510
+ ```scala
511
+ libraryDependencies += "dev.zio" %% "zio-blocks-schema-thrift" % "<version>"
512
+ ```
513
+
514
+ Requires the Apache Thrift library (0.22.x).
515
+
516
+ ### Basic Usage
517
+
518
+ ```scala mdoc:compile-only
519
+ import zio.blocks.schema._
520
+ import zio.blocks.schema.thrift._
521
+ import java.nio.ByteBuffer
522
+
523
+ case class Person(name: String, age: Int)
524
+
525
+ object Person {
526
+   implicit val schema: Schema[Person] = Schema.derived
527
+ }
528
+
529
+ // Derive Thrift codec
530
+ val codec = Schema[Person].derive(ThriftFormat)
531
+
532
+ // Encode to Thrift binary format
533
+ val person = Person("Alice", 30)
534
+ val bytes: Array[Byte] = codec.encode(person)
535
+
536
+ // Decode from Thrift binary format
537
+ val decoded: Either[SchemaError, Person] = codec.decode(bytes)
538
+
539
+ // ByteBuffer API
540
+ val buffer = ByteBuffer.allocate(1024)
541
+ codec.encode(person, buffer)
542
+ buffer.flip()
543
+ val fromBuffer: Either[SchemaError, Person] = codec.decode(buffer)
544
+ ```
545
+
546
+ ### Thrift-Specific Features
547
+
548
+ - **Field ID-based encoding**: Uses 1-based field IDs corresponding to case class field positions
549
+ - **Forward compatibility**: Unknown fields are skipped during decoding
550
+ - **Out-of-order decoding**: Fields can arrive in any order on the wire
551
+ - **TBinaryProtocol**: Uses the standard Thrift binary protocol
552
+
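The features above can be pictured with a hand-written sketch of how a struct such as `Person(name, age)` is laid out by the standard Thrift binary protocol: each field is a one-byte type tag, a two-byte field ID, and the value, and the struct ends with a STOP byte. The code uses only the Scala standard library; `ThriftWireSketch` is a hypothetical name, and the assumption that `name` maps to field ID 1 and `age` to field ID 2 follows from the 1-based positional rule stated above.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical helper (not a ZIO Blocks or Apache Thrift API): writes the
// TBinaryProtocol layout of a two-field struct by hand.
object ThriftWireSketch {
  // Type tags from the Thrift binary protocol.
  val TStop: Byte   = 0
  val TI32: Byte    = 8
  val TString: Byte = 11

  def encodePerson(name: String, age: Int): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out   = new DataOutputStream(bytes) // DataOutputStream writes big-endian

    // Field 1: STRING = type tag, field ID, 4-byte length, UTF-8 bytes.
    val utf8 = name.getBytes(UTF_8)
    out.writeByte(TString)
    out.writeShort(1)
    out.writeInt(utf8.length)
    out.write(utf8)

    // Field 2: I32 = type tag, field ID, 4-byte value.
    out.writeByte(TI32)
    out.writeShort(2)
    out.writeInt(age)

    // STOP marks the end of the struct.
    out.writeByte(TStop)
    out.flush()
    bytes.toByteArray
  }
}
```

Because every field carries its own ID and type tag, a decoder can look fields up by ID regardless of arrival order and can skip an unknown ID by reading past a value of the tagged type.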
553
+ ### Thrift Type Mappings
554
+
555
+ | Scala Type | Thrift Type |
556
+ |------------|-------------|
557
+ | `Unit` | VOID |
558
+ | `Boolean` | BOOL |
559
+ | `Byte` | BYTE |
560
+ | `Short`, `Char` | I16 |
561
+ | `Int` | I32 |
562
+ | `Long` | I64 |
563
+ | `Float`, `Double` | DOUBLE |
564
+ | `String` | STRING |
565
+ | `BigInt` | Binary (STRING) |
566
+ | `BigDecimal` | STRUCT |
567
+ | `java.time.*` | STRING (ISO format) or I32 |
568
+ | `List[A]` | LIST |
569
+ | `Map[K, V]` | MAP |
570
+ | Case classes | STRUCT |
571
+ | Sealed traits | Indexed variant |
572
+
573
+ ## Supported Types
574
+
575
+ All formats support the full set of ZIO Blocks Schema primitive and composite types:
576
+
577
+ **Boolean, Numeric, and Character Types**:
578
+ - `Boolean`, `Byte`, `Short`, `Int`, `Long`, `Float`, `Double`, `Char`
579
+ - `BigInt`, `BigDecimal`
580
+
581
+ **Text Types**:
582
+ - `String`
583
+
584
+ **Special Types**:
585
+ - `Unit`, `UUID`, `Currency`
586
+
587
+ **Java Time Types**:
588
+ - `Instant`, `LocalDate`, `LocalTime`, `LocalDateTime`
589
+ - `OffsetTime`, `OffsetDateTime`, `ZonedDateTime`
590
+ - `Duration`, `Period`
591
+ - `Year`, `YearMonth`, `MonthDay`
592
+ - `DayOfWeek`, `Month`
593
+ - `ZoneId`, `ZoneOffset`
594
+
595
+ **Composite Types**:
596
+ - Records (case classes)
597
+ - Variants (sealed traits)
598
+ - Sequences (`List`, `Vector`, `Set`, `Array`, etc.)
599
+ - Maps (`Map[K, V]`)
600
+ - Options (`Option[A]`)
601
+ - Eithers (`Either[A, B]`)
602
+ - Wrappers (newtypes)
603
+
604
+ ## Cross-Platform Support
605
+
606
+ | Format | JVM | Scala.js |
607
+ |--------|-----|----------|
608
+ | JSON | ✓ | ✓ |
609
+ | TOON | ✓ | ✓ |
610
+ | MessagePack | ✓ | ✓ |
611
+ | Avro | ✓ | ✗ |
612
+ | Thrift | ✓ | ✗ |
613
+ | BSON | ✓ | ✗ |
614
+
615
+ ## Error Handling
616
+
617
+ All formats return `Either[SchemaError, A]` for decoding operations. Errors include path information for debugging:
618
+
619
+ ```scala mdoc:compile-only
620
+ import zio.blocks.schema._
621
+ import zio.blocks.schema.toon._
622
+
623
+ case class Person(name: String, age: Int)
624
+ object Person {
625
+   implicit val schema: Schema[Person] = Schema.derived
626
+ }
627
+
628
+ val codec = Schema[Person].derive(ToonFormat)
629
+
630
+ // Example: decoding invalid bytes
631
+ val invalidBytes = "invalid: data\nwrong: format".getBytes("UTF-8")
632
+ val result = codec.decode(invalidBytes)
633
+
634
+ result match {
635
+   case Right(person) => println(s"Decoded: $person")
636
+   case Left(error) =>
637
+     // SchemaError includes information about the decode failure
638
+     error.errors.foreach(e => println(s"Error: ${e.message}"))
639
+ }
640
+ ```