@toon-format/spec 2.0.0 → 2.1.0

package/CHANGELOG.md CHANGED
@@ -5,87 +5,83 @@ All notable changes to the TOON specification will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [2.1] - 2025-11-23
9
+
10
+ ### Changed
11
+
12
+ - Canonical encoding for objects as list items (§10):
13
+ - Encoders SHOULD emit `- key[N]{fields}:` only when the list-item object has exactly one field and that field is a tabular array.
14
+ - In all other cases, encoders SHOULD emit a bare `-` line and place all fields at depth +1; tabular array headers then appear at depth +1 and their rows at depth +2.
15
+
8
16
  ## [2.0] - 2025-11-10
9
17
 
10
18
  ### Breaking Changes
11
19
 
12
- - **Removed:** Length marker (`#`) prefix in array headers has been completely removed from the specification
13
- - The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only
14
- - Encoders MUST NOT emit `[#N]` format
15
- - Decoders MUST NOT accept `[#N]` format (breaking change from v1.5)
20
+ - Removed `[#N]` length-marker syntax in array headers; `[N]` is now the only valid format.
21
+ - Encoders MUST NOT emit `[#N]`; decoders MUST reject it.
16
22
 
17
23
  ### Removed
18
24
 
19
- - All references to length marker from terminology (§1.4), header syntax (§6), ABNF grammar, conformance requirements (§13.2), and parsing helpers (Appendix B)
20
- - `lengthMarker` encoder option removed from all implementations
21
- - Length marker test fixtures removed
25
+ - The `lengthMarker` encoder option and any CLI flags exposing it.
22
26
 
23
27
  ### Migration from v1.5
24
28
 
25
- - Update decoder implementations to reject `[#N]` syntax
26
- - Convert any existing `.toon` files using `[#N]` format to `[N]` format
27
- - Remove `lengthMarker` option from encoder configurations
28
- - Remove `--length-marker` CLI flags if present
29
+ - Update decoders to reject `[#N]` syntax.
30
+ - Convert existing `.toon` files using `[#N]` to `[N]`.
31
+ - Remove `lengthMarker` configuration and CLI options.
29
32
 
30
33
  ## [1.5] - 2025-11-08
31
34
 
32
35
  ### Added
33
36
 
34
- - Optional key folding for encoders: `keyFolding="safe"` mode with `flattenDepth` control to collapse single-key object chains into dotted-path notation (§13.4)
35
- - Optional path expansion for decoders: `expandPaths="safe"` mode to split dotted keys into nested objects, with conflict resolution tied to `strict` option (§13.4, §14.5)
36
- - IdentifierSegment terminology and path separator definition (fixed to `"."` in v1.5) (§1.9)
37
- - Deep-merge semantics for path expansion: recursive merge for objects, error on conflict when `strict=true`, last-write-wins (LWW) when `strict=false` (§13.4)
37
+ - Optional key folding for encoders: `keyFolding="safe"` with `flattenDepth` to collapse single-key object chains into dotted paths (§13.4).
38
+ - Optional path expansion for decoders: `expandPaths="safe"` to split dotted keys into nested objects with deep-merge semantics and conflict handling tied to `strict` (§13.4, §14.5).
39
+ - IdentifierSegment terminology and fixed `"."` path separator for safe folding/expansion (§1.9).
38
40
 
39
41
  ### Changed
40
42
 
41
- - Both new features default to OFF and are fully backward-compatible
42
- - Safe-mode folding requires IdentifierSegment validation, collision avoidance, and no quoting
43
+ - Safe-mode folding requires IdentifierSegment-only segments, no path separator in segments, no quoting, and collision avoidance.
44
+ - Both features default to `off` and are backward-compatible.
43
45
 
44
46
  ## [1.4] - 2025-11-05
45
47
 
46
48
  ### Changed
47
49
 
48
- - Removed JavaScript-specific normalization details from specification; replaced with language-agnostic requirements (Section 3)
49
- - Defined canonical number format for encoders: no exponent notation, no trailing zeros, no leading zeros except "0" (Section 2)
50
- - Clarified decoder handling of exponent notation and out-of-range numbers (Section 2)
51
- - Expanded `\w` regex notation to explicit character class `[A-Za-z0-9_]` for cross-language clarity (Section 7.3)
52
- - Clarified non-strict mode tab handling as implementation-defined (Section 12)
50
+ - Generalized normalization rules and defined canonical number format for encoders (no exponent notation, no trailing zeros, no leading zeros except `"0"`), plus decoder handling of exponent forms and out-of-range numbers (§2-§3).
51
+ - Replaced `\w` with explicit `[A-Za-z0-9_]` in key regexes for cross-language clarity (§7.3).
52
+ - Clarified non-strict mode tab handling as implementation-defined (§12).
53
53
 
54
54
  ### Added
55
55
 
56
- - Appendix G: Host Type Normalization Examples with guidance for Go, JavaScript, Python, and Rust implementations
56
+ - Appendix G with host-type normalization examples for Go, JavaScript, Python, and Rust.
57
57
 
58
58
  ## [1.3] - 2025-10-31
59
59
 
60
60
  ### Added
61
61
 
62
- - Numeric precision requirements: JavaScript implementations SHOULD use `Number.toString()` precision (15-17 digits), all implementations MUST preserve round-trip fidelity (Section 2)
63
- - RFC 5234 core rules (ALPHA, DIGIT, DQUOTE, HTAB, LF, SP) to ABNF grammar definitions (Section 6)
62
+ - Numeric precision requirements: JavaScript implementations SHOULD use `Number.toString()` precision (15–17 digits); all implementations MUST preserve round-trip fidelity (§2).
63
+ - RFC 5234 core rules (ALPHA, DIGIT, DQUOTE, HTAB, LF, SP) to ABNF grammar definitions (§6).
64
64
 
65
65
  ## [1.2] - 2025-10-29
66
66
 
67
67
  ### Changed
68
68
 
69
- - Clarified delimiter scoping behavior between array headers
70
- - Tightened strict-mode indentation requirements: leading spaces MUST be exact multiples of indentSize; tabs in indentation MUST error
71
- - Defined blank-line and trailing-newline decoding behavior with explicit skipping rules outside arrays
72
- - Clarified hyphen-based quoting: "-" or any string starting with "-" MUST be quoted
73
- - Clarified BigInt normalization: values outside safe integer range are converted to quoted decimal strings
74
- - Clarified row/key disambiguation: uses first unquoted delimiter vs colon position
69
+ - Tightened delimiter scoping, indentation, blank-line handling, and hyphen-based quoting rules (§11-§12).
70
+ - Clarified BigInt normalization (out-of-range values → quoted decimal strings) and row/key disambiguation (first unquoted delimiter vs colon) (§2, §9.3).
75
71
 
76
72
  ## [1.1] - 2025-10-29
77
73
 
78
74
  ### Added
79
75
 
80
- - Strict-mode rules
81
- - Delimiter-aware parsing
82
- - Decoder options (indent, strict)
76
+ - Strict-mode rules.
77
+ - Delimiter-aware parsing.
78
+ - Decoder options (`indent`, `strict`).
83
79
 
84
80
  ## [1.0] - 2025-10-28
85
81
 
86
82
  ### Added
87
83
 
88
- - Initial specification release
89
- - Encoding normalization rules
90
- - Decoding interpretation guidelines
91
- - Conformance requirements
84
+ - Initial specification release.
85
+ - Encoding normalization rules.
86
+ - Decoding interpretation guidelines.
87
+ - Conformance requirements.
package/README.md CHANGED
@@ -1,16 +1,16 @@
1
1
  # TOON Format Specification
2
2
 
3
- [![SPEC v2.0](https://img.shields.io/badge/spec-v2.0-lightgrey)](./SPEC.md)
4
- [![Tests](https://img.shields.io/badge/tests-340-green)](./tests/fixtures/)
3
+ [![SPEC v2.1](https://img.shields.io/badge/spec-v2.1-lightgrey)](./SPEC.md)
4
+ [![Tests](https://img.shields.io/badge/tests-344-green)](./tests/fixtures/)
5
5
  [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
6
6
 
7
- This repository contains the official specification for **Token-Oriented Object Notation (TOON)**, a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage.
7
+ This repository contains the official specification for **Token-Oriented Object Notation (TOON)**, a compact, human-readable encoding of the JSON data model for LLM prompts. It provides a lossless serialization of the same objects, arrays, and primitives as JSON, but in a syntax that minimizes tokens and makes structure easy for models to follow.
8
8
 
9
9
  ## 📋 Specification
10
10
 
11
11
  [→ Read the full specification (SPEC.md)](./SPEC.md)
12
12
 
13
- - **Version:** 2.0 (2025-11-10)
13
+ - **Version:** 2.1 (2025-11-23)
14
14
  - **Status:** Working Draft
15
15
  - **License:** MIT
16
16
 
@@ -18,38 +18,76 @@ The specification includes complete grammar (ABNF), encoding rules, validation r
18
18
 
19
19
  ## What is TOON?
20
20
 
21
- **Token-Oriented Object Notation** is a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage. It's intended for LLM input, not output.
21
+ > [!IMPORTANT]
22
+ > For a high-level overview of TOON, its features and benefits, design goals, and comparisons to other formats, see the [`toon-format/toon` repository](https://github.com/toon-format/toon).
22
23
 
23
- TOON's sweet spot is **uniform arrays of objects** – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts. For deeply nested or non-uniform data, JSON may be more efficient.
24
+ ## Serialization Example
24
25
 
25
- **Key Features:**
26
-
27
- - 💸 **Token-efficient:** typically 30–60% fewer tokens than JSON
28
- - 🤿 **LLM-friendly guardrails:** explicit lengths and fields enable validation
29
- - 🍱 **Minimal syntax:** removes redundant punctuation (braces, brackets, most quotes)
30
- - 📐 **Indentation-based structure:** like YAML, uses whitespace instead of braces
31
- - 🧺 **Tabular arrays:** declare keys once, stream data as rows
32
-
33
- ## Quick Example
34
-
35
- **JSON:**
26
+ <table>
27
+ <tr>
28
+ <th>JSON</th>
29
+ <th>TOON</th>
30
+ </tr>
31
+ <tr>
32
+ <td>
36
33
 
37
34
  ```json
38
35
  {
39
- "users": [
40
- { "id": 1, "name": "Alice", "role": "admin" },
41
- { "id": 2, "name": "Bob", "role": "user" }
36
+ "context": {
37
+ "task": "Our favorite hikes together",
38
+ "location": "Boulder",
39
+ "season": "spring_2025"
40
+ },
41
+ "friends": ["ana", "luis", "sam"],
42
+ "hikes": [
43
+ {
44
+ "id": 1,
45
+ "name": "Blue Lake Trail",
46
+ "distanceKm": 7.5,
47
+ "elevationGain": 320,
48
+ "companion": "ana",
49
+ "wasSunny": true
50
+ },
51
+ {
52
+ "id": 2,
53
+ "name": "Ridge Overlook",
54
+ "distanceKm": 9.2,
55
+ "elevationGain": 540,
56
+ "companion": "luis",
57
+ "wasSunny": false
58
+ },
59
+ {
60
+ "id": 3,
61
+ "name": "Wildflower Loop",
62
+ "distanceKm": 5.1,
63
+ "elevationGain": 180,
64
+ "companion": "sam",
65
+ "wasSunny": true
66
+ }
42
67
  ]
43
68
  }
44
69
  ```
45
70
 
46
- **TOON:**
71
+ </td>
72
+ <td>
47
73
 
74
+ ```toon
75
+ context:
76
+   task: Our favorite hikes together
77
+   location: Boulder
78
+   season: spring_2025
79
+
80
+ friends[3]: ana,luis,sam
81
+
82
+ hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
83
+   1,Blue Lake Trail,7.5,320,ana,true
84
+   2,Ridge Overlook,9.2,540,luis,false
85
+   3,Wildflower Loop,5.1,180,sam,true
48
86
  ```
49
- users[2]{id,name,role}:
50
- 1,Alice,admin
51
- 2,Bob,user
52
- ```
87
+
88
+ </td>
89
+ </tr>
90
+ </table>
53
91
 
54
92
  ## Reference Implementation
55
93
 
@@ -84,6 +122,22 @@ The [tests/fixtures/](./tests/fixtures/) directory contains **language-agnostic
84
122
 
85
123
  See [tests/README.md](./tests/README.md) for detailed fixture format and usage instructions.
86
124
 
125
+ ## Media Type & File Extension
126
+
127
+ TOON defines a provisional media type (see §18.2 of the specification):
128
+
129
+ - **Media type:** `text/toon` (provisional, pending IANA registration)
130
+ - **File extension:** `.toon`
131
+ - **Charset:** Always UTF-8
132
+
133
+ For HTTP usage:
134
+
135
+ ```http
136
+ Content-Type: text/toon
137
+ ```
138
+
139
+ See the full [IANA Considerations section](SPEC.md#18-iana-considerations) for details.
140
+
87
141
  ## Contributing
88
142
 
89
143
  We welcome contributions to improve the specification! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for:
package/SPEC.md CHANGED
@@ -2,9 +2,9 @@
2
2
 
3
3
  ## Token-Oriented Object Notation
4
4
 
5
- **Version:** 2.0
5
+ **Version:** 2.1
6
6
 
7
- **Date:** 2025-11-10
7
+ **Date:** 2025-11-23
8
8
 
9
9
  **Status:** Working Draft
10
10
 
@@ -20,7 +20,7 @@ Token-Oriented Object Notation (TOON) is a line-oriented, indentation-based text
20
20
 
21
21
  ## Status of This Document
22
22
 
23
- This document is a Working Draft v2.0 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
23
+ This document is a Working Draft v2.1 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
24
24
 
25
25
  This specification is stable for implementation but not yet finalized. Breaking changes may occur in future major versions.
26
26
 
@@ -499,7 +499,15 @@ Decoding:
499
499
  For an object appearing as a list item:
500
500
 
501
501
  - Empty object list item: a single "-" at the list-item indentation level.
502
- - First field on the hyphen line:
502
+ - Encoding selection (normative):
503
+ - When an object has **exactly one field** and that field encodes to a tabular array, encoders SHOULD use the compact form with the tabular header on the hyphen line:
504
+ - Tabular array: - key[N<delim?>]{fields}:
505
+ - Followed by tabular rows at depth +1 (relative to the hyphen line).
506
+ - For all other cases (multiple fields, or single non-tabular field), encoders SHOULD emit a bare hyphen on its own line:
507
+ - Bare hyphen: -
508
+ - All fields appear at depth +1 under the hyphen line in encounter order, using normal object field rules (Section 8).
509
+ - When a field is a tabular array, its header appears at depth +1 and its rows at depth +2 (relative to the hyphen line).
510
+ - First field on the hyphen line (legacy encoding, still valid for decoding):
503
511
  - Primitive: - key: value
504
512
  - Primitive array: - key[M<delim?>]: v1<delim>…
505
513
  - Tabular array: - key[N<delim?>]{fields}:
@@ -508,7 +516,7 @@ For an object appearing as a list item:
508
516
  - Followed by list items at depth +1.
509
517
  - Object: - key:
510
518
  - Nested object fields appear at depth +2 (i.e., one deeper than subsequent sibling fields of the same list item).
511
- - Remaining fields of the same object appear at depth +1 under the hyphen line in encounter order, using normal object field rules.
519
+ - Remaining fields of the same object appear at depth +1 under the hyphen line in encounter order, using normal object field rules.
512
520
 
513
521
  Decoding:
514
522
  - The first field is parsed from the hyphen line. If it is a nested object (- key:), nested fields are at +2 relative to the hyphen line; subsequent fields of the same list item are at +1.
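The encoding-selection rule in this hunk can be sketched in Python. This is a minimal illustration under stated assumptions, not the reference implementation: `is_tabular`, `tabular`, and `encode_list_item` are hypothetical names, the tabular check assumes a non-empty array of objects with identical keys (in the same order) and primitive values, and string quoting, non-comma delimiters, and nested non-tabular values are left out.

```python
def is_primitive(value):
    # JSON primitives: string, number, boolean, null.
    return value is None or isinstance(value, (str, int, float, bool))


def is_tabular(value):
    # Assumption: a non-empty list of non-empty dicts sharing the same keys
    # in the same order, with primitive values only.
    if not isinstance(value, list) or not value:
        return False
    if not all(isinstance(row, dict) and row for row in value):
        return False
    fields = list(value[0])
    return all(
        list(row) == fields and all(is_primitive(v) for v in row.values())
        for row in value
    )


def fmt(value):
    # Simplified primitive formatting; a real encoder also applies quoting rules.
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def tabular(key, rows):
    # Build the header "key[N]{f1,f2}:" and the unindented row strings.
    fields = list(rows[0])
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    body = [",".join(fmt(row[f]) for f in fields) for row in rows]
    return header, body


def encode_list_item(obj, depth, indent=2):
    """Return the lines for one object appearing as a list item at `depth`."""
    pad = " " * (indent * depth)
    step = " " * indent
    if not obj:
        return [pad + "-"]  # empty object list item
    items = list(obj.items())
    if len(items) == 1 and is_tabular(items[0][1]):
        # Compact form: header on the hyphen line, rows at depth +1.
        header, body = tabular(*items[0])
        return [pad + "- " + header] + [pad + step + row for row in body]
    # Bare hyphen: all fields at depth +1, tabular rows at depth +2.
    lines = [pad + "-"]
    for key, value in items:
        if is_tabular(value):
            header, body = tabular(key, value)
            lines.append(pad + step + header)
            lines.extend(pad + step + step + row for row in body)
        elif is_primitive(value):
            lines.append(pad + step + key + ": " + fmt(value))
        else:
            raise NotImplementedError("nested values are outside this sketch")
    return lines
```

For the `users`/`status` object used in the Appendix A example, `encode_list_item(obj, depth=1)` yields a bare `-` line followed by the header at four spaces and the rows at six spaces, assuming a two-space indent.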
@@ -894,7 +902,11 @@ This specification does not request IANA registration at this time, as the forma
894
902
 
895
903
  ### 18.2 Provisional Media Type
896
904
 
897
- The following provisional media type designation is RECOMMENDED for experimental implementations:
905
+ Until IANA registration is completed, implementations SHOULD use:
906
+ - Media type: `text/toon`
907
+ - File extension: `.toon`
908
+
909
+ Full designation details:
898
910
 
899
911
  Type name: text
900
912
 
@@ -988,12 +1000,15 @@ items[2]:
988
1000
  Nested tabular inside a list item:
989
1001
  ```
990
1002
  items[1]:
991
-   - users[2]{id,name}:
992
-     1,Ada
993
-     2,Bob
1003
+   -
1004
+     users[2]{id,name}:
1005
+       1,Ada
1006
+       2,Bob
994
1007
      status: active
995
1008
  ```
996
1009
 
1010
+ Note: Encoders use this format (bare hyphen with all fields indented) for objects with multiple fields. Older encodings may place the first field on the hyphen line; both are valid for decoders.
1011
+
997
1012
  Delimiter variations:
998
1013
  ```
999
1014
  items[2 ]{sku name qty price}:
@@ -1218,52 +1233,39 @@ Note: Host-type normalization tests (e.g., BigInt, Date, Set, Map) are language-
1218
1233
 
1219
1234
  ## Appendix D: Document Changelog (Informative)
1220
1235
 
1236
+ This appendix summarizes major changes between spec versions. For the complete changelog, see [`CHANGELOG.md`](./CHANGELOG.md) in the specification repository.
1237
+
1238
+ ### v2.1 (2025-11-23)
1239
+
1240
+ - Tightened canonical encoding for objects as list items (§10): bare `-` for multi-field objects, compact `- key[N]{fields}:` only for single-field tabular arrays, to improve visual consistency and LLM readability.
1241
+
1221
1242
  ### v2.0 (2025-11-10)
1222
1243
 
1223
- - Breaking change: Length marker (`#`) prefix in array headers has been completely removed from the specification.
1224
- - The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only.
1225
- - Encoders MUST NOT emit `[#N]` format.
1226
- - Decoders MUST NOT accept `[#N]` format (breaking change from v1.5).
1227
- - Removed all references to length marker from terminology, grammar, conformance requirements, and parsing helpers.
1244
+ - Removed `[#N]` length-marker syntax from array headers; `[N]` is now the only valid form.
1228
1245
 
1229
1246
  ### v1.5 (2025-11-08)
1230
1247
 
1231
- - Added optional key folding for encoders: `keyFolding='safe'` mode with `flattenDepth` control (§13.4).
1232
- - Added optional path expansion for decoders: `expandPaths='safe'` mode with conflict resolution tied to existing `strict` option (§13.4).
1233
- - Defined safe-mode requirements for folding: IdentifierSegment validation, no path separator in segments, collision avoidance, no quoting required (§7.3, §13.4).
1234
- - Specified deep-merge semantics for expansion: recursive merge for objects; conflict policy (error in strict mode, LWW when strict=false) for non-objects (§13.4).
1235
- - Added strict-mode error category for path expansion conflicts (§14.5).
1236
- - Both features default to OFF; fully backward-compatible.
1248
+ - Added optional key folding (`keyFolding="safe"`) and path expansion (`expandPaths="safe"`) with deep-merge semantics and strict-mode conflict handling (§13.4, §14.5).
1237
1249
 
1238
1250
  ### v1.4 (2025-11-05)
1239
1251
 
1240
- - Removed JavaScript-specific normalization details; replaced with language-agnostic requirements (Section 3).
1241
- - Defined canonical number format for encoders and decoder acceptance rules (Section 2).
1242
- - Added Appendix G with host-type normalization examples for Go, JavaScript, Python, and Rust.
1243
- - Clarified non-strict mode tab handling as implementation-defined (Section 12).
1244
- - Expanded regex notation for cross-language clarity (Section 7.3).
1252
+ - Generalized normalization and numeric canonicalization rules, and added host-type normalization guidance (Appendix G).
1245
1253
 
1246
1254
  ### v1.3 (2025-10-31)
1247
1255
 
1248
- - Added numeric precision requirements: JavaScript implementations SHOULD use Number.toString() precision (15-17 digits), all implementations MUST preserve round-trip fidelity (Section 2).
1249
- - Added RFC 5234 core rules (ALPHA, DIGIT, DQUOTE, HTAB, LF, SP) to ABNF grammar definitions (Section 6).
1256
+ - Added numeric precision guidance and ABNF core rules for headers and keys (§2, §6).
1250
1257
 
1251
1258
  ### v1.2 (2025-10-29)
1252
1259
 
1253
- - Clarified delimiter scoping behavior between array headers.
1254
- - Tightened strict-mode indentation requirements: leading spaces MUST be exact multiples of indentSize; tabs in indentation MUST error.
1255
- - Defined blank-line and trailing-newline decoding behavior with explicit skipping rules outside arrays.
1256
- - Clarified hyphen-based quoting: "-" or any string starting with "-" MUST be quoted.
1257
- - Clarified BigInt normalization: values outside safe integer range are converted to quoted decimal strings.
1258
- - Clarified row/key disambiguation: uses first unquoted delimiter vs colon position.
1260
+ - Tightened delimiter scoping, indentation, blank-line handling, hyphen-based quoting, BigInt normalization, and row/key disambiguation rules (§2, §9, §11-§12).
1259
1261
 
1260
1262
  ### v1.1 (2025-10-29)
1261
1263
 
1262
- Added strict-mode rules, delimiter-aware parsing, and decoder options (indent, strict).
1264
+ - Introduced strict-mode validation, delimiter-aware parsing, and decoder options (indent, strict).
1263
1265
 
1264
1266
  ### v1.0 (2025-10-28)
1265
1267
 
1266
- Initial encoding, normalization, and conformance rules.
1268
+ - Initial specification: encoding normalization, decoding interpretation, and conformance requirements.
1267
1269
 
1268
1270
  ## Appendix E: Acknowledgments and License
1269
1271
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "@toon-format/spec",
3
3
  "type": "module",
4
- "version": "2.0.0",
4
+ "version": "2.1.0",
5
5
  "packageManager": "pnpm@10.19.0",
6
6
  "description": "Official specification for Token-Oriented Object Notation (TOON)",
7
7
  "author": "Johann Schopplich <hello@johannschopplich.com>",
@@ -1,5 +1,5 @@
1
1
  {
2
- "version": "1.4",
2
+ "version": "2.1",
3
3
  "category": "decode",
4
4
  "description": "Nested and mixed array decoding - list format, arrays of arrays, root arrays, mixed types",
5
5
  "tests": [
@@ -52,7 +52,7 @@
52
52
  "specSection": "9.4"
53
53
  },
54
54
  {
55
- "name": "parses nested tabular arrays as first field on hyphen line",
55
+ "name": "parses nested tabular arrays as first field on hyphen line (legacy)",
56
56
  "input": "items[1]:\n - users[2]{id,name}:\n 1,Ada\n 2,Bob\n status: active",
57
57
  "expected": {
58
58
  "items": [
@@ -65,14 +65,33 @@
65
65
  }
66
66
  ]
67
67
  },
68
- "specSection": "10"
68
+ "specSection": "10",
69
+ "note": "Still valid for backward compatibility"
70
+ },
71
+ {
72
+ "name": "parses nested tabular arrays in list items with bare hyphen",
73
+ "input": "items[1]:\n -\n users[2]{id,name}:\n 1,Ada\n 2,Bob\n status: active",
74
+ "expected": {
75
+ "items": [
76
+ {
77
+ "users": [
78
+ { "id": 1, "name": "Ada" },
79
+ { "id": 2, "name": "Bob" }
80
+ ],
81
+ "status": "active"
82
+ }
83
+ ]
84
+ },
85
+ "specSection": "10",
86
+ "minSpecVersion": "2.1",
87
+ "note": "Canonical v2.1+ encoding (bare hyphen with all fields indented)"
69
88
  },
70
89
  {
71
90
  "name": "parses objects containing arrays (including empty arrays) in list format",
72
- "input": "items[1]:\n - name: test\n data[0]:",
91
+ "input": "items[1]:\n - name: Ada\n data[0]:",
73
92
  "expected": {
74
93
  "items": [
75
- { "name": "test", "data": [] }
94
+ { "name": "Ada", "data": [] }
76
95
  ]
77
96
  },
78
97
  "specSection": "9.4"
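The `minSpecVersion` field added to fixtures in this hunk suggests that a test runner can skip cases newer than the spec version it implements. The sketch below shows one way to do that. The comparison semantics and the names `parse_version` and `applicable_tests` are assumptions for illustration; the fixture format documentation in `tests/README.md` is the authority.

```python
def parse_version(text):
    # "2.1" -> (2, 1); tuples compare component by component.
    return tuple(int(part) for part in text.split("."))


def applicable_tests(fixture, implemented_version):
    """Keep tests whose minSpecVersion (if any) is not newer than the implementation."""
    target = parse_version(implemented_version)
    return [
        test
        for test in fixture["tests"]
        # Assumption: a test without minSpecVersion applies to every version.
        if parse_version(test.get("minSpecVersion", "0")) <= target
    ]
```

A v2.0 implementation would then still run the legacy hyphen-line test while skipping the bare-hyphen case marked `"minSpecVersion": "2.1"`.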
@@ -120,35 +139,41 @@
120
139
  "specSection": "9.2"
121
140
  },
122
141
  {
123
- "name": "parses root arrays of primitives (inline)",
142
+ "name": "parses root-level primitive array inline",
124
143
  "input": "[5]: x,y,\"true\",true,10",
125
144
  "expected": ["x", "y", "true", true, 10],
126
145
  "specSection": "9.1"
127
146
  },
128
147
  {
129
- "name": "parses root arrays of uniform objects in tabular format",
148
+ "name": "parses root-level array of uniform objects in tabular format",
130
149
  "input": "[2]{id}:\n 1\n 2",
131
150
  "expected": [{ "id": 1 }, { "id": 2 }],
132
151
  "specSection": "9.3"
133
152
  },
134
153
  {
135
- "name": "parses root arrays of non-uniform objects in list format",
154
+ "name": "parses root-level array of non-uniform objects in list format",
136
155
  "input": "[2]:\n - id: 1\n - id: 2\n name: Ada",
137
156
  "expected": [{ "id": 1 }, { "id": 2, "name": "Ada" }],
138
157
  "specSection": "9.4"
139
158
  },
140
159
  {
141
- "name": "parses empty root arrays",
142
- "input": "[0]:",
143
- "expected": [],
144
- "specSection": "9.1"
160
+ "name": "parses root-level array mixing primitive, object, and array of objects in list format",
161
+ "input": "[3]:\n - summary\n - id: 1\n name: Ada\n - [2]:\n - id: 2\n - status: draft",
162
+ "expected": ["summary", { "id": 1, "name": "Ada" }, [{ "id": 2 }, { "status": "draft" }]],
163
+ "specSection": "9.4"
145
164
  },
146
165
  {
147
- "name": "parses root arrays of arrays",
166
+ "name": "parses root-level array of arrays",
148
167
  "input": "[2]:\n - [2]: 1,2\n - [0]:",
149
168
  "expected": [[1, 2], []],
150
169
  "specSection": "9.2"
151
170
  },
171
+ {
172
+ "name": "parses empty root-level array",
173
+ "input": "[0]:",
174
+ "expected": [],
175
+ "specSection": "9.1"
176
+ },
152
177
  {
153
178
  "name": "parses complex mixed object with arrays and nested objects",
154
179
  "input": "user:\n id: 123\n name: Ada\n tags[2]: reading,gaming\n active: true\n prefs[0]:",
@@ -164,7 +189,7 @@
164
189
  "specSection": "8"
165
190
  },
166
191
  {
167
- "name": "parses arrays mixing primitives, objects and strings (list format)",
192
+ "name": "parses arrays mixing primitives, objects, and strings in list format",
168
193
  "input": "items[3]:\n - 1\n - a: 1\n - text",
169
194
  "expected": {
170
195
  "items": [1, { "a": 1 }, "text"]
@@ -59,7 +59,7 @@
59
59
  "specSection": "9.3"
60
60
  },
61
61
  {
62
- "name": "unquoted colon terminates tabular rows and starts key-value pair",
62
+ "name": "treats unquoted colon as terminator for tabular rows and start of key-value pair",
63
63
  "input": "items[2]{id,name}:\n 1,Alice\n 2,Bob\ncount: 2",
64
64
  "expected": {
65
65
  "items": [
@@ -66,7 +66,7 @@
66
66
  "specSection": "11"
67
67
  },
68
68
  {
69
- "name": "nested arrays inside list items default to comma delimiter",
69
+ "name": "parses nested arrays inside list items with default comma delimiter",
70
70
  "input": "items[1\t]:\n - tags[3]: a,b,c",
71
71
  "expected": {
72
72
  "items": [{ "tags": ["a", "b", "c"] }]
@@ -75,7 +75,7 @@
75
75
  "note": "Parent uses tab, nested defaults to comma"
76
76
  },
77
77
  {
78
- "name": "nested arrays inside list items default to comma with pipe parent",
78
+ "name": "parses nested arrays inside list items with default comma delimiter when parent uses pipe",
79
79
  "input": "items[1|]:\n - tags[3]: a,b,c",
80
80
  "expected": {
81
81
  "items": [{ "tags": ["a", "b", "c"] }]
@@ -83,25 +83,25 @@
83
83
  "specSection": "11"
84
84
  },
85
85
  {
86
- "name": "parses root arrays with tab delimiter",
86
+ "name": "parses root-level array with tab delimiter",
87
87
  "input": "[3\t]: x\ty\tz",
88
88
  "expected": ["x", "y", "z"],
89
89
  "specSection": "11"
90
90
  },
91
91
  {
92
- "name": "parses root arrays with pipe delimiter",
92
+ "name": "parses root-level array with pipe delimiter",
93
93
  "input": "[3|]: x|y|z",
94
94
  "expected": ["x", "y", "z"],
95
95
  "specSection": "11"
96
96
  },
97
97
  {
98
- "name": "parses root arrays of objects with tab delimiter",
98
+ "name": "parses root-level array of objects with tab delimiter",
99
99
  "input": "[2\t]{id}:\n 1\n 2",
100
100
  "expected": [{ "id": 1 }, { "id": 2 }],
101
101
  "specSection": "11"
102
102
  },
103
103
  {
104
- "name": "parses root arrays of objects with pipe delimiter",
104
+ "name": "parses root-level array of objects with pipe delimiter",
105
105
  "input": "[2|]{id}:\n 1\n 2",
106
106
  "expected": [{ "id": 1 }, { "id": 2 }],
107
107
  "specSection": "11"
@@ -4,7 +4,7 @@
4
4
  "description": "Strict mode indentation validation - non-multiple indentation, tab characters, custom indent sizes",
5
5
  "tests": [
6
6
  {
7
- "name": "throws when object field has non-multiple indentation (3 spaces with indent=2)",
7
+ "name": "throws on object field with non-multiple indentation (3 spaces with indent=2)",
8
8
  "input": "a:\n   b: 1",
9
9
  "expected": null,
10
10
  "shouldError": true,
@@ -15,7 +15,7 @@
15
15
  "specSection": "14.3"
16
16
  },
17
17
  {
18
- "name": "throws when list item has non-multiple indentation (3 spaces with indent=2)",
18
+ "name": "throws on list item with non-multiple indentation (3 spaces with indent=2)",
19
19
  "input": "items[2]:\n - id: 1\n - id: 2",
20
20
  "expected": null,
21
21
  "shouldError": true,
@@ -26,7 +26,7 @@
26
26
  "specSection": "14.3"
27
27
  },
28
28
  {
29
- "name": "throws with custom indent size when non-multiple (3 spaces with indent=4)",
29
+ "name": "throws on non-multiple indentation with custom indent=4 (3 spaces)",
30
30
  "input": "a:\n   b: 1",
31
31
  "expected": null,
32
32
  "shouldError": true,
@@ -51,7 +51,7 @@
51
51
  "specSection": "12"
52
52
  },
53
53
  {
54
- "name": "throws when tab character used in indentation",
54
+ "name": "throws on tab character used in indentation",
55
55
  "input": "a:\n\tb: 1",
56
56
  "expected": null,
57
57
  "shouldError": true,
@@ -61,7 +61,7 @@
61
61
  "specSection": "14.3"
62
62
  },
63
63
  {
64
- "name": "throws when mixed tabs and spaces in indentation",
64
+ "name": "throws on mixed tabs and spaces in indentation",
65
65
  "input": "a:\n \tb: 1",
66
66
  "expected": null,
67
67
  "shouldError": true,
@@ -71,7 +71,7 @@
71
71
  "specSection": "14.3"
72
72
  },
73
73
  {
74
- "name": "throws when tab at start of line",
74
+ "name": "throws on tab at start of line",
75
75
  "input": "\ta: 1",
76
76
  "expected": null,
77
77
  "shouldError": true,
@@ -144,7 +144,7 @@
144
144
  "specSection": "12"
145
145
  },
146
146
  {
147
- "name": "empty lines do not trigger validation errors",
147
+ "name": "parses empty lines without validation errors",
148
148
  "input": "a: 1\n\nb: 2",
149
149
  "expected": {
150
150
  "a": 1,
@@ -156,7 +156,7 @@
156
156
  "specSection": "12"
157
157
  },
158
158
  {
159
- "name": "root-level content (0 indentation) is always valid",
159
+ "name": "parses root-level content (0 indentation) as always valid",
160
160
  "input": "a: 1\nb: 2\nc: 3",
161
161
  "expected": {
162
162
  "a": 1,
@@ -169,7 +169,7 @@
169
169
  "specSection": "12"
170
170
  },
171
171
  {
172
- "name": "lines with only spaces are not validated if empty",
172
+ "name": "parses lines with only spaces without validation if empty",
173
173
  "input": "a: 1\n \nb: 2",
174
174
  "expected": {
175
175
  "a": 1,
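The strict-mode indentation fixtures in this file exercise three behaviours: leading spaces must be an exact multiple of the indent size, tabs in indentation are errors, and empty or spaces-only lines are not validated. A minimal Python sketch of such a check follows. `check_indentation` is a hypothetical helper, and treating a line as blank only when it contains nothing but spaces is an assumption.

```python
def check_indentation(text, indent_size=2):
    """Raise ValueError on the first line that violates strict-mode indentation."""
    for number, line in enumerate(text.split("\n"), start=1):
        # Assumption: lines that are empty or contain only spaces are skipped.
        if line.strip(" ") == "":
            continue
        # Leading run of spaces and tabs before the first content character.
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in leading:
            raise ValueError(f"line {number}: tab in indentation")
        if len(leading) % indent_size != 0:
            raise ValueError(
                f"line {number}: {len(leading)} spaces is not a multiple of {indent_size}"
            )
```

Root-level lines have zero leading spaces, which is a multiple of any indent size, so they always pass.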
@@ -4,7 +4,7 @@
4
4
  "description": "Root form detection - empty document, single primitive, multiple primitives",
5
5
  "tests": [
6
6
  {
7
- "name": "empty document decodes to empty object",
7
+ "name": "parses empty document as empty object",
8
8
  "input": "",
9
9
  "expected": {},
10
10
  "options": {
@@ -18,14 +18,14 @@
18
18
  "specSection": "14.1"
19
19
  },
20
20
  {
21
- "name": "throws when tabular row value count does not match header field count",
21
+ "name": "throws on tabular row value count mismatch with header field count",
22
22
  "input": "items[2]{id,name}:\n 1,Ada\n 2",
23
23
  "expected": null,
24
24
  "shouldError": true,
25
25
  "specSection": "14.1"
26
26
  },
27
27
  {
28
- "name": "throws when tabular row count does not match header length",
28
+ "name": "throws on tabular row count mismatch with header length",
29
29
  "input": "[1]{id}:\n 1\n 2",
30
30
  "expected": null,
31
31
  "shouldError": true,
@@ -49,7 +49,7 @@
49
49
  "specSection": "12"
50
50
  },
51
51
  {
52
- "name": "empty tokens decode to empty string",
52
+ "name": "parses empty tokens as empty string",
53
53
  "input": "items[3]: a,,c",
54
54
  "expected": {
55
55
  "items": ["a", "", "c"]
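The two `throws on tabular row ...` fixtures above check that each row has as many values as the header has fields, and that the number of rows equals the declared length `N`. The Python sketch below illustrates both checks. It is deliberately simplified: `decode_tabular` and the `HEADER` pattern are hypothetical, every line after the header is assumed to be a row, quoted values are not handled, and values are kept as strings rather than converted to numbers or booleans.

```python
import re

# key[N<delim?>]{fields}: where the optional delimiter marker is a tab or pipe.
HEADER = re.compile(
    r"^(?P<key>[^\[\]{}:]*)\[(?P<n>\d+)(?P<delim>[\t|]?)\]\{(?P<fields>[^}]*)\}:$"
)


def decode_tabular(text):
    """Decode one tabular array, validating row width and row count."""
    lines = text.split("\n")
    match = HEADER.match(lines[0])
    if match is None:
        raise ValueError("not a tabular array header")
    delim = match["delim"] or ","  # comma is the default delimiter
    fields = match["fields"].split(delim)
    rows = []
    for line in lines[1:]:
        values = line.lstrip(" ").split(delim)
        if len(values) != len(fields):
            raise ValueError("row value count does not match header field count")
        rows.append(dict(zip(fields, values)))
    if len(rows) != int(match["n"]):
        raise ValueError("row count does not match header length")
    return match["key"], rows
```

Both failing fixture inputs (`items[2]{id,name}:` with a one-value second row, and `[1]{id}:` followed by two rows) raise `ValueError` in this sketch.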
@@ -1,5 +1,5 @@
1
1
  {
2
- "version": "1.4",
2
+ "version": "2.1",
3
3
  "category": "encode",
4
4
  "description": "Nested and mixed array encoding - arrays of arrays, mixed type arrays, root arrays",
5
5
  "tests": [
@@ -50,14 +50,16 @@
50
50
  {
51
51
  "name": "encodes root-level array of non-uniform objects in list format",
52
52
  "input": [{ "id": 1 }, { "id": 2, "name": "Ada" }],
53
- "expected": "[2]:\n - id: 1\n - id: 2\n name: Ada",
54
- "specSection": "9.4"
53
+ "expected": "[2]:\n -\n id: 1\n -\n id: 2\n name: Ada",
54
+ "specSection": "9.4",
55
+ "minSpecVersion": "2.1"
55
56
  },
56
57
  {
57
- "name": "encodes empty root-level array",
58
- "input": [],
59
- "expected": "[0]:",
60
- "specSection": "9.1"
58
+ "name": "encodes root-level array mixing primitive, object, and array of objects in list format",
59
+ "input": ["summary", { "id": 1, "name": "Ada" }, [{ "id": 2 }, { "status": "draft" }]],
60
+ "expected": "[3]:\n - summary\n -\n id: 1\n name: Ada\n - [2]:\n -\n id: 2\n -\n status: draft",
61
+ "specSection": "9.4",
62
+ "minSpecVersion": "2.1"
61
63
  },
62
64
  {
63
65
  "name": "encodes root-level arrays of arrays",
@@ -65,6 +67,12 @@
65
67
  "expected": "[2]:\n - [2]: 1,2\n - [0]:",
66
68
  "specSection": "9.2"
67
69
  },
70
+ {
71
+ "name": "encodes empty root-level array",
72
+ "input": [],
73
+ "expected": "[0]:",
74
+ "specSection": "9.1"
75
+ },
68
76
  {
69
77
  "name": "encodes complex nested structure",
70
78
  "input": {
@@ -84,16 +92,18 @@
84
92
  "input": {
85
93
  "items": [1, { "a": 1 }, "text"]
86
94
  },
87
- "expected": "items[3]:\n - 1\n - a: 1\n - text",
88
- "specSection": "9.4"
95
+ "expected": "items[3]:\n - 1\n -\n a: 1\n - text",
96
+ "specSection": "9.4",
97
+ "minSpecVersion": "2.1"
89
98
  },
90
99
  {
91
100
  "name": "uses list format for arrays mixing objects and arrays",
92
101
  "input": {
93
102
  "items": [{ "a": 1 }, [1, 2]]
94
103
  },
95
- "expected": "items[2]:\n - a: 1\n - [2]: 1,2",
96
- "specSection": "9.4"
104
+ "expected": "items[2]:\n -\n a: 1\n - [2]: 1,2",
105
+ "specSection": "9.4",
106
+ "minSpecVersion": "2.1"
97
107
  }
98
108
  ]
99
109
  }
@@ -1,5 +1,5 @@
1
1
  {
2
- "version": "1.4",
2
+ "version": "2.1",
3
3
  "category": "encode",
4
4
  "description": "Arrays of objects encoding - list format for non-uniform objects and complex structures",
5
5
  "tests": [
@@ -11,8 +11,9 @@
11
11
  { "id": 2, "name": "Second", "extra": true }
12
12
  ]
13
13
  },
14
- "expected": "items[2]:\n - id: 1\n name: First\n - id: 2\n name: Second\n extra: true",
15
- "specSection": "9.4"
14
+ "expected": "items[2]:\n -\n id: 1\n name: First\n -\n id: 2\n name: Second\n extra: true",
15
+ "specSection": "9.4",
16
+ "minSpecVersion": "2.1"
16
17
  },
17
18
  {
18
19
  "name": "uses list format for objects with nested values",
@@ -21,24 +22,27 @@
21
22
  { "id": 1, "nested": { "x": 1 } }
22
23
  ]
23
24
  },
24
- "expected": "items[1]:\n - id: 1\n nested:\n x: 1",
25
- "specSection": "9.4"
25
+ "expected": "items[1]:\n -\n id: 1\n nested:\n x: 1",
26
+ "specSection": "9.4",
27
+ "minSpecVersion": "2.1"
26
28
  },
27
29
  {
28
30
  "name": "preserves field order in list items - array first",
29
31
  "input": {
30
- "items": [{ "nums": [1, 2, 3], "name": "test" }]
32
+ "items": [{ "nums": [1, 2, 3], "name": "Ada" }]
31
33
  },
32
- "expected": "items[1]:\n - nums[3]: 1,2,3\n name: test",
33
- "specSection": "10"
34
+ "expected": "items[1]:\n -\n nums[3]: 1,2,3\n name: Ada",
35
+ "specSection": "10",
36
+ "minSpecVersion": "2.1"
34
37
  },
35
38
  {
36
39
  "name": "preserves field order in list items - primitive first",
37
40
  "input": {
38
- "items": [{ "name": "test", "nums": [1, 2, 3] }]
41
+ "items": [{ "name": "Ada", "nums": [1, 2, 3] }]
39
42
  },
40
- "expected": "items[1]:\n - name: test\n nums[3]: 1,2,3",
41
- "specSection": "10"
43
+ "expected": "items[1]:\n -\n name: Ada\n nums[3]: 1,2,3",
44
+ "specSection": "10",
45
+ "minSpecVersion": "2.1"
42
46
  },
43
47
  {
44
48
  "name": "uses list format for objects containing arrays of arrays",
@@ -47,8 +51,9 @@
47
51
  { "matrix": [[1, 2], [3, 4]], "name": "grid" }
48
52
  ]
49
53
  },
50
- "expected": "items[1]:\n - matrix[2]:\n - [2]: 1,2\n - [2]: 3,4\n name: grid",
51
- "specSection": "10"
54
+ "expected": "items[1]:\n -\n matrix[2]:\n - [2]: 1,2\n - [2]: 3,4\n name: grid",
55
+ "specSection": "10",
56
+ "minSpecVersion": "2.1"
52
57
  },
53
58
  {
54
59
  "name": "uses tabular format for nested uniform object arrays",
@@ -57,8 +62,10 @@
57
62
  { "users": [{ "id": 1, "name": "Ada" }, { "id": 2, "name": "Bob" }], "status": "active" }
58
63
  ]
59
64
  },
60
- "expected": "items[1]:\n - users[2]{id,name}:\n 1,Ada\n 2,Bob\n status: active",
61
- "specSection": "10"
65
+ "expected": "items[1]:\n -\n users[2]{id,name}:\n 1,Ada\n 2,Bob\n status: active",
66
+ "specSection": "10",
67
+ "minSpecVersion": "2.1",
68
+ "note": "Bare hyphen format for multi-field objects with tabular arrays"
62
69
  },
63
70
  {
64
71
  "name": "uses list format for nested object arrays with mismatched keys",
@@ -67,50 +74,67 @@
67
74
  { "users": [{ "id": 1, "name": "Ada" }, { "id": 2 }], "status": "active" }
68
75
  ]
69
76
  },
70
- "expected": "items[1]:\n - users[2]:\n - id: 1\n name: Ada\n - id: 2\n status: active",
71
- "specSection": "10"
77
+ "expected": "items[1]:\n -\n users[2]:\n -\n id: 1\n name: Ada\n -\n id: 2\n status: active",
78
+ "specSection": "10",
79
+ "minSpecVersion": "2.1"
72
80
  },
73
81
  {
74
82
  "name": "uses list format for objects with multiple array fields",
75
83
  "input": {
76
84
  "items": [{ "nums": [1, 2], "tags": ["a", "b"], "name": "test" }]
77
85
  },
78
- "expected": "items[1]:\n - nums[2]: 1,2\n tags[2]: a,b\n name: test",
79
- "specSection": "10"
86
+ "expected": "items[1]:\n -\n nums[2]: 1,2\n tags[2]: a,b\n name: test",
87
+ "specSection": "10",
88
+ "minSpecVersion": "2.1"
80
89
  },
81
90
  {
82
91
  "name": "uses list format for objects with only array fields",
83
92
  "input": {
84
93
  "items": [{ "nums": [1, 2, 3], "tags": ["a", "b"] }]
85
94
  },
86
- "expected": "items[1]:\n - nums[3]: 1,2,3\n tags[2]: a,b",
87
- "specSection": "10"
95
+ "expected": "items[1]:\n -\n nums[3]: 1,2,3\n tags[2]: a,b",
96
+ "specSection": "10",
97
+ "minSpecVersion": "2.1"
88
98
  },
89
99
  {
90
100
  "name": "encodes objects with empty arrays in list format",
91
101
  "input": {
92
102
  "items": [
93
- { "name": "test", "data": [] }
103
+ { "name": "Ada", "data": [] }
94
104
  ]
95
105
  },
96
- "expected": "items[1]:\n - name: test\n data[0]:",
97
- "specSection": "10"
106
+ "expected": "items[1]:\n -\n name: Ada\n data[0]:",
107
+ "specSection": "10",
108
+ "minSpecVersion": "2.1"
98
109
  },
99
110
  {
100
- "name": "places first field of nested tabular arrays on hyphen line",
111
+ "name": "uses bare hyphen for multi-field list-item objects with tabular arrays",
101
112
  "input": {
102
113
  "items": [{ "users": [{ "id": 1 }, { "id": 2 }], "note": "x" }]
103
114
  },
104
- "expected": "items[1]:\n - users[2]{id}:\n 1\n 2\n note: x",
105
- "specSection": "10"
115
+ "expected": "items[1]:\n -\n users[2]{id}:\n 1\n 2\n note: x",
116
+ "specSection": "10",
117
+ "minSpecVersion": "2.1",
118
+ "note": "Multi-field objects use bare hyphen with all fields indented"
119
+ },
120
+ {
121
+ "name": "uses compact form for single-field list-item tabular arrays",
122
+ "input": {
123
+ "items": [{ "users": [{ "id": 1, "name": "Ada" }, { "id": 2, "name": "Bob" }] }]
124
+ },
125
+ "expected": "items[1]:\n - users[2]{id,name}:\n 1,Ada\n 2,Bob",
126
+ "specSection": "10",
127
+ "minSpecVersion": "2.1",
128
+ "note": "Single-field objects with tabular arrays use compact form on hyphen line"
106
129
  },
107
130
  {
108
131
  "name": "places empty arrays on hyphen line when first",
109
132
  "input": {
110
133
  "items": [{ "data": [], "name": "x" }]
111
134
  },
112
- "expected": "items[1]:\n - data[0]:\n name: x",
113
- "specSection": "10"
135
+ "expected": "items[1]:\n -\n data[0]:\n name: x",
136
+ "specSection": "10",
137
+ "minSpecVersion": "2.1"
114
138
  },
115
139
  {
116
140
  "name": "uses field order from first object for tabular headers",
@@ -124,15 +148,16 @@
124
148
  "specSection": "9.3"
125
149
  },
126
150
  {
127
- "name": "uses list format when one object has nested column",
151
+ "name": "uses list format when one object has nested field",
128
152
  "input": {
129
153
  "items": [
130
154
  { "id": 1, "data": "string" },
131
155
  { "id": 2, "data": { "nested": true } }
132
156
  ]
133
157
  },
134
- "expected": "items[2]:\n - id: 1\n data: string\n - id: 2\n data:\n nested: true",
135
- "specSection": "9.4"
158
+ "expected": "items[2]:\n -\n id: 1\n data: string\n -\n id: 2\n data:\n nested: true",
159
+ "specSection": "9.4",
160
+ "minSpecVersion": "2.1"
136
161
  }
137
162
  ]
138
163
  }
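The fixture notes in this file ("Single-field objects with tabular arrays use compact form on hyphen line", "Multi-field objects use bare hyphen with all fields indented") describe a decision an encoder makes per list item. The following is a minimal Python sketch of that decision only, not the reference implementation. The names `is_tabular` and `list_item_form` are hypothetical. Treating empty arrays and arrays of empty objects as non-tabular is an assumption that these fixtures do not cover.

```python
from typing import Any

# JSON primitive types as they appear after json.loads().
PRIMITIVES = (str, int, float, bool, type(None))


def is_tabular(value: Any) -> bool:
    """Return True for a non-empty list of objects that share one key set
    and hold only primitive values.

    Assumption: empty lists and lists of empty objects count as non-tabular.
    """
    if not isinstance(value, list) or not value:
        return False
    if not all(isinstance(row, dict) and row for row in value):
        return False
    keys = set(value[0])
    return all(
        set(row) == keys and all(isinstance(v, PRIMITIVES) for v in row.values())
        for row in value
    )


def list_item_form(item: dict) -> str:
    """Pick the list-item layout described by the 2.1 changelog entry:
    'compact' only when the object has exactly one field and that field
    is a tabular array; 'bare-hyphen' in every other case.
    """
    if len(item) == 1 and is_tabular(next(iter(item.values()))):
        return "compact"
    return "bare-hyphen"
```

Applied to the fixture inputs above, the single-field `users` object would select the compact form, while the `users` plus `note` object would select the bare hyphen.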
@@ -4,7 +4,7 @@
4
4
  "description": "Tabular array encoding - arrays of uniform objects with primitive values",
5
5
  "tests": [
6
6
  {
7
- "name": "encodes arrays of similar objects in tabular format",
7
+ "name": "encodes arrays of uniform objects in tabular format",
8
8
  "input": {
9
9
  "items": [
10
10
  { "sku": "A1", "qty": 2, "price": 9.99 },
@@ -87,7 +87,7 @@
87
87
  "specSection": "11"
88
88
  },
89
89
  {
90
- "name": "encodes root arrays with tab delimiter",
90
+ "name": "encodes root-level array with tab delimiter",
91
91
  "input": ["x", "y", "z"],
92
92
  "expected": "[3\t]: x\ty\tz",
93
93
  "options": {
@@ -96,7 +96,7 @@
96
96
  "specSection": "11"
97
97
  },
98
98
  {
99
- "name": "encodes root arrays with pipe delimiter",
99
+ "name": "encodes root-level array with pipe delimiter",
100
100
  "input": ["x", "y", "z"],
101
101
  "expected": "[3|]: x|y|z",
102
102
  "options": {
@@ -105,7 +105,7 @@
105
105
  "specSection": "11"
106
106
  },
107
107
  {
108
- "name": "encodes root arrays of objects with tab delimiter",
108
+ "name": "encodes root-level array of objects with tab delimiter",
109
109
  "input": [{ "id": 1 }, { "id": 2 }],
110
110
  "expected": "[2\t]{id}:\n 1\n 2",
111
111
  "options": {
@@ -114,7 +114,7 @@
114
114
  "specSection": "11"
115
115
  },
116
116
  {
117
- "name": "encodes root arrays of objects with pipe delimiter",
117
+ "name": "encodes root-level array of objects with pipe delimiter",
118
118
  "input": [{ "id": 1 }, { "id": 2 }],
119
119
  "expected": "[2|]{id}:\n 1\n 2",
120
120
  "options": {