@toon-format/spec 1.5.1 → 2.0.0

package/CHANGELOG.md CHANGED
@@ -5,6 +5,28 @@ All notable changes to the TOON specification will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [2.0] - 2025-11-10
9
+
10
+ ### Breaking Changes
11
+
12
+ - **Removed:** Length marker (`#`) prefix in array headers has been completely removed from the specification
13
+ - The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only
14
+ - Encoders MUST NOT emit `[#N]` format
15
+ - Decoders MUST NOT accept `[#N]` format (breaking change from v1.5)
16
+
17
+ ### Removed
18
+
19
+ - All references to length marker from terminology (§1.4), header syntax (§6), ABNF grammar, conformance requirements (§13.2), and parsing helpers (Appendix B)
20
+ - `lengthMarker` encoder option removed from all implementations
21
+ - Length marker test fixtures removed
22
+
23
+ ### Migration from v1.5
24
+
25
+ - Update decoder implementations to reject `[#N]` syntax
26
+ - Convert any existing `.toon` files using `[#N]` format to `[N]` format
27
+ - Remove `lengthMarker` option from encoder configurations
28
+ - Remove `--length-marker` CLI flags if present
29
+
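The migration step of converting existing `.toon` files from `[#N]` to `[N]` can be sketched in Python as below. This is a minimal illustration, not part of the spec or of any TOON tooling: the helper name `strip_length_markers` and the regular expression are assumptions made for this example. The pattern only rewrites a bracket segment that is directly followed by `:` or `{`, but it does not understand quoting, so a quoted string that happens to contain text like `[#3]:` would also be rewritten. Review the output before committing it.

```python
import re

# Hypothetical helper for illustration; not part of any TOON library.
# Matches a v1.5 bracket segment "[#N<delim?>]" where <delim?> is an optional
# HTAB or "|", and only when it is followed by ":" or "{" (i.e. looks like a header).
_MARKED_SEGMENT = re.compile(r"\[#([0-9]+[\t|]?)\](?=[:{])")


def strip_length_markers(text: str) -> str:
    """Rewrite every "[#N]" header segment in `text` to the v2.0 "[N]" form."""
    return _MARKED_SEGMENT.sub(r"[\1]", text)


if __name__ == "__main__":
    old = "pairs[#2]:\n  - [#2]: a,b\n  - [#2]: c,d"
    print(strip_length_markers(old))
```

Headers that already use the `[N]` form are left untouched, so running the helper over a mixed set of files should be safe to repeat.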
8
30
  ## [1.5] - 2025-11-08
9
31
 
10
32
  ### Added
package/CONTRIBUTING.md CHANGED
@@ -32,7 +32,7 @@ New data type normalization rules
32
32
  Example: Handling Map or Set differently
33
33
 
34
34
  Changing array header syntax
35
- Example: Optional length markers becoming required
35
+ Example: Making field lists mandatory for all arrays
36
36
  ```
37
37
 
38
38
  ### No – Direct PR or Issue First
package/README.md CHANGED
@@ -1,7 +1,7 @@
1
1
  # TOON Format Specification
2
2
 
3
- [![SPEC v1.5](https://img.shields.io/badge/spec-v1.5-lightgrey)](./SPEC.md)
4
- [![Tests](https://img.shields.io/badge/tests-323-green)](./tests/fixtures/)
3
+ [![SPEC v2.0](https://img.shields.io/badge/spec-v2.0-lightgrey)](./SPEC.md)
4
+ [![Tests](https://img.shields.io/badge/tests-340-green)](./tests/fixtures/)
5
5
  [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
6
6
 
7
7
  This repository contains the official specification for **Token-Oriented Object Notation (TOON)**, a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage.
@@ -10,24 +10,12 @@ This repository contains the official specification for **Token-Oriented Object
10
10
 
11
11
  [→ Read the full specification (SPEC.md)](./SPEC.md)
12
12
 
13
- - **Version:** 1.5 (2025-11-10)
13
+ - **Version:** 2.0 (2025-11-10)
14
14
  - **Status:** Working Draft
15
15
  - **License:** MIT
16
16
 
17
17
  The specification includes complete grammar (ABNF), encoding rules, validation requirements, and conformance criteria.
18
18
 
19
- ### New in v1.5
20
-
21
- - **Key Folding** (encode): Collapse nested single-key objects into compact dotted paths
22
- - `{"a": {"b": {"c": 1}}}` → `a.b.c: 1`
23
- - Opt-in via `keyFolding="safe"` with `flattenDepth` control
24
- - **Path Expansion** (decode): Expand dotted keys back to nested objects
25
- - `a.b.c: 1` → `{"a": {"b": {"c": 1}}}`
26
- - Opt-in via `expandPaths="safe"` with deep-merge semantics
27
-
28
- > [!NOTE]
29
- > Both features are opt-in to maintain backward compatibility.
30
-
31
19
  ## What is TOON?
32
20
 
33
21
  **Token-Oriented Object Notation** is a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage. It's intended for LLM input, not output.
package/SPEC.md CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  ## Token-Oriented Object Notation
4
4
 
5
- **Version:** 1.5
5
+ **Version:** 2.0
6
6
 
7
7
  **Date:** 2025-11-10
8
8
 
@@ -20,9 +20,9 @@ Token-Oriented Object Notation (TOON) is a line-oriented, indentation-based text
20
20
 
21
21
  ## Status of This Document
22
22
 
23
- This document is a Working Draft v1.4 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
23
+ This document is a Working Draft v2.0 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
24
24
 
25
- This specification is stable for implementation but not yet finalized. Breaking changes are unlikely but possible before v2.0.
25
+ This specification is stable for implementation but not yet finalized. Breaking changes may occur in future major versions.
26
26
 
27
27
  ## Normative References
28
28
 
@@ -165,7 +165,6 @@ Implementations that fail to conform to any MUST or REQUIRED level requirement a
165
165
  - Header: The bracketed declaration for arrays, optionally followed by a field list, and terminating with a colon; e.g., key[3]: or items[2]{a,b}:.
166
166
  - Field list: Brace-enclosed, delimiter-separated list of field names for tabular arrays: {f1<delim>f2}.
167
167
  - List item: A line beginning with "- " at a given depth representing an element in an expanded array.
168
- - Length marker: Optional "#" prefix for array lengths in headers, e.g., [#3]. Decoders MUST accept and ignore it semantically.
169
168
 
170
169
  ### 1.5 Delimiter Terms
171
170
 
@@ -294,13 +293,12 @@ TOON is a deterministic, line-oriented, indentation-based notation.
294
293
  Array headers declare length and active delimiter, and optionally field names.
295
294
 
296
295
  General forms:
297
- - Root header (no key): [<marker?>N<delim?>]:
298
- - With key: key[<marker?>N<delim?>]:
299
- - Tabular fields: key[<marker?>N<delim?>]{field1<delim>field2<delim>…}:
296
+ - Root header (no key): [N<delim?>]:
297
+ - With key: key[N<delim?>]:
298
+ - Tabular fields: key[N<delim?>]{field1<delim>field2<delim>…}:
300
299
 
301
300
  Where:
302
301
  - N is the non-negative integer length.
303
- - <marker?> is optional "#"; decoders MUST accept and ignore it semantically.
304
302
  - <delim?> is:
305
303
  - absent for comma (","),
306
304
  - HTAB (U+0009) for tab,
@@ -329,7 +327,7 @@ LF = %x0A ; line feed
329
327
  SP = %x20 ; space
330
328
 
331
329
  ; Header syntax
332
- bracket-seg = "[" [ "#" ] 1*DIGIT [ delimsym ] "]"
330
+ bracket-seg = "[" 1*DIGIT [ delimsym ] "]"
333
331
  delimsym = HTAB / "|"
334
332
  ; Field names are keys (quoted/unquoted) separated by the active delimiter
335
333
  fields-seg = "{" fieldname *( delim fieldname ) "}"
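The v2.0 `bracket-seg` rule maps closely onto a regular expression. The Python sketch below is one possible translation; the names `BRACKET_SEG` and `is_bracket_seg` are made up for this example and do not come from the spec. It uses `[0-9]` rather than `\d` because ABNF `DIGIT` covers ASCII digits only, while Python's `\d` also matches other Unicode digits.

```python
import re

# bracket-seg = "[" 1*DIGIT [ delimsym ] "]"    with    delimsym = HTAB / "|"
# Illustrative translation of the v2.0 rule; names are assumptions of this sketch.
BRACKET_SEG = re.compile(r"\[[0-9]+[\t|]?\]")


def is_bracket_seg(segment: str) -> bool:
    """Return True if the whole string is a valid v2.0 bracket segment."""
    return BRACKET_SEG.fullmatch(segment) is not None
```

Under this pattern `[3]`, `[3|]`, and `[3<HTAB>]` match, while the former `[#3]` form does not.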
@@ -592,7 +590,6 @@ Options:
592
590
  - Encoder options:
593
591
  - indent (default: 2 spaces)
594
592
  - delimiter (document delimiter; default: comma; alternatives: tab, pipe)
595
- - lengthMarker (default: disabled)
596
593
  - keyFolding (default: `"off"`; alternatives: `"safe"`)
597
594
  - flattenDepth (default: Infinity when keyFolding is `"safe"`; non-negative integer ≥ 0; values 0 or 1 have no practical folding effect)
598
595
  - Decoder options:
@@ -700,7 +697,6 @@ Conforming decoders MUST:
700
697
  - [ ] Unescape quoted strings with only valid escapes (§7.1)
701
698
  - [ ] Type unquoted primitives: true/false/null → booleans/null, numeric → number, else → string (§4)
702
699
  - [ ] Enforce strict-mode rules when `strict=true` (§14)
703
- - [ ] Accept and ignore optional # length marker (§6)
704
700
  - [ ] Preserve array order and object key order (§2)
705
701
  - [ ] When `expandPaths="safe"`, expansion MUST follow §13.4 (IdentifierSegment-only segments, deep merge, conflict rules)
706
702
  - [ ] When `expandPaths="safe"` with `strict=true`, MUST error on expansion conflicts per §14.5
@@ -827,7 +823,7 @@ count: 2
827
823
  TOON's tabular format generalizes CSV [RFC4180] with several enhancements:
828
824
 
829
825
  Advantages over CSV:
830
- - Explicit array length markers enable validation
826
+ - Explicit array length declarations enable validation
831
827
  - Field names declared in header (no separate header row)
832
828
  - Supports nested structures (CSV is flat-only)
833
829
  - Three delimiter options (comma/tab/pipe) vs CSV's comma-only
@@ -853,7 +849,7 @@ Conversion Guidelines:
853
849
  - CSV headers map to TOON field names
854
850
  - CSV data rows map to TOON tabular rows
855
851
  - CSV string escaping (double-quotes) maps to TOON quoting rules
856
- - CSV row count can be added as array length marker
852
+ - CSV row count can be added as array length declaration
857
853
 
858
854
  ### 17.3 YAML Interoperability
859
855
 
@@ -1007,14 +1003,6 @@ items[2 ]{sku name qty price}:
1007
1003
  tags[3|]: reading|gaming|coding
1008
1004
  ```
1009
1005
 
1010
- Length marker:
1011
- ```
1012
- tags[#3]: reading,gaming,coding
1013
- pairs[#2]:
1014
- - [#2]: a,b
1015
- - [#2]: c,d
1016
- ```
1017
-
1018
1006
  Quoted colons and disambiguation (rows continue; colon is inside quotes):
1019
1007
  ```
1020
1008
  links[2]{id,url}:
@@ -1157,12 +1145,11 @@ These sketches illustrate structure and common decoding helpers. They are inform
1157
1145
  ### B.2 Array Header Parsing
1158
1146
 
1159
1147
  - Locate the first "[ … ]" segment on the line; parse:
1160
- - Optional leading "#" marker (ignored semantically).
1161
1148
  - Length N as decimal integer.
1162
1149
  - Optional delimiter symbol at the end: HTAB or pipe (comma otherwise).
1163
1150
  - If a "{ … }" fields segment occurs between the "]" and the ":", parse field names using the active delimiter; unescape quoted names.
1164
1151
  - Require a colon ":" after the bracket/fields segment.
1165
- - Return the header (key?, length, delimiter, fields?, hasLengthMarker) and any inline values after the colon.
1152
+ - Return the header (key?, length, delimiter, fields?) and any inline values after the colon.
1166
1153
  - Absence of a delimiter symbol in the bracket segment ALWAYS means comma for that header (no inheritance).
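The B.2 steps, as revised for v2.0, could look roughly like the following Python sketch. It is an informal illustration under stated simplifications, not the reference implementation: the `Header` class and `parse_header` function are hypothetical names, keys and field names are assumed to be unquoted (so no unescaping is done), and the caller is assumed to have already stripped indentation and any leading `- ` list marker.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Header:
    key: Optional[str]            # None for a root header
    length: int
    delimiter: str                # ",", "\t", or "|"
    fields: Optional[List[str]]   # None when there is no field list


def parse_header(line: str) -> Tuple[Header, str]:
    """Parse one array header line; return the header and the inline text after ':'.

    Simplification for this sketch: keys and field names are assumed unquoted.
    """
    start = line.find("[")
    end = line.find("]", start + 1)
    if start < 0 or end < 0:
        raise ValueError("no bracket segment found")
    key = line[:start].strip() or None
    body = line[start + 1:end]
    # v2.0: the "#" length marker is no longer accepted.
    if body.startswith("#"):
        raise ValueError("length marker '#' is not valid in TOON v2.0")
    # No delimiter symbol in the brackets always means comma (no inheritance).
    delimiter = ","
    if body and body[-1] in ("\t", "|"):
        delimiter, body = body[-1], body[:-1]
    if not (body.isascii() and body.isdigit()):
        raise ValueError(f"invalid array length: {body!r}")
    rest = line[end + 1:]
    fields = None
    if rest.startswith("{"):
        close = rest.find("}")
        if close < 0:
            raise ValueError("unterminated field list")
        fields = rest[1:close].split(delimiter)
        rest = rest[close + 1:]
    if not rest.startswith(":"):
        raise ValueError("missing ':' after header")
    inline = rest[1:]
    if inline.startswith(" "):
        inline = inline[1:]  # drop the single space after the colon
    return Header(key, int(body), delimiter, fields), inline
```

For example, `parse_header("tags[3|]: reading|gaming|coding")` yields a header with key `tags`, length 3, and delimiter `|`, plus the inline text `reading|gaming|coding`, whereas `parse_header("tags[#3]: a,b,c")` raises `ValueError`.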
1167
1154
 
1168
1155
  ### B.3 parseDelimitedValues
@@ -1231,6 +1218,14 @@ Note: Host-type normalization tests (e.g., BigInt, Date, Set, Map) are language-
1231
1218
 
1232
1219
  ## Appendix D: Document Changelog (Informative)
1233
1220
 
1221
+ ### v2.0 (2025-11-10)
1222
+
1223
+ - Breaking change: Length marker (`#`) prefix in array headers has been completely removed from the specification.
1224
+ - The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only.
1225
+ - Encoders MUST NOT emit `[#N]` format.
1226
+ - Decoders MUST NOT accept `[#N]` format (breaking change from v1.5).
1227
+ - Removed all references to length marker from terminology, grammar, conformance requirements, and parsing helpers.
1228
+
1234
1229
  ### v1.5 (2025-11-08)
1235
1230
 
1236
1231
  - Added optional key folding for encoders: `keyFolding='safe'` mode with `flattenDepth` control (§13.4).
@@ -1291,7 +1286,6 @@ This specification and reference implementation are released under the MIT Licen
1291
1286
  - The reference encoder/decoder test suites implement:
1292
1287
  - Safe-unquoted string rules and delimiter-aware quoting (document vs active delimiter).
1293
1288
  - Header formation and delimiter-aware parsing with active delimiter scoping.
1294
- - Length marker propagation (encoding) and acceptance (decoding).
1295
1289
  - Tabular detection requiring uniform keys and primitive-only values.
1296
1290
  - Objects-as-list-items parsing (+2 nested object rule; +1 siblings).
1297
1291
  - Whitespace invariants for encoding and strict-mode indentation enforcement for decoding.
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "@toon-format/spec",
3
3
  "type": "module",
4
- "version": "1.5.1",
4
+ "version": "2.0.0",
5
5
  "packageManager": "pnpm@10.19.0",
6
6
  "description": "Official specification for Token-Oriented Object Notation (TOON)",
7
7
  "author": "Johann Schopplich <hello@johannschopplich.com>",
package/tests/README.md CHANGED
@@ -92,7 +92,6 @@ All test fixtures follow a standard JSON structure defined in [`fixtures.schema.
92
92
  {
93
93
  "delimiter": ",",
94
94
  "indent": 2,
95
- "lengthMarker": "#",
96
95
  "keyFolding": "safe",
97
96
  "flattenDepth": 3
98
97
  }
@@ -100,7 +99,6 @@ All test fixtures follow a standard JSON structure defined in [`fixtures.schema.
100
99
 
101
100
  - `delimiter`: `","` (comma, default), `"\t"` (tab), or `"|"` (pipe). Affects encoder output; decoders parse the delimiter declared in array headers
102
101
  - `indent`: Number of spaces per indentation level (default: `2`)
103
- - `lengthMarker`: Optional. Set to `"#"` to prefix array lengths (e.g., `[#3]`). Omit this property to disable length markers
104
102
  - `keyFolding`: `"off"` (default) or `"safe"`. Enables key folding to collapse single-key object chains into dotted-path notation (v1.5+)
105
103
  - `flattenDepth`: Integer. Maximum depth to fold key chains when `keyFolding` is `"safe"` (default: Infinity). Values less than 2 have no practical folding effect (v1.5+)
106
104
 
@@ -161,7 +159,6 @@ The fixture format is language-agnostic JSON, so you can load and iterate it usi
161
159
  | `arrays-objects.json` | Objects as list items, complex nesting | §9, §10 |
162
160
  | `delimiters.json` | Tab and pipe delimiter options | §11 |
163
161
  | `whitespace.json` | Formatting invariants and indentation | §12 |
164
- | `options.json` | Length marker and delimiter option combinations | §3 |
165
162
  | `key-folding.json` | Key folding with safe mode, depth control, collision avoidance | §13.4 |
166
163
 
167
164
  ### Decoding Tests (`fixtures/decode/`)
@@ -241,14 +241,6 @@
241
241
  "items": [{ "a|b": 1 }, { "a|b": 2 }]
242
242
  },
243
243
  "specSection": "11"
244
- },
245
- {
246
- "name": "accepts length marker with pipe delimiter",
247
- "input": "tags[#3|]: reading|gaming|coding",
248
- "expected": {
249
- "tags": ["reading", "gaming", "coding"]
250
- },
251
- "specSection": "6"
252
244
  }
253
245
  ]
254
246
  }
@@ -198,17 +198,17 @@
198
198
  {
199
199
  "name": "encodes folded chains preserving sibling field order",
200
200
  "input": {
201
- "folded": {
202
- "path": {
203
- "value": 1
201
+ "first": {
202
+ "second": {
203
+ "third": 1
204
204
  }
205
205
  },
206
- "normal": 2,
207
- "nested": {
208
- "x": 3
206
+ "simple": 2,
207
+ "short": {
208
+ "path": 3
209
209
  }
210
210
  },
211
- "expected": "folded.path.value: 1\nnormal: 2\nnested.x: 3",
211
+ "expected": "first.second.third: 1\nsimple: 2\nshort.path: 3",
212
212
  "options": {
213
213
  "keyFolding": "safe"
214
214
  },
@@ -10,7 +10,7 @@
10
10
  "type": "string",
11
11
  "description": "TOON specification version these tests target",
12
12
  "pattern": "^\\d+\\.\\d+$",
13
- "examples": ["1.0", "1.3"]
13
+ "examples": ["1.0", "1.6"]
14
14
  },
15
15
  "category": {
16
16
  "type": "string",
@@ -67,11 +67,6 @@
67
67
  "minimum": 1,
68
68
  "default": 2
69
69
  },
70
- "lengthMarker": {
71
- "type": "string",
72
- "const": "#",
73
- "description": "Optional marker to prefix array lengths (encode only). Omit to disable the marker."
74
- },
75
70
  "strict": {
76
71
  "type": "boolean",
77
72
  "description": "Enable strict validation (decode only)",
@@ -1,88 +0,0 @@
1
- {
2
- "version": "1.4",
3
- "category": "encode",
4
- "description": "Encoding options - lengthMarker option and combinations with delimiters",
5
- "tests": [
6
- {
7
- "name": "adds length marker to primitive arrays",
8
- "input": {
9
- "tags": ["reading", "gaming", "coding"]
10
- },
11
- "expected": "tags[#3]: reading,gaming,coding",
12
- "options": {
13
- "lengthMarker": "#"
14
- },
15
- "specSection": "3"
16
- },
17
- {
18
- "name": "adds length marker to empty arrays",
19
- "input": {
20
- "items": []
21
- },
22
- "expected": "items[#0]:",
23
- "options": {
24
- "lengthMarker": "#"
25
- },
26
- "specSection": "3"
27
- },
28
- {
29
- "name": "adds length marker to tabular arrays",
30
- "input": {
31
- "items": [
32
- { "sku": "A1", "qty": 2, "price": 9.99 },
33
- { "sku": "B2", "qty": 1, "price": 14.5 }
34
- ]
35
- },
36
- "expected": "items[#2]{sku,qty,price}:\n A1,2,9.99\n B2,1,14.5",
37
- "options": {
38
- "lengthMarker": "#"
39
- },
40
- "specSection": "3"
41
- },
42
- {
43
- "name": "adds length marker to nested arrays",
44
- "input": {
45
- "pairs": [["a", "b"], ["c", "d"]]
46
- },
47
- "expected": "pairs[#2]:\n - [#2]: a,b\n - [#2]: c,d",
48
- "options": {
49
- "lengthMarker": "#"
50
- },
51
- "specSection": "3"
52
- },
53
- {
54
- "name": "combines length marker with pipe delimiter",
55
- "input": {
56
- "tags": ["reading", "gaming", "coding"]
57
- },
58
- "expected": "tags[#3|]: reading|gaming|coding",
59
- "options": {
60
- "lengthMarker": "#",
61
- "delimiter": "|"
62
- },
63
- "specSection": "3"
64
- },
65
- {
66
- "name": "combines length marker with tab delimiter",
67
- "input": {
68
- "tags": ["reading", "gaming", "coding"]
69
- },
70
- "expected": "tags[#3\t]: reading\tgaming\tcoding",
71
- "options": {
72
- "lengthMarker": "#",
73
- "delimiter": "\t"
74
- },
75
- "specSection": "3"
76
- },
77
- {
78
- "name": "default lengthMarker is empty (no marker)",
79
- "input": {
80
- "tags": ["reading", "gaming", "coding"]
81
- },
82
- "expected": "tags[3]: reading,gaming,coding",
83
- "options": {},
84
- "specSection": "3",
85
- "note": "Default behavior without lengthMarker option"
86
- }
87
- ]
88
- }