npm - @toon-format/spec - Versions diffs - 2.0.1 → 3.0.0 - Mend

@toon-format/spec 2.0.1 → 3.0.0

Files changed (7)

package/CHANGELOG.md +55 -38
package/README.md +19 -3
package/SPEC.md +70 -70
package/package.json +1 -1
package/tests/fixtures/decode/arrays-nested.json +22 -5
package/tests/fixtures/encode/arrays-nested.json +1 -1
package/tests/fixtures/encode/arrays-objects.json +28 -8

package/CHANGELOG.md CHANGED

@@ -5,87 +5,104 @@ All notable changes to the TOON specification will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [3.0] - 2025-11-24
+### Breaking Changes
+- Standardized encoding for list-item objects whose first field is a tabular array (§10):
+  - Encoders MUST emit `- key[N]{fields}:` on the hyphen line.
+  - Tabular rows MUST appear at depth +2 relative to the hyphen line.
+  - All other fields of the same object MUST appear at depth +1.
+  - The v2.0 shallow form (rows and fields at the same depth) and the v2.1 bare-hyphen form are no longer normative and MUST NOT be emitted by conforming encoders.
+### Changed
+- Encoding/decoding rules (§10) simplified to describe only the YAML-style pattern; legacy layouts are treated as generic nesting and are not covered by conformance tests.
+- Nested tabular list-item example in Appendix A updated to the canonical v3.0 form.
+### Migration from v2.1
+- Update encoders to emit the YAML-style form for list-item objects whose first field is a tabular array.
+- If you rely on v2.0/v2.1 layouts, keep decoder compatibility in non-strict or implementation-defined modes; the spec no longer requires or tests these patterns.
+- Optionally regenerate existing `.toon` files for consistent v3 formatting.
+## [2.1] - 2025-11-23
+### Changed
+- Canonical encoding for objects as list items (§10):
+  - Encoders SHOULD emit `- key[N]{fields}:` only when the list-item object has exactly one field and that field is a tabular array.
+  - In all other cases, encoders SHOULD emit a bare `-` line and place all fields at depth +1; tabular array headers then appear at depth +1 and their rows at depth +2.
 ## [2.0] - 2025-11-10
 ### Breaking Changes
-- **Removed:** Length marker (`#`) prefix in array headers has been completely removed from the specification
-- The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only
-- Encoders MUST NOT emit `[#N]` format
-- Decoders MUST NOT accept `[#N]` format (breaking change from v1.5)
+- Removed `[#N]` length-marker syntax in array headers; `[N]` is now the only valid format.
+- Encoders MUST NOT emit `[#N]`; decoders MUST reject it.
 ### Removed
-- All references to length marker from terminology (§1.4), header syntax (§6), ABNF grammar, conformance requirements (§13.2), and parsing helpers (Appendix B)
-- `lengthMarker` encoder option removed from all implementations
-- Length marker test fixtures removed
+- The `lengthMarker` encoder option and any CLI flags exposing it.
 ### Migration from v1.5
-- Update decoder implementations to reject `[#N]` syntax
-- Convert any existing `.toon` files using `[#N]` format to `[N]` format
-- Remove `lengthMarker` option from encoder configurations
-- Remove `--length-marker` CLI flags if present
+- Update decoders to reject `[#N]` syntax.
+- Convert existing `.toon` files using `[#N]` to `[N]`.
+- Remove `lengthMarker` configuration and CLI options.
 ## [1.5] - 2025-11-08
 ### Added
-- Optional key folding for encoders: `keyFolding="safe"` mode with `flattenDepth` control to collapse single-key object chains into dotted-path notation (§13.4)
-- Optional path expansion for decoders: `expandPaths="safe"` mode to split dotted keys into nested objects, with conflict resolution tied to `strict` option (§13.4, §14.5)
-- IdentifierSegment terminology and path separator definition (fixed to `"."` in v1.5) (§1.9)
-- Deep-merge semantics for path expansion: recursive merge for objects, error on conflict when `strict=true`, last-write-wins (LWW) when `strict=false` (§13.4)
+- Optional key folding for encoders: `keyFolding="safe"` with `flattenDepth` to collapse single-key object chains into dotted paths (§13.4).
+- Optional path expansion for decoders: `expandPaths="safe"` to split dotted keys into nested objects with deep-merge semantics and conflict handling tied to `strict` (§13.4, §14.5).
+- IdentifierSegment terminology and fixed `"."` path separator for safe folding/expansion (§1.9).
 ### Changed
-- Both new features default to OFF and are fully backward-compatible
-- Safe-mode folding requires IdentifierSegment validation, collision avoidance, and no quoting
+- Safe-mode folding requires IdentifierSegment-only segments, no path separator in segments, no quoting, and collision avoidance.
+- Both features default to `off` and are backward-compatible.
 ## [1.4] - 2025-11-05
 ### Changed
-- Removed JavaScript-specific normalization details from specification; replaced with language-agnostic requirements (Section 3)
-- Defined canonical number format for encoders: no exponent notation, no trailing zeros, no leading zeros except "0" (Section 2)
-- Clarified decoder handling of exponent notation and out-of-range numbers (Section 2)
-- Expanded `\w` regex notation to explicit character class `[A-Za-z0-9_]` for cross-language clarity (Section 7.3)
-- Clarified non-strict mode tab handling as implementation-defined (Section 12)
+- Generalized normalization rules and defined canonical number format for encoders (no exponent notation, no trailing zeros, no leading zeros except `"0"`), plus decoder handling of exponent forms and out-of-range numbers (§2-§3).
+- Replaced `\w` with explicit `[A-Za-z0-9_]` in key regexes for cross-language clarity (§7.3).
+- Clarified non-strict mode tab handling as implementation-defined (§12).
 ### Added
-- Appendix G: Host Type Normalization Examples with guidance for Go, JavaScript, Python, and Rust implementations
+- Appendix G with host-type normalization examples for Go, JavaScript, Python, and Rust.
 ## [1.3] - 2025-10-31
 ### Added
-- Numeric precision requirements: JavaScript implementations SHOULD use `Number.toString()` precision (15-17 digits), all implementations MUST preserve round-trip fidelity (Section 2)
-- RFC 5234 core rules (ALPHA, DIGIT, DQUOTE, HTAB, LF, SP) to ABNF grammar definitions (Section 6)
+- Numeric precision requirements: JavaScript implementations SHOULD use `Number.toString()` precision (15–17 digits); all implementations MUST preserve round-trip fidelity (§2).
+- RFC 5234 core rules (ALPHA, DIGIT, DQUOTE, HTAB, LF, SP) to ABNF grammar definitions (§6).
 ## [1.2] - 2025-10-29
 ### Changed
-- Clarified delimiter scoping behavior between array headers
-- Tightened strict-mode indentation requirements: leading spaces MUST be exact multiples of indentSize; tabs in indentation MUST error
-- Defined blank-line and trailing-newline decoding behavior with explicit skipping rules outside arrays
-- Clarified hyphen-based quoting: "-" or any string starting with "-" MUST be quoted
-- Clarified BigInt normalization: values outside safe integer range are converted to quoted decimal strings
-- Clarified row/key disambiguation: uses first unquoted delimiter vs colon position
+- Tightened delimiter scoping, indentation, blank-line handling, and hyphen-based quoting rules (§11-§12).
+- Clarified BigInt normalization (out-of-range values → quoted decimal strings) and row/key disambiguation (first unquoted delimiter vs colon) (§2, §9.3).
 ## [1.1] - 2025-10-29
 ### Added
-- Strict-mode rules
-- Delimiter-aware parsing
-- Decoder options (indent, strict)
+- Strict-mode rules.
+- Delimiter-aware parsing.
+- Decoder options (`indent`, `strict`).
 ## [1.0] - 2025-10-28
 ### Added
-- Initial specification release
-- Encoding normalization rules
-- Decoding interpretation guidelines
-- Conformance requirements
+- Initial specification release.
+- Encoding normalization rules.
+- Decoding interpretation guidelines.
+- Conformance requirements.

package/README.md CHANGED

@@ -1,7 +1,7 @@
 # TOON Format Specification
-[![SPEC v2.0](https://img.shields.io/badge/spec-v2.0-lightgrey)](./SPEC.md)
-[![Tests](https://img.shields.io/badge/tests-342-green)](./tests/fixtures/)
+[![SPEC v3.0](https://img.shields.io/badge/spec-v3.0-lightgrey)](./SPEC.md)
+[![Tests](https://img.shields.io/badge/tests-345-green)](./tests/fixtures/)
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
 This repository contains the official specification for **Token-Oriented Object Notation (TOON)**, a compact, human-readable encoding of the JSON data model for LLM prompts. It provides a lossless serialization of the same objects, arrays, and primitives as JSON, but in a syntax that minimizes tokens and makes structure easy for models to follow.
@@ -10,7 +10,7 @@ This repository contains the official specification for **Token-Oriented Object
 [→ Read the full specification (SPEC.md)](./SPEC.md)
-- **Version:** 2.0 (2025-11-10)
+- **Version:** 3.0 (2025-11-24)
 - **Status:** Working Draft
 - **License:** MIT
@@ -122,6 +122,22 @@ The [tests/fixtures/](./tests/fixtures/) directory contains **language-agnostic
 See [tests/README.md](./tests/README.md) for detailed fixture format and usage instructions.
+## Media Type & File Extension
+TOON defines a provisional media type (see §18.2 of the specification):
+- **Media type:** `text/toon` (provisional, pending IANA registration)
+- **File extension:** `.toon`
+- **Charset:** Always UTF-8
+For HTTP usage:
+```http
+Content-Type: text/toon
+```
+See the full [IANA Considerations section](SPEC.md#18-iana-considerations) for details.
 ## Contributing
 We welcome contributions to improve the specification! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for:
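
Note on the `Content-Type: text/toon` header added to the README in the hunk above: a minimal sketch of how a client might attach the provisional media type to a request body, using only the Python standard library. The function name `build_toon_request` and the URL are hypothetical, and nothing is sent over the network.

```python
from urllib.request import Request


def build_toon_request(url: str, toon_text: str) -> Request:
    """Build (but do not send) a POST request carrying a TOON document."""
    return Request(
        url,
        # The README states the charset is always UTF-8.
        data=toon_text.encode("utf-8"),
        # Provisional media type, pending IANA registration.
        headers={"Content-Type": "text/toon"},
        method="POST",
    )


# Hypothetical endpoint, used only for illustration.
req = build_toon_request("https://example.com/ingest", "items[2]: a,b")
```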

package/SPEC.md CHANGED

@@ -2,9 +2,9 @@
 ## Token-Oriented Object Notation
-**Version:** 2.0
+**Version:** 3.0
-**Date:** 2025-11-10
+**Date:** 2025-11-24
 **Status:** Working Draft
@@ -20,7 +20,7 @@ Token-Oriented Object Notation (TOON) is a line-oriented, indentation-based text
 ## Status of This Document
-This document is a Working Draft v2.0 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
+This document is a Working Draft v3.0 and may be updated, replaced, or obsoleted. Implementers should monitor the canonical repository at https://github.com/toon-format/spec for changes.
 This specification is stable for implementation but not yet finalized. Breaking changes may occur in future major versions.
@@ -227,12 +227,11 @@ Implementations that fail to conform to any MUST or REQUIRED level requirement a
 ## 3. Encoding Normalization (Reference Encoder)
-Encoders MUST normalize non-JSON values to the JSON data model before encoding:
+Encoders MUST normalize non-JSON values to the JSON data model before encoding. The mapping from host-specific types to JSON model is implementation-defined and MUST be documented.
 - Number:
   - Finite → number (canonical decimal form per Section 2). -0 → 0.
   - NaN, +Infinity, -Infinity → null.
-- Non-JSON types MUST be normalized to the JSON data model (object, array, string, number, boolean, or null) before encoding. The mapping from host-specific types to JSON model is implementation-defined and MUST be documented.
 - Examples of host-type normalization (non-normative):
   - Date/time objects → ISO 8601 string representation.
   - Set-like collections → array.
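
Note on the number rules shown in the hunk above (finite values pass through, -0 becomes 0, NaN and the infinities become null): they might be sketched as follows. The name `normalize_number` is hypothetical, and the canonical decimal formatting from Section 2 is not part of this diff, so it is left out.

```python
import math


def normalize_number(value: float):
    """Map a host float onto the JSON data model per the rules quoted above."""
    # NaN, +Infinity and -Infinity have no JSON representation: encode as null.
    if math.isnan(value) or math.isinf(value):
        return None
    # Negative zero collapses to plain zero.
    if value == 0:
        return 0
    # Finite values pass through; canonical formatting happens elsewhere.
    return value
```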
@@ -384,9 +383,9 @@ A string value MUST be quoted if any of the following is true:
 - It contains a colon (:), double quote ("), or backslash (\).
 - It contains brackets or braces ([, ], {, }).
 - It contains control characters: newline, carriage return, or tab.
-- It contains the relevant delimiter:
-  - Inside array scope: the active delimiter (Section 1).
-  - Outside array scope: the document delimiter (Section 1).
+- It contains the relevant delimiter (see §11 for complete delimiter rules):
+  - For inline array values and tabular row cells: the active delimiter from the nearest array header.
+  - For object field values (key: value): the document delimiter, even when the object is within an array's scope.
 - It equals "-" or starts with "-" (any hyphen at position 0).
 Otherwise, the string MAY be emitted without quotes. Unicode, emoji, and strings with internal (non-leading/trailing) spaces are safe unquoted provided they do not violate the conditions.
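
Note on the quoting conditions visible in the hunk above: a partial sketch, assuming a single-character delimiter. The name `needs_quotes_partial` is hypothetical, and any §7.2 conditions that fall outside this hunk are deliberately not covered. The caller is assumed to pass the active delimiter for inline array values and tabular row cells, and the document delimiter for object field values.

```python
def needs_quotes_partial(value: str, delimiter: str = ",") -> bool:
    """Check only the quoting conditions visible in this hunk of section 7.2."""
    # Colon, double quote, or backslash.
    if any(ch in value for ch in ':"\\'):
        return True
    # Brackets or braces.
    if any(ch in value for ch in "[]{}"):
        return True
    # Control characters: newline, carriage return, tab.
    if any(ch in value for ch in "\n\r\t"):
        return True
    # The relevant delimiter (active or document, chosen by the caller).
    if delimiter in value:
        return True
    # Equals "-" or starts with "-".
    if value.startswith("-"):
        return True
    return False
```

Under this sketch, `"a,b"` needs quotes with the default comma delimiter but not when the relevant delimiter is `|`.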
@@ -403,12 +402,10 @@ Encoders MAY perform key folding when enabled (see §13.4 for complete folding r
 ### 7.4 Decoding Rules for Strings and Keys (Decoding)
-- Quoted strings and keys MUST be unescaped per Section 7.1; any other escape MUST error. Quoted primitives remain strings.
-- Unquoted values:
-  - true/false/null → boolean/null
-  - Numeric tokens → numbers (with the leading-zero rule in Section 4)
-  - Otherwise → strings
-- Keys (quoted or unquoted) MUST be followed by ":"; missing colon MUST error.
+Decoding of value tokens follows §4 (unquoted type inference, quoted strings, numeric rules). This section adds key-specific requirements:
+- Quoted keys MUST be unescaped per Section 7.1; any other escape MUST error.
+- Keys (quoted or unquoted) MUST be followed by ":"; missing colon MUST error (see also §14.2).
 ## 8. Objects
@@ -421,7 +418,6 @@ Encoders MAY perform key folding when enabled (see §13.4 for complete folding r
 - Decoding:
   - A line "key:" with nothing after the colon at depth d opens an object; subsequent lines at depth > d belong to that object until the depth decreases to ≤ d.
   - Lines "key: value" at the same depth are sibling fields.
-  - Missing colon after a key MUST error.
 ## 9. Arrays
@@ -474,6 +470,7 @@ Decoding:
     - Delimiter before colon → row.
     - Colon before delimiter → key-value line (end of rows).
   - If a line has an unquoted colon but no unquoted active delimiter → key-value line (end of rows).
+- When a tabular array appears as the first field of a list-item object, indentation is governed by Section 10.
 ### 9.4 Mixed / Non-Uniform Arrays — Expanded List
@@ -499,20 +496,18 @@ Decoding:
 For an object appearing as a list item:
 - Empty object list item: a single "-" at the list-item indentation level.
-- First field on the hyphen line:
-  - Primitive: - key: value
-  - Primitive array: - key[M<delim?>]: v1<delim>…
-  - Tabular array: - key[N<delim?>]{fields}:
-    - Followed by tabular rows at depth +1 (relative to the hyphen line).
-  - Non-uniform array: - key[N<delim?>]:
-    - Followed by list items at depth +1.
-  - Object: - key:
-    - Nested object fields appear at depth +2 (i.e., one deeper than subsequent sibling fields of the same list item).
-- Remaining fields of the same object appear at depth +1 under the hyphen line in encounter order, using normal object field rules.
-Decoding:
-- The first field is parsed from the hyphen line. If it is a nested object (- key:), nested fields are at +2 relative to the hyphen line; subsequent fields of the same list item are at +1.
-- If the first field is a tabular header on the hyphen line, its rows are at +1; subsequent sibling fields continue at +1 after the rows.
+- Encoding (normative):
+  - When a list-item object has a tabular array (Section 9.3) as its first field in encounter order, encoders MUST emit the tabular header on the hyphen line:
+    - The hyphen and tabular header appear on the same line at the list-item depth: - key[N<delim?>]{fields}:
+    - Tabular rows MUST appear at depth +2 (relative to the hyphen line).
+    - All other fields of the same object MUST appear at depth +1 under the hyphen line, in encounter order, using normal object field rules (Section 8).
+    - Encoders MUST NOT emit tabular rows at depth +1 or sibling fields at the same depth as rows when the first field is a tabular array.
+  - For all other cases (first field is not a tabular array), encoders SHOULD place the first field on the hyphen line. A bare hyphen on its own line is used only for empty list-item objects.
+- Decoding (normative):
+  - When a decoder encounters a list-item line of the form - key[N<delim?>]{fields}: at depth d, it MUST treat this as the start of a tabular array field named key in the list-item object.
+  - Lines at depth d+2 that conform to tabular row syntax (Section 9.3) are rows of that tabular array.
+  - Lines at depth d+1 are additional fields of the same list-item object; the presence of a line at depth d+1 after rows terminates the rows.
+  - All other object-as-list-item patterns (bare hyphen, first field on hyphen line for non-tabular values) are decoded according to the general rules in Section 8 and Section 9.
 ## 11. Delimiters
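
Note on the §10 encoding rules in the hunk above: a simplified sketch of the v3.0 layout for a list-item object whose first field is a tabular array. The names `fmt` and `encode_tabular_first_item` are hypothetical. The sketch assumes the comma delimiter, a non-empty first array of objects that share the same keys, primitive sibling fields, and values that need no quoting.

```python
def fmt(value) -> str:
    """Render a primitive; quoting rules are deliberately omitted."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_tabular_first_item(item: dict, depth: int, indent: int = 2) -> list[str]:
    """Emit lines for a list-item object whose first field is a tabular array."""
    pad = " " * indent
    (first_key, rows), *rest = item.items()
    fields = list(rows[0].keys())
    # Hyphen and tabular header share one line at the list-item depth.
    lines = [f"{pad * depth}- {first_key}[{len(rows)}]{{{','.join(fields)}}}:"]
    # Rows sit at depth +2 relative to the hyphen line.
    for row in rows:
        lines.append(pad * (depth + 2) + ",".join(fmt(row[f]) for f in fields))
    # Remaining fields sit at depth +1, in encounter order.
    for key, value in rest:
        lines.append(f"{pad * (depth + 1)}{key}: {fmt(value)}")
    return lines
```

Called with `depth=1` on the `users`/`status` object from the fixtures in this diff, and prefixed with an `items[1]:` line, this yields the same text as the updated `expected` string in `encode/arrays-objects.json`.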
@@ -520,19 +515,25 @@ Decoding:
   - Comma (default): header omits the delimiter symbol.
   - Tab: header includes HTAB inside brackets and braces (e.g., [N<TAB>], {a<TAB>b}); rows/inline arrays use tabs.
   - Pipe: header includes "|" inside brackets and braces; rows/inline arrays use "|".
-- Document vs Active delimiter:
-  - Encoders select a document delimiter (option) that influences quoting for all object values (key: value) throughout the document.
-  - Inside an array header's scope, the active delimiter governs splitting and quoting only for inline arrays and tabular rows that the header introduces. Object values (key: value) follow document-delimiter quoting rules regardless of array scope.
-- Delimiter-aware quoting (encoding):
-  - Inline array values and tabular row cells: strings containing the active delimiter MUST be quoted to avoid splitting.
-  - Object values (key: value): encoders use the document delimiter to decide delimiter-aware quoting, regardless of whether the object appears within an array's scope.
-  - Strings containing non-active delimiters do not require quoting unless another quoting condition applies (Section 7.2).
-- Delimiter-aware parsing (decoding):
-  - Inline arrays and tabular rows MUST be split only on the active delimiter declared by the nearest array header.
+### 11.1 Encoding Rules (Normative for Encoders)
+- Document delimiter: Encoders select a document delimiter (option: comma, tab, pipe; default comma) that influences quoting for all object field values (key: value) throughout the document.
+- Active delimiter: Inside an array header's scope, the active delimiter governs quoting only for inline array values and tabular row cells.
+- Delimiter-aware quoting:
+  - Inline array values and tabular row cells: strings containing the active delimiter MUST be quoted.
+  - Object field values (key: value): encoders use the document delimiter to decide delimiter-aware quoting, regardless of whether the object appears within an array's scope.
+  - Strings containing non-active delimiters do not require quoting unless another condition applies (§7.2).
+### 11.2 Decoding Rules (Normative for Decoders)
+- Active delimiter: Decoders use only the active delimiter declared by the nearest array header to split inline arrays and tabular rows.
+- Delimiter-aware parsing:
+  - Inline arrays and tabular rows MUST be split only on the active delimiter.
   - Splitting MUST preserve empty tokens; surrounding spaces are trimmed, and empty tokens decode to the empty string.
-  - Strings containing the active delimiter MUST be quoted to avoid splitting; non-active delimiters MUST NOT cause splits.
   - Nested headers may change the active delimiter; decoding MUST use the delimiter declared by the nearest header.
-  - If the bracket declares tab or pipe, the same symbol MUST be used in the fields segment and for splitting all rows/values in that scope.
+  - If the bracket declares tab or pipe, the same symbol MUST be used in the fields segment and for splitting all rows/values in that scope (§6).
+- Object field values (key: value): Decoders parse the entire post-colon token as a single value; document delimiter is not a decoder concept.
 ## 12. Indentation and Whitespace
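
Note on the §11.2 decoding rules in the hunk above: a sketch of splitting a row on the active delimiter only, preserving empty tokens and trimming surrounding spaces. The name `split_row` is hypothetical. It assumes a single-character delimiter and returns raw tokens, leaving unquoting and type inference to a later step.

```python
def split_row(line: str, delimiter: str = ",") -> list[str]:
    """Split an inline array or tabular row on the active delimiter only."""
    tokens, current, in_quotes, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        # Inside quotes, keep an escape pair together so \" does not end the string.
        if in_quotes and ch == "\\" and i + 1 < len(line):
            current.append(line[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == delimiter and not in_quotes:
            # Only an unquoted active delimiter splits; trim spaces, keep empties.
            tokens.append("".join(current).strip(" "))
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    tokens.append("".join(current).strip(" "))
    return tokens
```

Only spaces are trimmed, so a tab delimiter next to an empty cell still produces an empty token rather than being stripped away.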
@@ -730,12 +731,14 @@ When strict mode is enabled (default), decoders MUST error on the following cond
 ### 14.3 Indentation Errors
+See §12 for indentation semantics. In strict mode, decoders MUST error on:
 - Leading spaces not a multiple of indentSize.
 - Any tab used in indentation (tabs allowed in quoted strings and as HTAB delimiter).
 ### 14.4 Structural Errors
-- Blank lines inside arrays/tabular rows.
+See §12 for blank line semantics. In strict mode, decoders MUST error on:
+- Blank lines inside arrays/tabular rows (between the first and last item/row).
 For root-form rules, including handling of empty documents, see §5.
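
Note on the §14.3 strict-mode indentation errors in the hunk above: a minimal sketch of the two checks. The name `check_indentation` and the use of `ValueError` are assumptions; the spec text in this diff does not prescribe an error type. Only leading whitespace is inspected, so tabs inside quoted strings or used as the HTAB delimiter are unaffected.

```python
def check_indentation(line: str, indent_size: int = 2) -> int:
    """Return the depth of `line`, raising ValueError on strict-mode violations."""
    stripped = line.lstrip(" \t")
    leading = line[: len(line) - len(stripped)]
    # Any tab used in indentation is an error in strict mode.
    if "\t" in leading:
        raise ValueError("tab used in indentation")
    # Leading spaces must be an exact multiple of indentSize.
    if len(leading) % indent_size != 0:
        raise ValueError("leading spaces not a multiple of indentSize")
    return len(leading) // indent_size
```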
@@ -894,7 +897,11 @@ This specification does not request IANA registration at this time, as the forma
 ### 18.2 Provisional Media Type
-The following provisional media type designation is RECOMMENDED for experimental implementations:
+Until IANA registration is completed, implementations SHOULD use:
+- Media type: `text/toon`
+- File extension: `.toon`
+Full designation details:
 Type name: text
@@ -989,11 +996,13 @@ Nested tabular inside a list item:
 ```
 items[1]:
   - users[2]{id,name}:
-    1,Ada
-    2,Bob
+      1,Ada
+      2,Bob
     status: active
 ```
+Note: When a list-item object has a tabular array as its first field, encoders emit the tabular header on the hyphen line with rows at depth +2 and other fields at depth +1. This is the canonical encoding for list-item objects whose first field is a tabular array.
 Delimiter variations:
 ```
 items[2	]{sku	name	qty	price}:
@@ -1218,52 +1227,43 @@ Note: Host-type normalization tests (e.g., BigInt, Date, Set, Map) are language-
 ## Appendix D: Document Changelog (Informative)
+This appendix summarizes major changes between spec versions. For the complete changelog, see [`CHANGELOG.md`](./CHANGELOG.md) in the specification repository.
+### v3.0 (2025-11-24)
+- Standardized encoding for list-item objects whose first field is a tabular array (§10).
+### v2.1 (2025-11-23)
+- Tightened canonical encoding for objects as list items (§10): bare `-` for multi-field objects, compact `- key[N]{fields}:` only for single-field tabular arrays, to improve visual consistency and LLM readability.
 ### v2.0 (2025-11-10)
-- Breaking change: Length marker (`#`) prefix in array headers has been completely removed from the specification.
-- The `[#N]` format is no longer valid syntax. All array headers MUST use `[N]` format only.
-- Encoders MUST NOT emit `[#N]` format.
-- Decoders MUST NOT accept `[#N]` format (breaking change from v1.5).
-- Removed all references to length marker from terminology, grammar, conformance requirements, and parsing helpers.
+- Removed `[#N]` length-marker syntax from array headers; `[N]` is now the only valid form.
 ### v1.5 (2025-11-08)
-- Added optional key folding for encoders: `keyFolding='safe'` mode with `flattenDepth` control (§13.4).
-- Added optional path expansion for decoders: `expandPaths='safe'` mode with conflict resolution tied to existing `strict` option (§13.4).
-- Defined safe-mode requirements for folding: IdentifierSegment validation, no path separator in segments, collision avoidance, no quoting required (§7.3, §13.4).
-- Specified deep-merge semantics for expansion: recursive merge for objects; conflict policy (error in strict mode, LWW when strict=false) for non-objects (§13.4).
-- Added strict-mode error category for path expansion conflicts (§14.5).
-- Both features default to OFF; fully backward-compatible.
+- Added optional key folding (`keyFolding="safe"`) and path expansion (`expandPaths="safe"`) with deep-merge semantics and strict-mode conflict handling (§13.4, §14.5).
 ### v1.4 (2025-11-05)
-- Removed JavaScript-specific normalization details; replaced with language-agnostic requirements (Section 3).
-- Defined canonical number format for encoders and decoder acceptance rules (Section 2).
-- Added Appendix G with host-type normalization examples for Go, JavaScript, Python, and Rust.
-- Clarified non-strict mode tab handling as implementation-defined (Section 12).
-- Expanded regex notation for cross-language clarity (Section 7.3).
+- Generalized normalization and numeric canonicalization rules, and added host-type normalization guidance (Appendix G).
 ### v1.3 (2025-10-31)
-- Added numeric precision requirements: JavaScript implementations SHOULD use Number.toString() precision (15-17 digits), all implementations MUST preserve round-trip fidelity (Section 2).
-- Added RFC 5234 core rules (ALPHA, DIGIT, DQUOTE, HTAB, LF, SP) to ABNF grammar definitions (Section 6).
+- Added numeric precision guidance and ABNF core rules for headers and keys (§2, §6).
 ### v1.2 (2025-10-29)
-- Clarified delimiter scoping behavior between array headers.
-- Tightened strict-mode indentation requirements: leading spaces MUST be exact multiples of indentSize; tabs in indentation MUST error.
-- Defined blank-line and trailing-newline decoding behavior with explicit skipping rules outside arrays.
-- Clarified hyphen-based quoting: "-" or any string starting with "-" MUST be quoted.
-- Clarified BigInt normalization: values outside safe integer range are converted to quoted decimal strings.
-- Clarified row/key disambiguation: uses first unquoted delimiter vs colon position.
+- Tightened delimiter scoping, indentation, blank-line handling, hyphen-based quoting, BigInt normalization, and row/key disambiguation rules (§2, §9, §11-§12).
 ### v1.1 (2025-10-29)
-Added strict-mode rules, delimiter-aware parsing, and decoder options (indent, strict).
+- Introduced strict-mode validation, delimiter-aware parsing, and decoder options (indent, strict).
 ### v1.0 (2025-10-28)
-Initial encoding, normalization, and conformance rules.
+- Initial specification: encoding normalization, decoding interpretation, and conformance requirements.
 ## Appendix E: Acknowledgments and License

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "@toon-format/spec",
   "type": "module",
-  "version": "2.0.1",
+  "version": "3.0.0",
   "packageManager": "pnpm@10.19.0",
   "description": "Official specification for Token-Oriented Object Notation (TOON)",
   "author": "Johann Schopplich <hello@johannschopplich.com>",

package/tests/fixtures/decode/arrays-nested.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "version": "1.4",
+  "version": "3.0",
   "category": "decode",
   "description": "Nested and mixed array decoding - list format, arrays of arrays, root arrays, mixed types",
   "tests": [
@@ -52,8 +52,8 @@
       "specSection": "9.4"
     },
     {
-      "name": "parses nested tabular arrays as first field on hyphen line",
-      "input": "items[1]:\n  - users[2]{id,name}:\n    1,Ada\n    2,Bob\n    status: active",
+      "name": "parses list items whose first field is a tabular array",
+      "input": "items[1]:\n  - users[2]{id,name}:\n      1,Ada\n      2,Bob\n    status: active",
       "expected": {
         "items": [
           {
@@ -65,7 +65,24 @@
           }
         ]
       },
-      "specSection": "10"
+      "specSection": "10",
+      "note": "Canonical encoding: tabular header on hyphen line, rows at depth +2, sibling fields at depth +1"
+    },
+    {
+      "name": "parses single-field list-item object with tabular array",
+      "input": "items[1]:\n  - users[2]{id,name}:\n      1,Ada\n      2,Bob",
+      "expected": {
+        "items": [
+          {
+            "users": [
+              { "id": 1, "name": "Ada" },
+              { "id": 2, "name": "Bob" }
+            ]
+          }
+        ]
+      },
+      "specSection": "10",
+      "note": "Single-field list-item object: only the tabular array, no sibling fields"
     },
     {
       "name": "parses objects containing arrays (including empty arrays) in list format",
@@ -79,7 +96,7 @@
     },
     {
       "name": "parses arrays of arrays within objects",
-      "input": "items[1]:\n  - matrix[2]:\n    - [2]: 1,2\n    - [2]: 3,4\n    name: grid",
+      "input": "items[1]:\n  - matrix[2]:\n      - [2]: 1,2\n      - [2]: 3,4\n    name: grid",
       "expected": {
         "items": [
           { "matrix": [[1, 2], [3, 4]], "name": "grid" }
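
Note on the decode fixture shape visible in this file (a top-level `tests` array whose entries carry `name`, `input`, `expected`, `specSection`, and an optional `note`): a sketch of how an implementation might run such a file against its own decoder. The name `run_decode_fixture` is hypothetical, and `decode` stands for whatever decoder function the implementation under test provides; see `tests/README.md` in the package for the authoritative fixture format.

```python
import json


def run_decode_fixture(fixture_text: str, decode) -> list[str]:
    """Run every test in a decode fixture; return descriptions of failures."""
    fixture = json.loads(fixture_text)
    failures = []
    for test in fixture["tests"]:
        try:
            actual = decode(test["input"])
        except Exception as exc:
            # A decoder error on a positive test counts as a failure.
            failures.append(f"{test['name']}: raised {exc!r}")
            continue
        if actual != test["expected"]:
            failures.append(test["name"])
    return failures
```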

package/tests/fixtures/encode/arrays-nested.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "version": "1.4",
+  "version": "3.0",
   "category": "encode",
   "description": "Nested and mixed array encoding - arrays of arrays, mixed type arrays, root arrays",
   "tests": [

package/tests/fixtures/encode/arrays-objects.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "version": "1.4",
+  "version": "3.0",
   "category": "encode",
   "description": "Arrays of objects encoding - list format for non-uniform objects and complex structures",
   "tests": [
@@ -47,7 +47,7 @@
           { "matrix": [[1, 2], [3, 4]], "name": "grid" }
         ]
       },
-      "expected": "items[1]:\n  - matrix[2]:\n    - [2]: 1,2\n    - [2]: 3,4\n    name: grid",
+      "expected": "items[1]:\n  - matrix[2]:\n      - [2]: 1,2\n      - [2]: 3,4\n    name: grid",
       "specSection": "10"
     },
     {
@@ -57,8 +57,9 @@
           { "users": [{ "id": 1, "name": "Ada" }, { "id": 2, "name": "Bob" }], "status": "active" }
         ]
       },
-      "expected": "items[1]:\n  - users[2]{id,name}:\n    1,Ada\n    2,Bob\n    status: active",
-      "specSection": "10"
+      "expected": "items[1]:\n  - users[2]{id,name}:\n      1,Ada\n      2,Bob\n    status: active",
+      "specSection": "10",
+      "note": "YAML-style encoding for list-item objects with tabular array as first field"
     },
     {
       "name": "uses list format for nested object arrays with mismatched keys",
@@ -67,7 +68,7 @@
           { "users": [{ "id": 1, "name": "Ada" }, { "id": 2 }], "status": "active" }
         ]
       },
-      "expected": "items[1]:\n  - users[2]:\n    - id: 1\n      name: Ada\n    - id: 2\n    status: active",
+      "expected": "items[1]:\n  - users[2]:\n      - id: 1\n        name: Ada\n      - id: 2\n    status: active",
       "specSection": "10"
     },
     {
@@ -97,12 +98,22 @@
       "specSection": "10"
     },
     {
-      "name": "places first field of nested tabular arrays on hyphen line",
+      "name": "uses canonical encoding for multi-field list-item objects with tabular arrays",
       "input": {
         "items": [{ "users": [{ "id": 1 }, { "id": 2 }], "note": "x" }]
       },
-      "expected": "items[1]:\n  - users[2]{id}:\n    1\n    2\n    note: x",
-      "specSection": "10"
+      "expected": "items[1]:\n  - users[2]{id}:\n      1\n      2\n    note: x",
+      "specSection": "10",
+      "note": "Tabular header on hyphen line with rows at depth +2 and sibling fields at depth +1"
+    },
+    {
+      "name": "uses canonical encoding for single-field list-item tabular arrays",
+      "input": {
+        "items": [{ "users": [{ "id": 1, "name": "Ada" }, { "id": 2, "name": "Bob" }] }]
+      },
+      "expected": "items[1]:\n  - users[2]{id,name}:\n      1,Ada\n      2,Bob",
+      "specSection": "10",
+      "note": "Tabular header on hyphen line with rows at depth +2"
     },
     {
       "name": "places empty arrays on hyphen line when first",
@@ -112,6 +123,15 @@
       "expected": "items[1]:\n  - data[0]:\n    name: x",
       "specSection": "10"
     },
+    {
+      "name": "encodes empty object list items as bare hyphen",
+      "input": {
+        "items": ["first", "second", {}]
+      },
+      "expected": "items[3]:\n  - first\n  - second\n  -",
+      "specSection": "10",
+      "note": "Empty object list items encode as a single \"-\" line at the list-item depth"
+    },
     {
       "name": "uses field order from first object for tabular headers",
       "input": {