docs: update spec page to match latest version and sections

Johann Schopplich
2025-11-25 10:05:12 +01:00
parent b9e3593cd9
commit f882cb1153


@@ -9,75 +9,91 @@ You don't need this page to *use* TOON. It's mainly for implementers and contrib
## Current Version ## Current Version
**Spec v{{ $spec.version }}** (2025-11-24) is the current stable version. **Spec v{{ $spec.version }}** (2025-11-24) is the current published Working Draft. It is stable for implementation but not yet finalized; see "Status of This Document" in the spec for details.
The spec defines a provisional media type and file extension in §18.2: ## Media Type & File Extension
- **Media type:** `text/toon` (provisional, UTF-8 only) The spec defines a provisional media type and file extension in [§18.2](https://github.com/toon-format/spec/blob/main/SPEC.md#182-provisional-media-type):
- **Media type:** `text/toon` (provisional, not yet IANA-registered; UTF-8 only)
- **File extension:** `.toon` - **File extension:** `.toon`
TOON documents are always UTF-8 with LF (`\n`) line endings; the optional `charset` parameter, when present, MUST be `utf-8` per the spec.
## Guided Tour of the Spec ## Guided Tour of the Spec
### Core Concepts ### Core Concepts
**[§1 Terminology and Conventions](https://github.com/toon-format/spec/blob/main/SPEC.md#1-terminology-and-conventions)** [§1 Terminology and Conventions](https://github.com/toon-format/spec/blob/main/SPEC.md#1-terminology-and-conventions):
Defines key terms like "indentation level", "active delimiter", "strict mode", and RFC 2119 keywords (MUST, SHOULD, MAY). Defines key terms like "indentation level", "active delimiter", "strict mode", and RFC 2119 keywords (MUST, SHOULD, MAY).
**[§2 Data Model](https://github.com/toon-format/spec/blob/main/SPEC.md#2-data-model)** [§2 Data Model](https://github.com/toon-format/spec/blob/main/SPEC.md#2-data-model):
Specifies the JSON data model (objects, arrays, primitives), array/object ordering requirements, and canonical number formatting (no exponent notation, no leading/trailing zeros). Specifies the JSON data model (objects, arrays, primitives), array/object ordering requirements, and canonical number formatting (no exponent notation, no leading/trailing zeros).
**[§3 Encoding Normalization](https://github.com/toon-format/spec/blob/main/SPEC.md#3-encoding-normalization-reference-encoder)** [§3 Encoding Normalization](https://github.com/toon-format/spec/blob/main/SPEC.md#3-encoding-normalization-reference-encoder):
Defines how non-JSON types (Date, BigInt, NaN, Infinity, undefined, etc.) are normalized before encoding. Required reading for encoder implementers. Defines how non-JSON types (Date, BigInt, NaN, Infinity, undefined, etc.) are normalized before encoding. Required reading for encoder implementers.
**[§4 Decoding Interpretation](https://github.com/toon-format/spec/blob/main/SPEC.md#4-decoding-interpretation-reference-decoder)** [§4 Decoding Interpretation](https://github.com/toon-format/spec/blob/main/SPEC.md#4-decoding-interpretation-reference-decoder):
Specifies how decoders map text tokens to host values (quoted strings, unquoted primitives, numeric parsing with leading-zero handling). Decoders default to strict mode (`strict = true`) in the reference implementation; strict-mode errors are enumerated in §14. Specifies how decoders map text tokens to host values (quoted strings, unquoted primitives, numeric parsing with leading-zero handling). Decoders default to strict mode (`strict = true`) in the reference implementation; strict-mode errors are enumerated in §14.
### Syntax Rules ### Syntax Rules
**[§5 Concrete Syntax and Root Form](https://github.com/toon-format/spec/blob/main/SPEC.md#5-concrete-syntax-and-root-form)** [§5 Concrete Syntax and Root Form](https://github.com/toon-format/spec/blob/main/SPEC.md#5-concrete-syntax-and-root-form):
Defines TOON's line-oriented, indentation-based notation and how to determine whether the root is an object, array, or primitive. Defines TOON's line-oriented, indentation-based notation and how to determine whether the root is an object, array, or primitive.
**[§6 Header Syntax](https://github.com/toon-format/spec/blob/main/SPEC.md#6-header-syntax-normative)** [§6 Header Syntax](https://github.com/toon-format/spec/blob/main/SPEC.md#6-header-syntax-normative):
Normative ABNF grammar for array headers: `key[N<delim?>]{fields}:`. Specifies bracket segments, delimiter symbols, and field lists. Normative ABNF grammar for array headers: `key[N<delim?>]{fields}:`. Specifies bracket segments, delimiter symbols, and field lists.
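
As a rough illustration of the `key[N<delim?>]{fields}:` shape, the sketch below pulls the parts out of a header line with a regular expression. It is not the normative ABNF from §6: `ArrayHeader` and `parse_header` are hypothetical names, quoted keys and quoted field names are ignored, and it assumes that a missing delimiter symbol means comma.

```python
import re
from typing import List, NamedTuple, Optional


class ArrayHeader(NamedTuple):
    key: Optional[str]            # None for a root-level array header
    length: int                   # the declared N
    delimiter: str                # ",", "\t" or "|"
    fields: Optional[List[str]]   # None when there is no {fields} segment


# Simplified pattern: unquoted key, [N] with an optional tab/pipe symbol,
# optional {fields} segment, then the closing colon.
_HEADER = re.compile(
    r"^(?P<key>[^\[\]{}:\s]*)"
    r"\[(?P<n>\d+)(?P<delim>[\t|]?)\]"
    r"(?:\{(?P<fields>[^{}]*)\})?:"
)


def parse_header(line: str) -> ArrayHeader:
    match = _HEADER.match(line.lstrip(" "))
    if match is None:
        raise ValueError(f"not an array header: {line!r}")
    # Assumption: no delimiter symbol in the brackets means comma.
    delimiter = match["delim"] or ","
    raw_fields = match["fields"]
    fields = raw_fields.split(delimiter) if raw_fields is not None else None
    return ArrayHeader(match["key"] or None, int(match["n"]), delimiter, fields)
```

For example, `parse_header("users[2]{id,name}:")` yields the key `users`, a declared length of 2, the comma delimiter, and the fields `id` and `name`.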
**[§7 Strings and Keys](https://github.com/toon-format/spec/blob/main/SPEC.md#7-strings-and-keys)** [§7 Strings and Keys](https://github.com/toon-format/spec/blob/main/SPEC.md#7-strings-and-keys):
Complete quoting rules (when strings MUST be quoted), escape sequences (only `\\`, `\"`, `\n`, `\r`, `\t` are valid), and key encoding requirements. Complete quoting rules (when strings MUST be quoted), escape sequences (only `\\`, `\"`, `\n`, `\r`, `\t` are valid), and key encoding requirements.
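
The five-escape rule can be sketched as a pair of helpers. This is a minimal illustration, not the reference implementation: `escape_string` and `unescape_string` are hypothetical names, and both operate on the text between the surrounding double quotes.

```python
# Only these five escapes are valid: \\  \"  \n  \r  \t
_UNESCAPE = {"\\": "\\", '"': '"', "n": "\n", "r": "\r", "t": "\t"}
_ESCAPE = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def escape_string(value: str) -> str:
    """Escape a string for use inside a quoted token."""
    return "".join(_ESCAPE.get(ch, ch) for ch in value)


def unescape_string(body: str) -> str:
    """Reverse escape_string; reject any escape outside the valid five."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("unterminated escape sequence")
        nxt = body[i + 1]
        if nxt not in _UNESCAPE:
            raise ValueError(f"invalid escape sequence: \\{nxt}")
        out.append(_UNESCAPE[nxt])
        i += 2
    return "".join(out)
```

Rejecting unknown escapes such as `\u0041` instead of passing them through keeps the decoder consistent with the "only these are valid" wording.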
**[§8 Objects](https://github.com/toon-format/spec/blob/main/SPEC.md#8-objects)** [§8 Objects](https://github.com/toon-format/spec/blob/main/SPEC.md#8-objects):
Object field encoding (key: value), nesting rules, key order preservation, and empty object handling. Object field encoding (key: value), nesting rules, key order preservation, and empty object handling.
**[§9 Arrays](https://github.com/toon-format/spec/blob/main/SPEC.md#9-arrays)** [§9 Arrays](https://github.com/toon-format/spec/blob/main/SPEC.md#9-arrays):
Covers all array forms: primitive (inline), arrays of objects (tabular), mixed/non-uniform (list), and arrays of arrays. Includes tabular detection requirements. Covers all array forms: primitive (inline), arrays of objects (tabular), mixed/non-uniform (list), and arrays of arrays. Includes tabular detection requirements.
**[§10 Objects as List Items](https://github.com/toon-format/spec/blob/main/SPEC.md#10-objects-as-list-items)** [§10 Objects as List Items](https://github.com/toon-format/spec/blob/main/SPEC.md#10-objects-as-list-items):
Indentation rules for objects appearing in list items (first field on hyphen line, nested object rules). Indentation rules for objects appearing in list items (first field on the hyphen line), including the canonical pattern when the first field is a tabular array (header on the hyphen line, rows at depth +2, sibling fields at depth +1).
**[§11 Delimiters](https://github.com/toon-format/spec/blob/main/SPEC.md#11-delimiters)** [§11 Delimiters](https://github.com/toon-format/spec/blob/main/SPEC.md#11-delimiters):
Delimiter scoping (document vs active), delimiter-aware quoting, and parsing rules for comma/tab/pipe delimiters. Delimiter scoping (document vs active), delimiter-aware quoting, and parsing rules for comma/tab/pipe delimiters.
**[§12 Indentation and Whitespace](https://github.com/toon-format/spec/blob/main/SPEC.md#12-indentation-and-whitespace)** [§12 Indentation and Whitespace](https://github.com/toon-format/spec/blob/main/SPEC.md#12-indentation-and-whitespace):
Encoding requirements (consistent spaces, no tabs in indentation, no trailing spaces/newlines) and decoding rules (strict vs non-strict indentation handling). Encoding requirements (consistent spaces, no tabs in indentation, no trailing spaces/newlines) and decoding rules (strict vs non-strict indentation handling).
### Conformance and Validation ### Conformance and Validation
**[§13 Conformance and Options](https://github.com/toon-format/spec/blob/main/SPEC.md#13-conformance-and-options)** [§13 Conformance and Options](https://github.com/toon-format/spec/blob/main/SPEC.md#13-conformance-and-options):
Defines conformance classes (encoder, decoder, validator), required options, and conformance checklists. Defines conformance classes (encoder, decoder, validator), standardized options, and conformance checklists.
**[§13.4 Key Folding and Path Expansion](https://github.com/toon-format/spec/blob/main/SPEC.md#134-key-folding-and-path-expansion)** [§13.4 Key Folding and Path Expansion](https://github.com/toon-format/spec/blob/main/SPEC.md#134-key-folding-and-path-expansion):
Optional encoder feature (key folding) and decoder feature (path expansion) for collapsing/expanding dotted paths. Specifies safety requirements and conflict resolution. Optional encoder feature (key folding) and decoder feature (path expansion) for collapsing/expanding dotted paths, with deep-merge semantics and strict/non-strict conflict resolution.
**[§14 Strict Mode Errors and Diagnostics](https://github.com/toon-format/spec/blob/main/SPEC.md#14-strict-mode-errors-and-diagnostics-authoritative-checklist)** [§14 Strict Mode Errors and Diagnostics](https://github.com/toon-format/spec/blob/main/SPEC.md#14-strict-mode-errors-and-diagnostics-authoritative-checklist):
**Authoritative checklist** of all strict-mode errors: array count mismatches, syntax errors, indentation errors, structural errors, and path expansion conflicts. **Authoritative checklist** of all strict-mode errors: array count mismatches, syntax errors, indentation errors, structural errors, and path expansion conflicts.
### Implementation Guidance ### Implementation Guidance
**[§19 TOON Core Profile](https://github.com/toon-format/spec/blob/main/SPEC.md#19-toon-core-profile-normative-subset)** [§15 Security Considerations](https://github.com/toon-format/spec/blob/main/SPEC.md#15-security-considerations):
Injection risks, quoting rules, and strict-mode checks relevant to security.
[§16 Internationalization](https://github.com/toon-format/spec/blob/main/SPEC.md#16-internationalization):
Unicode handling and locale-independent number formatting.
[§17 Interoperability and Mappings](https://github.com/toon-format/spec/blob/main/SPEC.md#17-interoperability-and-mappings):
JSON/CSV/YAML mappings and conversion guidance.
[§18 IANA Considerations](https://github.com/toon-format/spec/blob/main/SPEC.md#18-iana-considerations):
Media type registration plans and provisional status.
[§19 TOON Core Profile](https://github.com/toon-format/spec/blob/main/SPEC.md#19-toon-core-profile-normative-subset):
Normative subset of the most common, memory-friendly rules. Useful for minimal implementations. Normative subset of the most common, memory-friendly rules. Useful for minimal implementations.
**[Appendix G: Host Type Normalization Examples](https://github.com/toon-format/spec/blob/main/SPEC.md#appendix-g-host-type-normalization-examples-informative)** [Appendix G: Host Type Normalization Examples](https://github.com/toon-format/spec/blob/main/SPEC.md#appendix-g-host-type-normalization-examples-informative):
Non-normative guidance for Go, JavaScript, Python, and Rust implementations on normalizing language-specific types. Non-normative guidance for Go, JavaScript, Python, and Rust implementations on normalizing language-specific types.
**[Appendix C: Test Suite and Compliance](https://github.com/toon-format/spec/blob/main/SPEC.md#appendix-c-test-suite-and-compliance-informative)** [Appendix C: Test Suite and Compliance](https://github.com/toon-format/spec/blob/main/SPEC.md#appendix-c-test-suite-and-compliance-informative):
Reference test suite at [github.com/toon-format/spec/tree/main/tests](https://github.com/toon-format/spec/tree/main/tests) for validating implementations. Reference test suite at [github.com/toon-format/spec/tree/main/tests](https://github.com/toon-format/spec/tree/main/tests) for validating implementations.
## Spec Sections at a Glance ## Spec Sections at a Glance
@@ -89,28 +105,35 @@ Reference test suite at [github.com/toon-format/spec/tree/main/tests](https://gi
| §7 | Strings, keys, quoting, escaping | Implementing string handling | | §7 | Strings, keys, quoting, escaping | Implementing string handling |
| §8-10 | Objects, arrays, list items | Implementing structure encoding | | §8-10 | Objects, arrays, list items | Implementing structure encoding |
| §11-12 | Delimiters, indentation, whitespace | Implementing formatting and validation | | §11-12 | Delimiters, indentation, whitespace | Implementing formatting and validation |
| §13 | Conformance, options, key folding | Implementing options and features | | §13 | Conformance, options, key folding/path expansion | Implementing options and features |
| §14 | Strict-mode errors | Implementing validators | | §14 | Strict-mode errors | Implementing validators |
| §15-18 | Security, i18n, interoperability, media type | Operational and ecosystem considerations |
| §19 | Core profile | Minimal implementations | | §19 | Core profile | Minimal implementations |
| §20-21 | Versioning, extensibility, IP | Long-term stability and licensing |
## Conformance Checklists ## Conformance Checklists
The spec includes three conformance checklists: The spec includes three conformance checklists:
### [Encoder Checklist (§13.1)](https://github.com/toon-format/spec/blob/main/SPEC.md#131-encoder-conformance-checklist) ### Encoder Checklist (§13.1) <sup>[↗ SPEC.md](https://github.com/toon-format/spec/blob/main/SPEC.md#131-encoder-conformance-checklist)</sup>
Key requirements: Key requirements:
- Produce UTF-8 with LF line endings - Produce UTF-8 with LF line endings
- Use consistent indentation (default 2 spaces, no tabs) - Use consistent indentation (default 2 spaces, no tabs)
- Escape only `\\`, `\"`, `\n`, `\r`, `\t` in quoted strings - Escape only `\\`, `\"`, `\n`, `\r`, `\t` in quoted strings; any other escape is invalid
- Quote strings with active delimiter, colon, or structural characters - Quote strings with active delimiter, colon, or structural characters
- Emit array lengths `[N]` matching actual count - Emit array lengths `[N]` matching actual count
- Preserve object key order - Preserve object key order
- Normalize numbers to non-exponential decimal form - Normalize numbers to non-exponential decimal form
- Convert `-0` to `0`, `NaN`/±Infinity to `null` - Convert `-0` to `0`, `NaN`/±Infinity to `null`
- No trailing spaces or trailing newline - No trailing spaces or trailing newline
- When `keyFolding="safe"` is enabled, folding MUST follow §13.4:
- Only fold IdentifierSegment keys (letters/digits/underscores, no dots),
- Do not introduce collisions with existing sibling keys,
- Do not fold segments that would require quoting.
- When `flattenDepth` is set, folding MUST stop at the configured number of segments (§13.4).
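
The number-related items in the encoder checklist (non-exponential decimal form, `-0` to `0`, `NaN`/±Infinity to `null`) could look like this in Python. It is a sketch under assumptions: `canonical_number` is a hypothetical helper, and it relies on `repr()` for the shortest round-tripping digits rather than on any algorithm prescribed by the spec.

```python
import math
from decimal import Decimal


def canonical_number(value: float) -> str:
    """Render a number without exponent notation or trailing zeros."""
    if math.isnan(value) or math.isinf(value):
        return "null"              # NaN and +/-Infinity become null
    if value == 0:
        return "0"                 # also covers -0.0
    # Decimal(repr(x)) keeps the shortest round-trip digits; "f" expands exponents.
    text = format(Decimal(repr(value)), "f")
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text
```

Going through `Decimal` avoids the exponent form that `repr()` alone produces for very large or very small floats.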
### [Decoder Checklist (§13.2)](https://github.com/toon-format/spec/blob/main/SPEC.md#132-decoder-conformance-checklist) ### Decoder Checklist (§13.2) <sup>[↗ SPEC.md](https://github.com/toon-format/spec/blob/main/SPEC.md#132-decoder-conformance-checklist)</sup>
Key requirements: Key requirements:
- Parse array headers per §6 (length, delimiter, fields) - Parse array headers per §6 (length, delimiter, fields)
@@ -119,15 +142,21 @@ Key requirements:
- Type unquoted primitives: true/false/null → booleans/null, numeric → number, else → string - Type unquoted primitives: true/false/null → booleans/null, numeric → number, else → string
- Enforce strict-mode rules when `strict=true` - Enforce strict-mode rules when `strict=true`
- Preserve array order and object key order - Preserve array order and object key order
- When `expandPaths="safe"` is enabled, expand dotted keys into nested objects per §13.4:
- Split on `.`, only expand when all segments are IdentifierSegments,
- Deep-merge overlapping paths (object + object),
- Do not perform element-wise array merges.
- With `expandPaths="safe"` and `strict=true` (default), MUST error on any expansion conflict (§14.5).
- With `expandPaths="safe"` and `strict=false`, MUST apply deterministic last-write-wins (LWW) conflict resolution (§13.4).
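
The path-expansion items above can be sketched as follows. This is an illustration only: `expand_paths` and `_merge` are hypothetical names, the input is an already-decoded flat mapping, and the identifier pattern is an assumption about what counts as an IdentifierSegment (§13.4 is authoritative).

```python
import copy
import re

# Assumption: letters, digits, underscores, not starting with a digit.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _merge(target: dict, key: str, value, strict: bool) -> None:
    if key not in target:
        target[key] = copy.deepcopy(value)   # avoid aliasing the caller's data
        return
    existing = target[key]
    if isinstance(existing, dict) and isinstance(value, dict):
        for sub_key, sub_value in value.items():
            _merge(existing, sub_key, sub_value, strict)   # object + object: deep merge
        return
    if strict:
        raise ValueError(f"path expansion conflict at key {key!r}")
    target[key] = copy.deepcopy(value)       # non-strict: last write wins


def expand_paths(flat: dict, strict: bool = True) -> dict:
    result: dict = {}
    for key, value in flat.items():
        segments = key.split(".")
        if len(segments) > 1 and all(_IDENT.match(s) for s in segments):
            nested = value
            for segment in reversed(segments[1:]):
                nested = {segment: nested}
            _merge(result, segments[0], nested, strict)
        else:
            _merge(result, key, value, strict)   # left as a literal key
    return result
```

Arrays are never merged element-wise here: two arrays at the same path count as a conflict, which raises in strict mode and is resolved by last-write-wins otherwise.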
### [Validator Checklist (§13.3)](https://github.com/toon-format/spec/blob/main/SPEC.md#133-validator-conformance-checklist) ### Validator Checklist (§13.3) <sup>[↗ SPEC.md](https://github.com/toon-format/spec/blob/main/SPEC.md#133-validator-conformance-checklist)</sup>
Validators should verify: Validators should verify:
- Structural conformance (headers, indentation, list markers) - Structural conformance (headers, indentation, list markers)
- Whitespace invariants (no trailing spaces/newlines) - Whitespace invariants (no trailing spaces/newlines)
- Delimiter consistency between headers and rows - Delimiter consistency between headers and rows
- Array length counts match declared `[N]` - Array length counts match declared `[N]`
- All strict-mode requirements - All strict-mode requirements (including path-expansion conflicts when enabled)
## Versioning ## Versioning