From 38ea864763a549a7999dea468069ef68eb5562a1 Mon Sep 17 00:00:00 2001
From: Johann Schopplich <mail@johannschopplich.com>
Date: Wed, 29 Oct 2025 19:04:04 +0100
Subject: [PATCH] docs: clarify TOON's advantages and optimal data structure

---
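Reviewer note (below the three-dash line, so not part of the commit message): the Notes and Limitations change states that TOON's tabular format requires arrays of objects with identical keys and primitive values only. The sketch below spells that rule out as a check. It is an illustration only: `isUniformArrayOfObjects` and its helpers are hypothetical names, not part of the TOON library, and treating an empty array as non-uniform is an assumption of this sketch.

```typescript
// Hypothetical helpers, not part of the TOON library API.
type Primitive = string | number | boolean | null

function isPrimitive(value: unknown): value is Primitive {
  return value === null || ['string', 'number', 'boolean'].includes(typeof value)
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// True when `data` is a non-empty array whose items are all objects
// with the same key set and only primitive values.
// Assumption: an empty array is reported as non-uniform.
function isUniformArrayOfObjects(data: unknown): boolean {
  if (!Array.isArray(data) || data.length === 0)
    return false

  const first: unknown = data[0]
  if (!isPlainObject(first))
    return false

  const keys = Object.keys(first)
  return data.every((item: unknown) =>
    isPlainObject(item)
    && Object.keys(item).length === keys.length
    && keys.every(key => key in item && isPrimitive(item[key])),
  )
}
```

For example, `isUniformArrayOfObjects([{ id: 1, name: 'Alice' }, { id: 2, name: 'Bob' }])` is `true`, while an array whose items carry a nested object or differing keys yields `false`, which is the case where the README now points readers toward JSON.
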
 README.md                                | 14 ++++++++------
 benchmarks/results/retrieval-accuracy.md |  4 ++--
 benchmarks/src/report.ts                 |  4 ++--
 3 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/README.md b/README.md
index 33080bd..8d23df1 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
 
 **Token-Oriented Object Notation** is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage. It's intended for LLM input, not output.
 
-TOON's sweet spot is **uniform complex objects** – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.
+TOON's sweet spot is **uniform arrays of objects** – multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts. For deeply nested or non-uniform data, JSON may be more efficient.
 
 ## Why TOON?
 
@@ -44,6 +44,8 @@ users[2]{id,name,role}:
 
 ## Benchmarks
 
+The benchmarks test datasets that favor TOON's strengths (uniform tabular data). Real-world performance depends heavily on your data structure.
+
 <!-- automd:file src="./benchmarks/results/token-efficiency.md" -->
 
 ### Token Efficiency
@@ -248,7 +250,7 @@ grok-4-fast-non-reasoning
   csv          █████████░░░░░░░░░░░  45.5% (70/154)
 ```
 
-**Advantage:** TOON achieves **69.2% accuracy** (vs JSON's 65.4%) while using **46.3% fewer tokens**.
+**Key tradeoff:** TOON achieves **69.2% accuracy** (vs JSON's 65.4%) while using **46.3% fewer tokens** on these datasets.
 
 <details>
 <summary><strong>Performance by dataset and model</strong></summary>
@@ -348,7 +350,7 @@ This benchmark tests **LLM comprehension and data retrieval accuracy** across di
 
 #### Datasets Tested
 
-Four datasets designed to test different structural patterns:
+Four datasets designed to test different structural patterns (all contain arrays of uniform objects, TOON's optimal format):
 
 1. **Tabular** (100 employee records): Uniform objects with identical fields – optimal for TOON's tabular format.
 2. **Nested** (50 e-commerce orders): Complex structures with nested customer objects and item arrays.
@@ -812,9 +814,9 @@ By default, the decoder validates input strictly:
 
 ## Notes and Limitations
 
-- Format familiarity matters as much as token count. TOON's tabular format requires arrays of objects with identical keys and primitive values only – when this doesn't hold (due to mixed types, non-uniform objects, or nested structures), TOON switches to list format where JSON can be cheaper at scale.
-  - **TOON** is best for uniform complex (but not deeply nested) objects, especially large arrays of such objects.
-  - **JSON** is best for non-uniform data and deeply nested structures.
+- Format familiarity and structure matter as much as token count. TOON's tabular format requires arrays of objects with identical keys and primitive values only. When this doesn't hold (due to mixed types, non-uniform objects, or nested structures), TOON switches to list format, where JSON can be more efficient at scale.
+  - **TOON excels at:** Uniform arrays of objects (same fields, primitive values), especially large datasets with consistent structure.
+  - **JSON is better for:** Non-uniform data, deeply nested structures, and objects with varying field sets.
 - **Token counts vary by tokenizer and model.** Benchmarks use a GPT-style tokenizer (cl100k/o200k); actual savings will differ with other models (e.g., [SentencePiece](https://github.com/google/sentencepiece)).
 - **TOON is designed for LLM input** where human readability and token efficiency matter. It's **not** a drop-in replacement for JSON in APIs or storage.
 
diff --git a/benchmarks/results/retrieval-accuracy.md b/benchmarks/results/retrieval-accuracy.md
index c2c5fb1..437efcf 100644
--- a/benchmarks/results/retrieval-accuracy.md
+++ b/benchmarks/results/retrieval-accuracy.md
@@ -32,7 +32,7 @@ grok-4-fast-non-reasoning
   csv          █████████░░░░░░░░░░░  45.5% (70/154)
 ```
 
-**Advantage:** TOON achieves **69.2% accuracy** (vs JSON's 65.4%) while using **46.3% fewer tokens**.
+**Key tradeoff:** TOON achieves **69.2% accuracy** (vs JSON's 65.4%) while using **46.3% fewer tokens** on these datasets.
 
 <details>
 <summary><strong>Performance by dataset and model</strong></summary>
@@ -132,7 +132,7 @@ This benchmark tests **LLM comprehension and data retrieval accuracy** across di
 
 #### Datasets Tested
 
-Four datasets designed to test different structural patterns:
+Four datasets designed to test different structural patterns (all contain arrays of uniform objects, TOON's optimal format):
 
 1. **Tabular** (100 employee records): Uniform objects with identical fields – optimal for TOON's tabular format.
 2. **Nested** (50 e-commerce orders): Complex structures with nested customer objects and item arrays.
diff --git a/benchmarks/src/report.ts b/benchmarks/src/report.ts
index 28bcf66..6c33d4b 100644
--- a/benchmarks/src/report.ts
+++ b/benchmarks/src/report.ts
@@ -83,7 +83,7 @@ export function generateMarkdownReport(
 
   // Build summary comparison
   const summaryComparison = toon && json
-    ? `**Advantage:** TOON achieves **${(toon.accuracy * 100).toFixed(1)}% accuracy** (vs JSON's ${(json.accuracy * 100).toFixed(1)}%) while using **${((1 - toon.totalTokens / json.totalTokens) * 100).toFixed(1)}% fewer tokens**.`
+    ? `**Key tradeoff:** TOON achieves **${(toon.accuracy * 100).toFixed(1)}% accuracy** (vs JSON's ${(json.accuracy * 100).toFixed(1)}%) while using **${((1 - toon.totalTokens / json.totalTokens) * 100).toFixed(1)}% fewer tokens** on these datasets.`
     : ''
 
   // Build performance by dataset
@@ -221,7 +221,7 @@ This benchmark tests **LLM comprehension and data retrieval accuracy** across di
 
 #### Datasets Tested
 
-Four datasets designed to test different structural patterns:
+Four datasets designed to test different structural patterns (all contain arrays of uniform objects, TOON's optimal format):
 
 1. **Tabular** (${tabularSize} employee records): Uniform objects with identical fields – optimal for TOON's tabular format.
 2. **Nested** (${nestedSize} e-commerce orders): Complex structures with nested customer objects and item arrays.