chore: split token efficiency benchmark into mixed/flat tracks

Johann Schopplich
2025-11-06 22:17:18 +01:00
parent e22884308b
commit 54433de930
13 changed files with 567 additions and 1830 deletions

@@ -1,79 +1,81 @@
## Mixed-Structure Track
#### Mixed-Structure Track
Datasets with nested or semi-uniform structures. CSV excluded as it cannot properly represent these structures.
```
🛒 E-commerce orders with nested structures [eligibility: 33%]
toon ▓▓▓▓▓▓▓▓▓▓▓▓░░░░░░░░ 58,528 tokens
vs JSON (37.9%) 94,207
vs JSON compact (+0.9%) 57,979
vs YAML (17.8%) 71,223
vs XML (45.2%) 106,720
🛒 E-commerce orders with nested structures ┊ Tabular: 33%
TOON █████████████░░░░░░░ 72,743 tokens
├─ vs JSON (33.1%) 108,731 tokens
├─ vs JSON compact (+5.5%) 68,936 tokens
├─ vs YAML (14.1%) 84,724 tokens
└─ vs XML (40.5%) 122,313 tokens
🧾 Semi-uniform event logs [eligibility: 50%]
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░ 154,419 tokens
vs JSON (15.0%) 181,592
vs JSON compact (+19.9%) 128,836
vs YAML (0.9%) 155,749
vs XML (25.1%) 206,271
🧾 Semi-uniform event logs ┊ Tabular: 50%
TOON █████████████████░░░ 153,223 tokens
├─ vs JSON (15.0%) 180,196 tokens
├─ vs JSON compact (+19.9%) 127,740 tokens
├─ vs YAML (0.8%) 154,514 tokens
└─ vs XML (25.2%) 204,800 tokens
🧩 Deeply nested configuration [eligibility: 0%]
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░░ 630 tokens
vs JSON (31.4%) 918
vs JSON compact (+11.9%) 563
vs YAML (6.4%) 673
vs XML (37.4%) 1,007
🧩 Deeply nested configuration ┊ Tabular: 0%
TOON ██████████████░░░░░░ 631 tokens
├─ vs JSON (31.3%) 919 tokens
├─ vs JSON compact (+11.9%) 564 tokens
├─ vs YAML (6.2%) 673 tokens
└─ vs XML (37.4%) 1,008 tokens
─────────────────────────────────────────────────────────────────────────────────
Total
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░ 213,577 tokens
vs JSON (22.8%) 276,717
vs JSON compact (+14.0%) 187,378
vs YAML (6.2%) 227,645
vs XML (32.0%) 313,998
──────────────────────────────────── Total ────────────────────────────────────
TOON ████████████████░░░░ 226,597 tokens
├─ vs JSON (21.8%) 289,846 tokens
├─ vs JSON compact (+14.9%) 197,240 tokens
├─ vs YAML (5.5%) 239,911 tokens
└─ vs XML (30.9%) 328,121 tokens
```
## Flat-Only Track
#### Flat-Only Track
Datasets with flat tabular structures where CSV is applicable.
```
👥 Uniform employee records (TOON optimal format) [eligibility: 100%]
csv ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░ 46,968 tokens
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 49,841 tokens (+5.8% vs CSV)
vs JSON (60.7%) 126,886
vs JSON compact (36.8%) 78,882
vs YAML (50.0%) 99,743
vs XML (66.0%) 146,465
👥 Uniform employee records ┊ Tabular: 100%
CSV ███████████████████░ 46,956 tokens
TOON ████████████████████ 49,827 tokens (+6.1% vs CSV)
├─ vs JSON (60.7%) 126,854 tokens
├─ vs JSON compact (36.8%) 78,850 tokens
├─ vs YAML (50.0%) 99,701 tokens
└─ vs XML (66.0%) 146,440 tokens
📈 Time-series analytics data [eligibility: 100%]
csv ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░ 8,382 tokens
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 9,114 tokens (+8.0% vs CSV)
vs JSON (59.0%) 22,244
vs JSON compact (35.9%) 14,210
vs YAML (49.0%) 17,857
vs XML (65.8%) 26,615
📈 Time-series analytics data ┊ Tabular: 100%
CSV ██████████████████░░ 8,396 tokens
TOON ████████████████████ 9,128 tokens (+8.7% vs CSV)
├─ vs JSON (59.0%) 22,258 tokens
├─ vs JSON compact (35.8%) 14,224 tokens
├─ vs YAML (48.9%) 17,871 tokens
└─ vs XML (65.7%) 26,629 tokens
⭐ Top 100 GitHub repositories [eligibility: 100%]
csv ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░ 8,513 tokens
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 8,745 tokens (+2.7% vs CSV)
vs JSON (42.3%) 15,145
vs JSON compact (23.7%) 11,455
vs YAML (33.4%) 13,129
vs XML (48.8%) 17,095
⭐ Top 100 GitHub repositories ┊ Tabular: 100%
CSV ███████████████████░ 8,513 tokens
TOON ████████████████████ 8,745 tokens (+2.7% vs CSV)
├─ vs JSON (42.3%) 15,145 tokens
├─ vs JSON compact (23.7%) 11,455 tokens
├─ vs YAML (33.4%) 13,129 tokens
└─ vs XML (48.8%) 17,095 tokens
─────────────────────────────────────────────────────────────────────────────────
Total
csv ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░ 63,863 tokens
toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 67,700 tokens (+5.7% vs CSV)
vs JSON (58.8%) 164,275
vs JSON compact (35.2%) 104,547
vs YAML (48.2%) 130,729
vs XML (64.4%) 190,175
──────────────────────────────────── Total ────────────────────────────────────
CSV ███████████████████░ 63,865 tokens
TOON ████████████████████ 67,700 tokens (+6.0% vs CSV)
├─ vs JSON (58.8%) 164,257 tokens
├─ vs JSON compact (35.2%) 104,529 tokens
├─ vs YAML (48.2%) 130,701 tokens
└─ vs XML (64.4%) 190,164 tokens
```
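The percentages in the two tracks appear to be token counts relative to each baseline: an unsigned figure such as `(58.8%)` reads as a saving, and a signed figure such as `(+6.0%)` as an overhead. The following is a minimal sketch of that arithmetic under this assumption. The function names `relative_change` and `format_label` are hypothetical and are not taken from the benchmark code.

```python
def relative_change(tokens: int, baseline: int) -> float:
    """Percent change of `tokens` relative to `baseline` (negative means fewer tokens)."""
    return (tokens - baseline) / baseline * 100.0


def format_label(tokens: int, baseline: int) -> str:
    """Render the change the way the report seems to: savings unsigned, overhead with '+'."""
    change = relative_change(tokens, baseline)
    if change < 0:
        return f"{-change:.1f}%"
    return f"+{change:.1f}%"


# Flat-only track totals from the updated report.
toon_total, csv_total, json_total = 67_700, 63_865, 164_257

print(format_label(toon_total, json_total))  # 58.8%  (saving vs JSON)
print(format_label(toon_total, csv_total))   # +6.0%  (overhead vs CSV)
```

Run against the updated totals, this reproduces the `58.8%` and `+6.0%` figures shown in the Flat-Only Track.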
<details>
<summary><strong>View detailed examples</strong></summary>
@@ -81,64 +83,64 @@ toon ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
**Savings:** 13,130 tokens (59.0% reduction vs JSON)
**JSON** (22,244 tokens):
**JSON** (22,258 tokens):
```json
{
"metrics": [
{
"date": "2025-01-01",
"views": 4324,
"clicks": 146,
"conversions": 21,
"revenue": 3834.57,
"bounceRate": 0.4
"views": 7708,
"clicks": 595,
"conversions": 69,
"revenue": 15369.93,
"bounceRate": 0.35
},
{
"date": "2025-01-02",
"views": 6248,
"clicks": 407,
"conversions": 22,
"revenue": 2936.12,
"bounceRate": 0.62
"views": 5894,
"clicks": 381,
"conversions": 21,
"revenue": 2112.12,
"bounceRate": 0.3
},
{
"date": "2025-01-03",
"views": 7382,
"clicks": 270,
"conversions": 24,
"revenue": 6825.19,
"bounceRate": 0.7
"views": 6835,
"clicks": 422,
"conversions": 35,
"revenue": 4525.73,
"bounceRate": 0.5
},
{
"date": "2025-01-04",
"views": 4586,
"clicks": 267,
"conversions": 24,
"revenue": 2391.11,
"bounceRate": 0.64
"views": 5325,
"clicks": 305,
"conversions": 22,
"revenue": 2445.3,
"bounceRate": 0.44
},
{
"date": "2025-01-05",
"views": 6171,
"clicks": 227,
"conversions": 12,
"revenue": 3430.1,
"bounceRate": 0.39
"views": 2974,
"clicks": 61,
"conversions": 6,
"revenue": 956.57,
"bounceRate": 0.47
}
]
}
```
**TOON** (9,114 tokens):
**TOON** (9,128 tokens):
```
metrics[5]{date,views,clicks,conversions,revenue,bounceRate}:
2025-01-01,4324,146,21,3834.57,0.4
2025-01-02,6248,407,22,2936.12,0.62
2025-01-03,7382,270,24,6825.19,0.7
2025-01-04,4586,267,24,2391.11,0.64
2025-01-05,6171,227,12,3430.1,0.39
2025-01-01,7708,595,69,15369.93,0.35
2025-01-02,5894,381,21,2112.12,0.3
2025-01-03,6835,422,35,4525.73,0.5
2025-01-04,5325,305,22,2445.3,0.44
2025-01-05,2974,61,6,956.57,0.47
```
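To illustrate why the tabular form is shorter, here is a rough sketch of how a uniform array of objects could be flattened into the header-plus-rows layout shown above. It is not the TOON library's implementation. The function name `encode_uniform_array` is hypothetical, the two-space row indent is an assumption, and quoting or escaping of values is not handled.

```python
def encode_uniform_array(key: str, rows: list[dict], indent: str = "  ") -> str:
    """Encode rows that all share the same keys as `key[N]{fields}:` plus one line per row."""
    if not rows:
        raise ValueError("expected at least one row")

    fields = list(rows[0].keys())
    for row in rows:
        # The tabular layout only works when every row has the same fields in the same order.
        if list(row.keys()) != fields:
            raise ValueError("rows are not uniform")

    # The field names are written once in the header instead of once per row.
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = [header]
    for row in rows:
        # Assumption: values are plain strings or numbers that need no quoting.
        lines.append(indent + ",".join(str(row[name]) for name in fields))
    return "\n".join(lines)


metrics = [
    {"date": "2025-01-01", "views": 7708, "clicks": 595,
     "conversions": 69, "revenue": 15369.93, "bounceRate": 0.35},
    {"date": "2025-01-02", "views": 5894, "clicks": 381,
     "conversions": 21, "revenue": 2112.12, "bounceRate": 0.3},
]
print(encode_uniform_array("metrics", metrics))
```

Because the keys appear only in the header, the per-row cost is close to that of CSV, which is consistent with the small overhead versus CSV reported in the Flat-Only Track.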
---