mirror of
https://github.com/voson-wang/toon.git
synced 2026-01-29 23:34:10 +08:00
docs: add note on upcoming retrieval accuracy benchmarks
This commit is contained in:
@@ -4,6 +4,9 @@
|
||||
|
||||
**Token-Oriented Object Notation** is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage.
|
||||
|
||||
In other words, if YAML and CSV had a baby, optimized for LLM contexts.
|
||||
TOON borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.
|
||||
|
||||
> [!TIP]
|
||||
> Wrap your JSON in `encode()` before sending it to LLMs and save ~1/2 of the token cost for structured data!
|
||||
|
||||
@@ -28,6 +31,9 @@ users[2]{id,name,role}:
|
||||
2,Bob,user
|
||||
```
|
||||
|
||||
> [!NOTE]
|
||||
> I built TOON to save tokens when sending large datasets to LLMs at work, where I tend to have uniform arrays of objects that benefit from the tabular format.
|
||||
|
||||
## Key Features
|
||||
|
||||
- 💸 **Token-efficient:** typically 30–60% fewer tokens than JSON
|
||||
@@ -38,6 +44,9 @@ users[2]{id,name,role}:
|
||||
|
||||
## Token Benchmarks
|
||||
|
||||
> [!NOTE]
|
||||
> Benchmarks for LLM accuracy and retrieval are currently in development.
|
||||
|
||||
<!-- automd:file src="./docs/benchmarks.md" -->
|
||||
|
||||
| Example | JSON | TOON | Tokens Saved | Reduction |
|
||||
|
||||
Reference in New Issue
Block a user