Mirror of https://github.com/voson-wang/toon.git, synced 2026-01-29 15:24:10 +08:00
docs: add dedicated docs website
@@ -138,11 +138,11 @@ grok-4-fast-non-reasoning
 | Format | Accuracy | Tokens | Correct/Total |
 | ------ | -------- | ------ | ------------- |
-| `toon` | 62.9% | 8,780 | 83/132 |
-| `csv` | 61.4% | 8,528 | 81/132 |
-| `yaml` | 59.8% | 13,142 | 79/132 |
-| `json-compact` | 55.3% | 11,465 | 73/132 |
-| `json-pretty` | 56.1% | 15,158 | 74/132 |
+| `toon` | 62.9% | 8,779 | 83/132 |
+| `csv` | 61.4% | 8,527 | 81/132 |
+| `yaml` | 59.8% | 13,141 | 79/132 |
+| `json-compact` | 55.3% | 11,464 | 73/132 |
+| `json-pretty` | 56.1% | 15,157 | 74/132 |
 | `xml` | 48.5% | 17,105 | 64/132 |

 ##### Semi-uniform event logs
@@ -273,7 +273,7 @@ grok-4-fast-non-reasoning

 #### What's Being Measured

-This benchmark tests **LLM comprehension and data retrieval accuracy** across different input formats. Each LLM receives formatted data and must answer questions about it (this does **not** test model's ability to generate TOON output).
+This benchmark tests **LLM comprehension and data retrieval accuracy** across different input formats. Each LLM receives formatted data and must answer questions about it. This does **not** test the model's ability to generate TOON output – only to read and understand it.

 #### Datasets Tested
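For orientation only, here is a minimal TypeScript sketch of the kind of harness the "What's Being Measured" paragraph describes: the same data is serialized into each format, the model is asked retrieval questions about it, and accuracy plus input token count are tallied per format. The names `formatData`, `askModel`, and `countTokens` are hypothetical placeholders passed in as parameters, not this repository's actual benchmark code.

```typescript
// Hypothetical sketch of the comprehension benchmark described above.
// All callbacks are assumptions supplied by the caller, not a real API.

type Format = "toon" | "csv" | "yaml" | "json-compact" | "json-pretty" | "xml";

interface Question {
  prompt: string;   // e.g. "How many events have status 'error'?"
  expected: string; // ground-truth answer derived from the raw data
}

async function runBenchmark(
  data: unknown,
  questions: Question[],
  formats: Format[],
  formatData: (data: unknown, format: Format) => string,          // assumed serializer
  askModel: (context: string, prompt: string) => Promise<string>, // assumed LLM call
  countTokens: (text: string) => number,                          // assumed tokenizer
): Promise<void> {
  for (const format of formats) {
    const serialized = formatData(data, format);
    let correct = 0;
    for (const q of questions) {
      // The model only *reads* the formatted data; it never generates TOON.
      const answer = await askModel(serialized, q.prompt);
      if (answer.trim() === q.expected) correct++;
    }
    const accuracy = ((100 * correct) / questions.length).toFixed(1);
    console.log(
      `${format}: ${accuracy}% (${correct}/${questions.length}), ` +
        `${countTokens(serialized)} input tokens`,
    );
  }
}
```

This mirrors how the table above is read: accuracy is correct answers over total questions (e.g. 83/132 ≈ 62.9%), and the token column reflects the size of the serialized input for that format.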