16 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Johann Schopplich | 9a6125424c | docs: update benchmarks for v3 list item syntax | 2025-11-24 16:35:44 +01:00 |
| Johann Schopplich | 11a089bb86 | docs: update token count | 2025-11-24 09:13:22 +01:00 |
| Johann Schopplich | acca69c64a | chore(benchmarks): replace LLM-as-judge, new structural validation | 2025-11-07 21:28:21 +01:00 |
| Johann Schopplich | c6ba6446f5 | chore(benchmarks): finalize structure-awareness run | 2025-11-07 10:33:46 +01:00 |
| Johann Schopplich | 54433de930 | chore: split token efficiency benchmark into mixed/flat tracks | 2025-11-06 22:17:18 +01:00 |
| Johann Schopplich | bc711ccecf | test(benchmark): overhaul generation | 2025-11-06 14:45:44 +01:00 |
| Johann Schopplich | af17efe128 | docs: add accuracy per 1k tokens report (closes #72) | 2025-11-05 08:21:57 +01:00 |
| Johann Schopplich | 3472081b40 | docs: clarify CSV vs TOON use cases | 2025-11-04 18:12:19 +01:00 |
| Johann Schopplich | fb43bdf527 | docs: adjust padding for benchmark comparison | 2025-10-30 15:19:16 +01:00 |
| Johann Schopplich | 2c4f3c4362 | test: add benchmarks for compact vs. pretty JSON | 2025-10-30 15:02:51 +01:00 |
| Johann Schopplich | 7db91398fe | docs(benchmark): add YAML format support | 2025-10-29 06:42:40 +01:00 |
| Johann Schopplich | 67c0df8cb0 | docs: overhaul retrieval accuracy benchmark | 2025-10-28 20:22:43 +01:00 |
| Johann Schopplich | 7b76acde31 | docs: add benchmarks for gemini-2.5-flash | 2025-10-27 16:02:51 +01:00 |
| Johann Schopplich | b9f54ba585 | docs: update benchmark reports' readability | 2025-10-27 14:18:37 +01:00 |
| Johann Schopplich | 1a5e6199ac | test: update retrieval accuracy benchmarks | 2025-10-27 13:45:48 +01:00 |
| Johann Schopplich | 3c840259fe | test: add LLM retrieval accuracy tests | 2025-10-27 11:48:33 +01:00 |