| Johann Schopplich | 67169f6f9f | docs: switch benchmark order | 2025-11-09 11:38:14 +01:00 |
| Johann Schopplich | b4655b01af | chore(benchmarks): fix CSV question count in accuracy reports | 2025-11-07 21:31:15 +01:00 |
| Johann Schopplich | acca69c64a | chore(benchmarks): replace LLM-as-judge, new structural validation | 2025-11-07 21:28:21 +01:00 |
| Johann Schopplich | c6ba6446f5 | chore(benchmarks): finalize structure-awareness run | 2025-11-07 10:33:46 +01:00 |
| Johann Schopplich | 89df613059 | chore(benchmarks): add structure-awareness questions | 2025-11-07 09:03:51 +01:00 |
| Johann Schopplich | 54433de930 | chore: split token efficiency benchmark into mixed/flat tracks | 2025-11-06 22:17:18 +01:00 |
| Johann Schopplich | e22884308b | chore(benchmarks): fix undefined in GitHub question generation | 2025-11-06 16:06:31 +01:00 |
| Johann Schopplich | a9d52fc69b | chore: more work on benchmarks | 2025-11-06 15:51:31 +01:00 |
| Johann Schopplich | bc711ccecf | test(benchmark): overhaul generation | 2025-11-06 14:45:44 +01:00 |
| Johann Schopplich | af17efe128 | docs: add accuracy per 1k tokens report (closes #72) | 2025-11-05 08:21:57 +01:00 |
| Johann Schopplich | 3472081b40 | docs: clarify CSV vs TOON use cases | 2025-11-04 18:12:19 +01:00 |
| Johann Schopplich | c1527dcf80 | chore: fix type issue | 2025-11-02 18:34:00 +01:00 |
| Johann Schopplich | 8977c8c7d6 | feat: use language-agnostic test suite | 2025-11-02 18:31:06 +01:00 |
| Johann Schopplich | 5f09a14c61 | chore: fix type issues | 2025-11-01 17:15:37 +01:00 |
| Johann Schopplich | 753ee2cefd | docs: add table of contents | 2025-10-31 08:56:42 +01:00 |
| Johann Schopplich | 7317b869b1 | docs: update benchmark README | 2025-10-30 17:38:00 +01:00 |
| Johann Schopplich | 983728e913 | refactor: progress bar configuration | 2025-10-30 15:24:22 +01:00 |
| Johann Schopplich | fb43bdf527 | docs: adjust padding for benchmark comparison | 2025-10-30 15:19:16 +01:00 |
| Johann Schopplich | 2c4f3c4362 | test: add benchmarks for compact vs. pretty JSON | 2025-10-30 15:02:51 +01:00 |
| Johann Schopplich | 38ea864763 | docs: clarify TOON's advantages and optimal data structure | 2025-10-29 19:04:04 +01:00 |
| Johann Schopplich | 45604b06e8 | feat: decode method (#10) | 2025-10-29 07:42:15 +01:00 |
| Johann Schopplich | 7db91398fe | docs(benchmark): add YAML format support | 2025-10-29 06:42:40 +01:00 |
| Johann Schopplich | e757746351 | docs(accuracy): highlight toon in perf table | 2025-10-28 23:08:47 +01:00 |
| Johann Schopplich | ecf578a7dc | text(accuracy): add Grok-4-fast, remove default temperature | 2025-10-28 22:54:00 +01:00 |
| Johann Schopplich | 67c0df8cb0 | docs: overhaul retrieval accuracy benchmark | 2025-10-28 20:22:43 +01:00 |
| Johann Schopplich | 52dc9c4b3f | docs: clarify retrieval accuracy metrics | 2025-10-28 08:39:43 +01:00 |
| Johann Schopplich | cdd4a20c67 | refactor: benchmarks code style | 2025-10-28 08:02:57 +01:00 |
| Johann Schopplich | 352e936370 | docs: update notes & limitations guide | 2025-10-28 07:44:35 +01:00 |
| Johann Schopplich | 8b9924ff05 | refactor: token efficiency benchmark code | 2025-10-28 07:42:49 +01:00 |
| Johann Schopplich | b839d35ad0 | docs: how the benchmarks work section | 2025-10-27 20:35:43 +01:00 |
| Johann Schopplich | 4ec7e84f5f | refactor: shared utils for benchmark scripts | 2025-10-27 17:37:27 +01:00 |
| Johann Schopplich | 7b76acde31 | docs: add benchmarks for gemini-2.5-flash | 2025-10-27 16:02:51 +01:00 |
| Johann Schopplich | 77696ce932 | docs: benchmarks for XML format | 2025-10-27 14:50:26 +01:00 |
| Johann Schopplich | b9f54ba585 | docs: update benchmark reports' readability | 2025-10-27 14:18:37 +01:00 |
| Johann Schopplich | 05b3d43023 | test: refactor accuracy benchmark generation | 2025-10-27 14:07:20 +01:00 |
| Johann Schopplich | 1a5e6199ac | test: update retrieval accuracy benchmarks | 2025-10-27 13:45:48 +01:00 |
| Johann Schopplich | b2c58d2b97 | chore: fix linting issues | 2025-10-27 11:49:40 +01:00 |
| Johann Schopplich | 3c840259fe | test: add LLM retrieval accuracy tests | 2025-10-27 11:48:33 +01:00 |