Commit Graph

14 Commits

Author SHA1 Message Date
Johann Schopplich
840626dc90 feat: minor fixes for spec v1.4 compliance 2025-11-05 19:04:00 +01:00
Wind
e414ca3671 fix: handle empty list items and nested objects in list items (#65)
* fix: support quoted keys with array syntax

Fixes parsing of quoted keys followed by array syntax like:
"x-codeSamples"[1]{lang,label,source}:

Previously, parseArrayHeaderLine would skip any line starting with
a quoted key. This caused large OpenAPI specs (like Hetzner Cloud API)
to fail decoding.

Changes:
- Modified parseArrayHeaderLine to handle quoted keys
- Added logic to find bracket start after closing quote
- Unescape quoted keys properly
- Added 3 test cases for the new functionality

Closes #62
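
The quoted-key handling described above can be illustrated with a minimal sketch. It is not the library's `parseArrayHeaderLine`; the names `parseHeaderSketch`, `findClosingQuote`, `unescapeKey`, and `ArrayHeaderSketch` are hypothetical, and the escape set (`\\`, `\"`, `\n`, `\r`, `\t`) is an assumption. It follows the steps listed in the changes: find the closing quote, unescape the key, then look for the bracket right after it.

```typescript
// Hypothetical sketch, not the actual decoder code.
interface ArrayHeaderSketch {
  key: string
  length: number
  fields: string[]
}

// Returns the index of the closing quote, skipping escaped characters.
function findClosingQuote(line: string, openIndex: number): number {
  let i = openIndex + 1
  while (i < line.length) {
    if (line[i] === '\\') {
      i += 2 // skip the backslash and the escaped character
      continue
    }
    if (line[i] === '"')
      return i
    i++
  }
  return -1
}

// Assumed escape set: \\ \" \n \r \t
function unescapeKey(raw: string): string {
  const map: Record<string, string> = { n: '\n', r: '\r', t: '\t' }
  return raw.replace(/\\(.)/g, (_match: string, ch: string) => map[ch] ?? ch)
}

function parseHeaderSketch(line: string): ArrayHeaderSketch | undefined {
  const trimmed = line.trimStart()
  let key: string
  let bracketStart: number

  if (trimmed.startsWith('"')) {
    // Quoted key: the bracket must directly follow the closing quote.
    const close = findClosingQuote(trimmed, 0)
    if (close === -1 || trimmed[close + 1] !== '[')
      return undefined
    key = unescapeKey(trimmed.slice(1, close))
    bracketStart = close + 1
  }
  else {
    bracketStart = trimmed.indexOf('[')
    if (bracketStart === -1)
      return undefined
    key = trimmed.slice(0, bracketStart)
  }

  const bracketEnd = trimmed.indexOf(']', bracketStart)
  if (bracketEnd === -1)
    return undefined
  const length = Number.parseInt(trimmed.slice(bracketStart + 1, bracketEnd), 10)
  if (Number.isNaN(length))
    return undefined

  // Optional {field,field,...} list, then the mandatory colon.
  let rest = trimmed.slice(bracketEnd + 1)
  let fields: string[] = []
  if (rest.startsWith('{')) {
    const braceEnd = rest.indexOf('}')
    if (braceEnd === -1)
      return undefined
    fields = rest.slice(1, braceEnd).split(',').map(f => f.trim())
    rest = rest.slice(braceEnd + 1)
  }
  if (!rest.startsWith(':'))
    return undefined

  return { key, length, fields }
}

console.log(parseHeaderSketch('"x-codeSamples"[1]{lang,label,source}:'))
```

The sketch ignores delimiter markers inside the brackets and quoted field names, which a full decoder would also need to handle.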

* fix: handle empty list items and nested objects in list items

This commit fixes two critical decoder bugs that prevented complex
OpenAPI specs (like DigitalOcean's 638 schemas) from being decoded:

1. Empty list items: Items encoded as just `-` (without space) were
   not recognized. The decoder only checked for `LIST_ITEM_PREFIX = '- '`.
   Fixed by adding check for both `- ` and `-` patterns.

2. Nested objects in list items: When a list item contains an object
   with nested properties (e.g., `allOf[2]: - properties: state: ...`),
   the decoder was looking for nested content at the wrong depth level.
   List items add one level of indentation, so nested content should be
   at baseDepth + 2, not baseDepth + 1.

   Fixed by creating `decodeKeyValueForListItem()` that correctly handles
   the extra nesting while maintaining proper followDepth for siblings.
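
Both fixes can be sketched in a few lines. This is an illustration under stated assumptions, not the code of `decodeListArray()` or `decodeKeyValueForListItem()`: the helper names (`isListItem`, `listItemBody`, `depthOf`, `nestedDepthForFirstField`) are hypothetical, indentation is assumed to be two spaces per level, and `baseDepth` is taken to mean the depth of the line carrying the `-` marker.

```typescript
// Hypothetical sketch, not the actual decoder code.
const LIST_ITEM_PREFIX = '- '
const LIST_ITEM_MARKER = '-'

// `content` is a line with its indentation already stripped.
// Accept both "- value" and a bare "-" (an empty list item).
function isListItem(content: string): boolean {
  return content === LIST_ITEM_MARKER || content.startsWith(LIST_ITEM_PREFIX)
}

// Text after the marker; empty string for a bare "-".
function listItemBody(content: string): string {
  return content === LIST_ITEM_MARKER ? '' : content.slice(LIST_ITEM_PREFIX.length)
}

// Depth derived from leading spaces (assumes 2 spaces per level).
function depthOf(line: string, indentSize = 2): number {
  const leading = line.length - line.trimStart().length
  return Math.floor(leading / indentSize)
}

// A key on the hyphen line sits one level deeper than the hyphen itself,
// so that key's nested content is expected two levels below the hyphen line.
function nestedDepthForFirstField(itemDepth: number): number {
  return itemDepth + 2
}

const itemLine = '  - properties:'
const nestedLine = '      state: active'

console.log(isListItem(itemLine.trim()))
console.log(listItemBody(itemLine.trim()))
console.log(depthOf(nestedLine) === nestedDepthForFirstField(depthOf(itemLine)))
```

Checking only for `'- '` would reject the bare `-` line, and looking for nested content at `itemDepth + 1` would treat `state` as being indented too far.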

Changes:
- Added `decodeKeyValueForListItem()` function to handle list item nesting
- Updated `decodeObjectFromListItem()` to use new function
- Added empty item detection in `decodeListArray()`
- Added comprehensive unit tests for both bugs
- Added integration test with real DigitalOcean OpenAPI spec (638 schemas)
- Gitignored large fixture files, added README with download instructions

Tests:
- 5 new unit tests in list-item-bugs.test.ts
- 1 integration test in digitalocean-decode.test.ts (skips if fixture missing)
- All 309 existing tests still pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* perf: calculate depth on demand

* chore: move tests to test suite

* chore: test against new tests

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Johann Schopplich <mail@johannschopplich.com>
2025-11-03 08:18:14 +01:00
Johann Schopplich
8977c8c7d6 feat: use language-agnostic test suite 2025-11-02 18:31:06 +01:00
Johann Schopplich
0710bd19e7 feat!: publish to @toon-format/toon and @toon-format/cli 2025-11-01 16:53:41 +01:00
Johann Schopplich
8567df9131 fix(cli): inline tokenx 2025-11-01 00:37:11 +01:00
SangheeSon
2b882870f7 feat(cli): add --stats flag to show token savings (#51)
* feat(cli): add --stats flag to show token efficiency

- Add --stats boolean flag to display token count comparison
- Calculate approximate tokens using char length / 4 heuristic
- Show JSON vs TOON token counts with savings percentage
- Opt-in feature, default behavior unchanged
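
The character-length heuristic from this first iteration can be sketched as follows. The function names and the rounding choice (`Math.ceil`) are assumptions for illustration, not the CLI's actual code; the follow-up commit in this PR replaces the heuristic with tokenx.

```typescript
// Hypothetical sketch of the "char length / 4" estimate; not the CLI's code.
function estimateTokens(text: string): number {
  // Rounding up is an assumption; the commit only states "char length / 4".
  return Math.ceil(text.length / 4)
}

// Percentage of tokens saved by the TOON output relative to the JSON input.
function savingsPercent(jsonTokens: number, toonTokens: number): number {
  if (jsonTokens === 0)
    return 0 // avoid dividing by zero on empty input
  return ((jsonTokens - toonTokens) / jsonTokens) * 100
}

const jsonText = JSON.stringify({ users: [{ id: 1, name: 'Ada' }, { id: 2, name: 'Bob' }] })
const toonText = 'users[2]{id,name}:\n  1,Ada\n  2,Bob'

const jsonTokens = estimateTokens(jsonText)
const toonTokens = estimateTokens(toonText)
console.log(`JSON: ~${jsonTokens} tokens, TOON: ~${toonTokens} tokens`)
console.log(`Savings: ${savingsPercent(jsonTokens, toonTokens).toFixed(1)}%`)
```

A fixed characters-per-token ratio is only a rough estimate, which is presumably why the follow-up commit switched to a dedicated estimator.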

* feat: use tokenx for more accurate estimates

---------

Co-authored-by: Johann Schopplich <mail@johannschopplich.com>
2025-11-01 00:35:54 +01:00
Johann Schopplich
753ee2cefd docs: add table of contents 2025-10-31 08:56:42 +01:00
Johann Schopplich
b93714e9b9 refactor: move gpt-tokenizer to benchmarks 2025-10-30 09:27:11 +01:00
Andreas Partsch
80acc9d4fe feat: add cli (#34)
* feat: add cli for toon

* docs: use npx in the readme

* feat: overhaul and refactor

---------

Co-authored-by: Johann Schopplich <mail@johannschopplich.com>
2025-10-30 08:08:08 +01:00
Johann Schopplich
ecf578a7dc text(accuracy): add Grok-4-fast, remove default temperature 2025-10-28 22:54:00 +01:00
Johann Schopplich
67c0df8cb0 docs: overhaul retrieval accuracy benchmark 2025-10-28 20:22:43 +01:00
Johann Schopplich
77696ce932 docs: benchmarks for XML format 2025-10-27 14:50:26 +01:00
Johann Schopplich
3c840259fe test: add LLM retrieval accuracy tests 2025-10-27 11:48:33 +01:00
Johann Schopplich
f105551c3e chore: initial commit 2025-10-22 20:16:02 +02:00