toon/docs/cli/index.md
2025-11-21 16:52:34 +01:00

Command Line Interface

The @toon-format/cli package provides a command-line interface for encoding JSON to TOON and decoding TOON back to JSON. Use it to analyze token savings before integrating TOON into your application, or to process JSON data through TOON in shell pipelines using stdin/stdout with tools like curl and jq. The CLI supports token statistics, streaming for large datasets, and all encoding options available in the library.

The CLI is built on top of the @toon-format/toon TypeScript implementation and adheres to the latest specification.

Usage

Without Installation

Use npx to run the CLI without installing:

::: code-group

npx @toon-format/cli input.json -o output.toon
npx @toon-format/cli data.toon -o output.json
echo '{"name": "Ada"}' | npx @toon-format/cli

:::

Global Installation

Or install globally for repeated use:

::: code-group

npm install -g @toon-format/cli
pnpm add -g @toon-format/cli
yarn global add @toon-format/cli

:::

After global installation, use the toon command:

toon input.json -o output.toon

Basic Usage

Auto-Detection

The CLI automatically detects the operation based on file extension:

  • .json files → encode (JSON to TOON)
  • .toon files → decode (TOON to JSON)

When reading from stdin there is no file extension to inspect, so the CLI defaults to encode. Pass --decode to decode instead, or --encode to make the choice explicit.

::: code-group

toon input.json -o output.toon
toon data.toon -o output.json
toon input.json
cat data.json | toon
echo '{"name": "Ada"}' | toon
cat data.toon | toon --decode

:::

Standard Input

Omit the input argument or use - to read from stdin. This enables piping data directly from other commands:

# No argument needed
cat data.json | toon

# Explicit stdin with hyphen (equivalent)
cat data.json | toon -

# Decode from stdin
cat data.toon | toon --decode
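The detection rules above can be summarized in a small shell function. This is only an illustrative sketch, not the CLI's actual implementation: `detect_mode` is a hypothetical helper, and falling back to encode for unrecognized extensions is an assumption that the text above does not confirm.

```shell
# Hypothetical helper: prints "encode" or "decode" for an input argument
# and an optional override flag. Illustrative only.
detect_mode() {
  input="${1:--}"   # a missing argument is treated like "-" (stdin)
  flag="${2:-}"

  # Explicit flags override auto-detection.
  case "$flag" in
    -e|--encode) echo encode; return ;;
    -d|--decode) echo decode; return ;;
  esac

  case "$input" in
    *.toon) echo decode ;;   # .toon files are decoded to JSON
    *.json) echo encode ;;   # .json files are encoded to TOON
    *)      echo encode ;;   # stdin defaults to encode; other extensions: assumption
  esac
}

detect_mode input.json      # encode
detect_mode data.toon       # decode
detect_mode -               # encode
detect_mode - --decode      # decode
```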

Performance

Streaming Output

Both encoding and decoding operations use streaming output, writing incrementally without building the full output string in memory. This makes the CLI efficient for large datasets without requiring additional configuration.

JSON → TOON (Encode)

  • Streams TOON lines to output
  • No full TOON string in memory

TOON → JSON (Decode)

  • Streams JSON tokens to output
  • No full JSON string in memory

# Encode large JSON file with minimal memory usage
toon huge-dataset.json -o output.toon

# Decode large TOON file with minimal memory usage
toon huge-dataset.toon -o output.json

# Process millions of records efficiently via stdin
cat million-records.json | toon > output.toon
cat million-records.toon | toon --decode > output.json

Peak memory usage scales with data depth, not total size. This allows processing arbitrarily large files as long as individual nested structures fit in memory.

::: info Token Statistics
When using the --stats flag with encode, the CLI builds the full TOON string once to compute accurate token counts. For maximum memory efficiency on very large files, omit --stats.
:::

Options

| Option | Description |
| --- | --- |
| `-o, --output <file>` | Output file path (prints to stdout if omitted) |
| `-e, --encode` | Force encode mode (overrides auto-detection) |
| `-d, --decode` | Force decode mode (overrides auto-detection) |
| `--delimiter <char>` | Array delimiter: `,` (comma), `\t` (tab), `\|` (pipe) |
| `--indent <number>` | Indentation size (default: 2) |
| `--stats` | Show token count estimates and savings (encode only) |
| `--no-strict` | Disable strict validation when decoding |
| `--key-folding <mode>` | Key folding mode: `off`, `safe` (default: `off`) |
| `--flatten-depth <number>` | Maximum segments to fold (default: Infinity); requires `--key-folding safe` |
| `--expand-paths <mode>` | Path expansion mode: `off`, `safe` (default: `off`) |

Advanced Examples

Token Statistics

Show token savings when encoding:

toon data.json --stats -o output.toon

This helps you estimate token cost savings before sending data to LLMs.

Example output:

✔ Encoded data.json → output.toon

 Token estimates: ~15,145 (JSON) → ~8,745 (TOON)
✔ Saved ~6,400 tokens (-42.3%)
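The percentage in the example output follows directly from the two estimates: (15,145 - 8,745) / 15,145 ≈ 42.3%. As a rough sketch, the same arithmetic can be reproduced in the shell to compare runs. `savings_percent` is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper: percentage saved when going from $1 tokens (JSON)
# to $2 tokens (TOON), rounded to one decimal place.
savings_percent() {
  awk -v json="$1" -v toon="$2" \
    'BEGIN { printf "%.1f\n", (json - toon) / json * 100 }'
}

savings_percent 15145 8745   # prints 42.3
```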

Alternative Delimiters

TOON supports three delimiters: comma (default), tab, and pipe. Alternative delimiters can provide additional token savings in specific contexts.

::: code-group

toon data.json --delimiter "\t" -o output.toon
toon data.json --delimiter "|" -o output.toon

:::

Tab delimiter example (tab-delimited output first, followed by the default comma-delimited output for comparison):

::: code-group

items[2	]{id	name	qty	price}:
  A1	Widget	2	9.99
  B2	Gadget	1	14.5
items[2]{id,name,qty,price}:
  A1,Widget,2,9.99
  B2,Gadget,1,14.5

:::

Tip

Tab delimiters often tokenize more efficiently than commas and reduce the need for quote-escaping. Use --delimiter "\t" for maximum token savings on large tabular data.

Lenient Decoding

Skip validation for faster processing:

toon data.toon --no-strict -o output.json

Lenient mode (--no-strict) disables strict validation checks like array count matching, indentation multiples, and delimiter consistency. Use this when you trust the input and want faster decoding.
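To make "array count matching" concrete, the sketch below checks that a tabular array header such as items[2]{...}: is followed by exactly the declared number of indented rows. It is a simplified illustration of one kind of strict check, not the CLI's validator: `check_row_count` is a hypothetical helper, and it assumes top-level, comma-delimited tabular arrays with two-space indentation.

```shell
# Hypothetical helper: reads TOON on stdin and reports tabular arrays whose
# declared length does not match the number of rows. Exits 1 on mismatch.
check_row_count() {
  awk '
    function finish() {
      if (declared >= 0 && rows != declared) {
        printf "mismatch: %s declares %d rows, found %d\n", name, declared, rows
        bad = 1
      }
      declared = -1
    }
    BEGIN { declared = -1; bad = 0 }
    # Tabular header at top level, e.g. items[2]{id,name}:
    /^[^ ].*\[[0-9]+\][{].*[}]:$/ {
      finish()
      name = $0; sub(/\[.*/, "", name)
      declared = $0
      sub(/^[^[]*\[/, "", declared); sub(/\].*/, "", declared)
      declared += 0
      rows = 0
      next
    }
    /^  / { if (declared >= 0) rows++; next }   # indented row of the current array
    { finish() }                                # any other line closes the array
    END { finish(); exit bad }
  '
}

# Declared 3 rows, but only 1 present:
printf 'items[3]{id,name}:\n  A1,Widget\n' | check_row_count
# prints: mismatch: items declares 3 rows, found 1
```

With --no-strict, checks of this kind are skipped, so malformed input may decode without an error.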

Stdin Workflows

The CLI integrates seamlessly with Unix pipes and other command-line tools:

# Convert API response to TOON
curl https://api.example.com/data | toon --stats

# Process large dataset
cat large-dataset.json | toon --delimiter "\t" > output.toon

# Chain with jq
jq '.results' data.json | toon > filtered.toon

Key Folding

Collapse nested wrapper chains to reduce tokens (since spec v1.5):

::: code-group

toon input.json --key-folding safe -o output.toon
toon input.json --key-folding safe --flatten-depth 2 -o output.toon

:::

For example, given this input:

{
  "data": {
    "metadata": {
      "items": ["a", "b"]
    }
  }
}

With --key-folding safe, output becomes:

data.metadata.items[2]: a,b

Instead of:

data:
  metadata:
    items[2]: a,b

Path Expansion

Reconstruct nested structure from folded keys when decoding:

toon data.toon --expand-paths safe -o output.json

This pairs with --key-folding safe for lossless round-trips.

Round-Trip Workflow

# Encode with folding
toon input.json --key-folding safe -o compressed.toon

# Decode with expansion (restores original structure)
toon compressed.toon --expand-paths safe -o output.json

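# Verify round-trip (compare parsed JSON rather than raw bytes, so
# whitespace and formatting differences do not show up as changes)
diff <(jq -S . input.json) <(jq -S . output.json)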

Combined Options

Combine multiple options for maximum efficiency:

# Key folding + tab delimiter + stats
toon data.json --key-folding safe --delimiter "\t" --stats -o output.toon