The Token Tax: what your JSON really costs in every LLM prompt

Published June 2026 · 6 min read

Every byte you send to a language model is metered. The prompt is tokenized, the tokens are counted, and you are billed — on every single call. So here is a question most teams never ask: is the data format itself quietly inflating the bill?

It is. And the gap is larger than most people guess. We took one dataset, wrote it out in five common formats, and ran each through the exact tokenizer that GPT-4o and the GPT-5 family use (o200k_base — the same one behind MarkDone's token counter). No estimates, no rules of thumb. Real counts.

The experiment

The dataset is 50 uniform records — the shape almost every app actually sends: an array of objects with the same keys. Each record has an id, name, role, email, an active boolean, and a numeric score. Identical data, five encodings, one tokenizer.
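A rough sketch of how such an experiment could be set up, using only the standard library for the encodings. The role list, the score rule, and the helper names (`make_records`, `encode_all`, `count_tokens`) are illustrative assumptions, not the generator used for the benchmark, so the counts it produces will not match the table exactly. Only three of the five encodings are shown; YAML and TOON need their own encoders.

```python
import csv
import io
import json

# Hypothetical role list: the article does not say how field values were generated.
ROLES = ["designer", "manager", "engineer", "analyst"]


def make_records(n=50):
    """Build n uniform records with the six fields described above."""
    return [
        {
            "id": i,
            "name": f"User {i}",
            "role": ROLES[(i - 1) % len(ROLES)],
            "email": f"user{i}@example.com",
            "active": i % 5 != 0,      # assumed rule, for illustration only
            "score": (i * 7) % 100,    # assumed rule, for illustration only
        }
        for i in range(1, n + 1)
    ]


def encode_all(records):
    """Serialize the same records as pretty JSON, minified JSON, and CSV."""
    payload = {"users": records}

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    for row in records:
        # Write booleans as lowercase true/false instead of Python's True/False.
        writer.writerow(
            {k: (str(v).lower() if isinstance(v, bool) else v) for k, v in row.items()}
        )

    return {
        "json_pretty": json.dumps(payload, indent=2),
        "json_minified": json.dumps(payload, separators=(",", ":")),
        "csv": buf.getvalue(),
    }


def count_tokens(text, encode=None):
    """Count tokens; defaults to o200k_base via the third-party tiktoken package."""
    if encode is None:
        import tiktoken  # pip install tiktoken

        encode = tiktoken.get_encoding("o200k_base").encode
    return len(encode(text))
```

Calling `count_tokens(text)` on each value returned by `encode_all(make_records())` gives one count per format. Passing a different `encode` callable lets you compare against another tokenizer.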

The results

Format                    Tokens    vs. pretty JSON
JSON (pretty, 2-space)    2,529     baseline
YAML                      1,902     −25%
JSON (minified)           1,475     −42%
TOON                        884     −65%
CSV                         779     −69%

50 records, tokenized with o200k_base (GPT-4o / GPT-5 family). Reproduce it with your own data in the token counter.
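The percentage column follows directly from the token counts. As a quick check, here is the arithmetic in a few lines of Python; the dictionary and function names are illustrative.

```python
# Token counts from the results table.
COUNTS = {
    "JSON (pretty, 2-space)": 2529,
    "YAML": 1902,
    "JSON (minified)": 1475,
    "TOON": 884,
    "CSV": 779,
}


def pct_vs_baseline(tokens, baseline):
    """Relative change against the baseline, rounded to a whole percent."""
    return round((tokens - baseline) / baseline * 100)


baseline = COUNTS["JSON (pretty, 2-space)"]
changes = {name: pct_vs_baseline(n, baseline) for name, n in COUNTS.items()}
# changes: YAML -25, JSON (minified) -42, TOON -65, CSV -69
```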

Three things jump out:

Why JSON is so expensive

JSON repeats every key on every record. With 50 rows, the strings "id", "name", "role", "email", "active", and "score" are each tokenized 50 times, along with the braces, quotes, colons, and commas that wrap every value. The result is that the structure costs more than the data: CSV carries the same values in 779 tokens, so roughly 1,750 of pretty JSON's 2,529 tokens are keys, punctuation, and whitespace.

Here is the same data in JSON:

{
  "users": [
    { "id": 1, "name": "User 1", "role": "designer",
      "email": "user1@example.com", "active": true, "score": 7 },
    { "id": 2, "name": "User 2", "role": "manager",
      "email": "user2@example.com", "active": true, "score": 15 }
  ]
}

Every key, every row. The punctuation alone is a meaningful share of the tokens.

And in TOON, which declares the fields once in a header and then streams the rows:

users[50]{id,name,role,email,active,score}:
  1,User 1,designer,user1@example.com,true,7
  2,User 2,manager,user2@example.com,true,15

The header carries the schema; the rows carry only data. Types and structure survive — unlike a bare CSV.
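To make the header-plus-rows idea concrete, here is a minimal sketch of an encoder for that tabular layout. It is not a full TOON implementation: it assumes flat rows with identical keys and rejects any string that would need quoting (commas, double quotes, newlines). The function name `to_toon` is hypothetical.

```python
def to_toon(key, rows):
    """Encode a uniform list of flat dicts as a header line plus one line per row."""
    if not rows:
        raise ValueError("need at least one row")
    fields = list(rows[0])
    if any(list(row) != fields for row in rows):
        raise ValueError("rows must share the same keys in the same order")

    def cell(value):
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            return "true" if value else "false"
        if value is None:
            return "null"
        if isinstance(value, (int, float)):
            return str(value)
        if isinstance(value, str) and not any(c in value for c in ',"\n'):
            return value
        # Quoting and nested values are out of scope for this sketch.
        raise ValueError(f"value not handled by this sketch: {value!r}")

    # Header: key, row count in brackets, field names in braces, then a colon.
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = ["  " + ",".join(cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)
```

Given the two records from the JSON example, `to_toon("users", rows)` produces the same two data lines as above, with `users[2]` in the header because only two rows were passed.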

The money

Token counts are abstract until you attach a price. Take a current flagship model: as of mid-2026, GPT-5.5 lists at $5.00 per million input tokens (GPT-5.4 sits at $2.50). Now imagine an agent or feature that includes this 50-row table in its prompt on every call, 100,000 calls a month:

Format             Tokens/call    Cost / month
JSON (pretty)      2,529          $1,265
JSON (minified)    1,475          $738
TOON                 884          $442

Input tokens only, at $5 / 1M (GPT-5.5 standard, mid-2026). Output and caching are separate.

Switching that one payload from pretty JSON to TOON saves ~$823 every month — close to $9,900 a year — on a single repeated prompt. Scale it across an agent that calls a model in a loop, and the format is no longer a footnote. It is a line item.
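The arithmetic behind those figures is short enough to check by hand. Here is a sketch using the article's own inputs (token counts from the table, 100,000 calls a month, $5 per million input tokens); the function and constant names are illustrative.

```python
PRICE_PER_MILLION_USD = 5.00   # input-token price quoted above
CALLS_PER_MONTH = 100_000


def monthly_cost(tokens_per_call, calls=CALLS_PER_MONTH, price=PRICE_PER_MILLION_USD):
    """Input-token cost in USD for one payload repeated on every call."""
    return tokens_per_call * calls * price / 1_000_000


pretty = monthly_cost(2529)            # 1264.5
minified = monthly_cost(1475)          # 737.5
toon = monthly_cost(884)               # 442.0

monthly_saving = pretty - toon         # 822.5
yearly_saving = monthly_saving * 12    # 9870.0
```

Swap in your own token counts, call volume, and price to see what the same payload costs in your setup.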

When TOON wins (and when it doesn't)

The savings come from repetition. The more your data looks like a table — a uniform array of objects with the same keys — the more TOON's write-the-header-once trick pays off. That is exactly the shape of most records, logs, search results, and database rows you feed a model.
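One way to check whether a payload has that table-like shape before converting it is a small predicate like the one below. This is a heuristic sketch, and `is_uniform_table` is a hypothetical helper name.

```python
def is_uniform_table(data):
    """True for a non-empty list of dicts that share the same keys and hold only scalars."""
    if not isinstance(data, list) or not data:
        return False
    if not all(isinstance(row, dict) for row in data):
        return False
    keys = set(data[0])
    scalars = (str, int, float, bool, type(None))
    return all(
        set(row) == keys and all(isinstance(v, scalars) for v in row.values())
        for row in data
    )
```

Payloads that pass are the ones where a single shared header replaces the most repeated keys.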

For deeply nested, one-off, or highly irregular data, the gap narrows: there is no repeated header to amortize. In those cases, the free win is still there — minify your JSON and stop paying the whitespace tax.
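Minifying needs nothing beyond the standard library. Here is a minimal sketch that parses the JSON text and writes it back without indentation or separator spaces; `minify_json` is an illustrative name.

```python
import json


def minify_json(text):
    """Parse JSON text and re-emit it with no indentation or separator spaces."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)
```

`ensure_ascii=False` keeps non-ASCII characters as they are instead of expanding them into `\uXXXX` escapes.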

Rule of thumb: if your prompt contains a list of similar objects, you are very likely overpaying. Tabular data is where the token tax is highest — and where it is easiest to cut.

Measure your own token tax

Count tokens for any text, or convert JSON to TOON. Both run entirely in your browser, no upload.

New to the format? Start with "What is TOON?" for a plain-English tour, or "How to convert JSON to TOON" for the step-by-step. Comparing config formats instead? See "YAML vs JSON".