TOON: Token-Oriented Object Notation, A Beginner's Guide
This article introduces Token-Oriented Object Notation (TOON) for engineers, data scientists, and technical enthusiasts who want a compact, LLM-friendly alternative to JSON when exchanging structured data with large language models.
Overview
TOON is a human-readable serialization format that represents JSON-equivalent structures using token-efficient, tabular encodings. It is designed specifically for payloads dominated by uniform arrays of objects, where declaring keys once and streaming value rows can significantly shorten the input a model must consume. TOON aims to preserve the semantics of JSON while optimizing the surface form for tokenized models.
Why It’s Useful
TOON reduces the number of tokens required to describe structured data, which directly lowers processing costs and latency when interacting with LLMs. Since models like GPT and Claude tokenize text rather than parse JSON natively, repeated key names in traditional JSON add unnecessary overhead. By encoding repeated structures in a compact, tabular form, TOON improves both efficiency and interpretability.
This makes TOON particularly advantageous for:
- Batch inference or fine-tuning datasets where large arrays of uniform objects (e.g., conversation logs, telemetry, or feature tables) are common.
- Real-time LLM applications that stream structured data between systems and models under strict token limits.
- Interoperability testing and visualization, since TOON remains human-readable and reversible back to JSON without loss of meaning.
In short, TOON provides a bridge between data-rich engineering contexts and token-constrained language models, making structured exchanges faster, cheaper, and easier to reason about.
Core capabilities
- Token-efficient array encoding: TOON reduces repetition by declaring object keys once in a header and streaming rows of values in the declared order, which shortens sequences for uniform records.
- Header-driven scoping: Each array uses an explicit header that can include the array name, a length marker, and the ordered keys, making row boundaries and expected fields explicit.
- Delimiter and quoting options: The format supports configurable delimiters and quoting so rows can include values with whitespace or delimiter characters without ambiguous parsing.
- Type normalization and mapping: TOON defines how common value types map back to JSON primitives and how non-JSON values are handled to ensure consistent decode behavior.
- Key folding for brevity: For cases where wrapper objects add nesting without useful semantic weight, TOON offers optional key folding (dotted paths) to compress single-key chains while allowing expansion on decode.
- Readable, hand-authorable syntax: The combination of a concise header and CSV-style rows makes small datasets easy to inspect or author by hand while remaining straightforward for automated encoders and decoders.
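To make the key-folding idea from the list above more concrete, here is a minimal Python sketch that collapses chains of single-key wrapper objects into dotted paths and expands them again. The function names `fold_keys` and `expand_keys` are hypothetical and are not part of any TOON SDK. The sketch also assumes that keys never contain a literal dot, and it does not look inside lists, so treat it as an illustration of the concept rather than the behavior defined by the specification.

```python
def fold_keys(obj):
    """Collapse chains of single-key dicts into dotted paths (illustrative only)."""
    if not isinstance(obj, dict):
        return obj
    folded = {}
    for key, value in obj.items():
        path = key
        # Follow the chain while the value is a dict with exactly one key.
        while isinstance(value, dict) and len(value) == 1:
            child_key, value = next(iter(value.items()))
            path = path + "." + child_key
        folded[path] = fold_keys(value)
    return folded


def expand_keys(obj):
    """Rebuild nested dicts from dotted paths (assumes keys contain no literal dots)."""
    if not isinstance(obj, dict):
        return obj
    expanded = {}
    for path, value in obj.items():
        *parents, leaf = path.split(".")
        node = expanded
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = expand_keys(value)
    return expanded


nested = {"data": {"meta": {"items": [1, 2]}}, "name": "demo"}
folded = fold_keys(nested)
print(folded)
print(expand_keys(folded) == nested)
```

Folding only pays off when the wrapper objects carry no information of their own, which is why it is described as optional.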

In-depth example
Start with a simple JSON array of two product objects:
{ "items": [ {"sku":"A1","qty":2,"price":9.99}, {"sku":"B2","qty":1,"price":14.5} ] }
TOON represents the same payload as a single header and two rows:
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
The header names the array, declares its length, and lists keys in order; each row then supplies values in that key order using the chosen delimiter. This pattern removes per-object key names and produces a shorter token stream in many tokenizers and models for uniformly shaped records. Quoting and alternate delimiters can be used if field values contain commas or spaces.
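The header-and-rows pattern described above can be sketched in a few lines of Python. This is a simplified illustration, not the reference implementation: the names `encode_table`, `decode_table`, `format_value`, `parse_value`, and `split_row` are hypothetical, only the comma delimiter is handled, field values must be primitives, and the quoting rule (quote any string that would otherwise be misread) is an assumption that may be stricter or looser than the specification.

```python
import json
import re

# Matches headers such as: items[2]{sku,qty,price}:
HEADER = re.compile(r"^(?P<name>\w+)\[(?P<count>\d+)\]\{(?P<keys>[^}]*)\}:$")


def parse_value(token):
    """Map one cell back to a JSON-compatible value; unparseable tokens stay strings."""
    try:
        return json.loads(token)
    except json.JSONDecodeError:
        return token


def format_value(value):
    """Render one primitive, quoting strings that would be ambiguous when bare."""
    if isinstance(value, (dict, list)):
        raise TypeError("this sketch only handles primitive field values")
    if isinstance(value, str):
        ambiguous = (
            value == ""
            or value != value.strip()
            or any(ch in value for ch in ',"\\\n\r\t:')
            or parse_value(value) != value  # would decode as a number, boolean, or null
        )
        return json.dumps(value, ensure_ascii=False) if ambiguous else value
    return json.dumps(value)  # int, float, bool, None become JSON literals


def split_row(line):
    """Split a row on commas, ignoring commas inside double-quoted cells."""
    cells, current, in_quotes, escaped = [], [], False, False
    for ch in line:
        if escaped:
            escaped = False
        elif ch == "\\" and in_quotes:
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            cells.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    cells.append("".join(current).strip())
    return cells


def encode_table(name, rows):
    """Encode a non-empty list of flat, uniformly keyed dicts as a header plus rows."""
    keys = list(rows[0])
    if any(list(row) != keys for row in rows):
        raise ValueError("every row must have the same keys in the same order")
    lines = [name + "[" + str(len(rows)) + "]{" + ",".join(keys) + "}:"]
    for row in rows:
        lines.append("  " + ",".join(format_value(row[key]) for key in keys))
    return "\n".join(lines)


def decode_table(text):
    """Decode the output of encode_table back into {name: [row dicts]}."""
    header, *body = [line.strip() for line in text.splitlines() if line.strip()]
    match = HEADER.match(header)
    if match is None:
        raise ValueError("unrecognized header: " + header)
    if len(body) != int(match["count"]):
        raise ValueError("row count does not match the declared length")
    keys = match["keys"].split(",")
    rows = [dict(zip(keys, map(parse_value, split_row(line)))) for line in body]
    return {match["name"]: rows}


items = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]
text = encode_table("items", items)
print(text)
print(decode_table(text) == {"items": items})
```

The declared length in the header gives the decoder a cheap consistency check: if a row is missing or duplicated, the count no longer matches and the sketch raises an error instead of silently returning partial data.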
Limitations and constraints
TOON is optimized for uniform arrays of primitive-field objects and is not primarily intended for deeply nested, irregular, or highly heterogeneous JSON documents. In such cases, the verbosity or structural needs of standard JSON or other hierarchical formats may be preferable. Token savings depend on dataset shape, tokenizer, and model; empirical benchmarks are required to quantify benefits for any particular workload. The specification and tooling are evolving; implementations and integrations may change as the project matures.
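As a rough way to see how savings depend on dataset shape, the following Python sketch compares character counts of compact JSON against a header-plus-rows form for different row counts. Character count is only a crude proxy for token count, and the helper names (`json_chars`, `table_chars`, `size_ratio`) are hypothetical; a real benchmark should use the tokenizer of the target model and a real TOON encoder.

```python
import json


def json_chars(name, rows):
    """Length of the compact JSON form (no optional whitespace)."""
    return len(json.dumps({name: rows}, separators=(",", ":")))


def table_chars(name, rows):
    """Length of a header-plus-rows form; strings stay quoted, so the estimate is conservative."""
    keys = list(rows[0])
    header = name + "[" + str(len(rows)) + "]{" + ",".join(keys) + "}:"
    body = ["  " + ",".join(json.dumps(row[key]) for key in keys) for row in rows]
    return len("\n".join([header] + body))


def size_ratio(name, rows):
    """Tabular size divided by JSON size; lower means a larger saving."""
    return table_chars(name, rows) / json_chars(name, rows)


record = {"sku": "A1", "qty": 2, "price": 9.99}
for count in (1, 10, 100):
    rows = [record] * count
    print(count, round(size_ratio("items", rows), 2))
```

The ratio falls as the row count grows because the header cost is paid once while the per-object key names in JSON are paid on every row. For a single record, or for records with differing keys, the advantage shrinks or disappears.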
Closing
TOON provides a pragmatic, schema-like approach to representing tabular JSON data for LLM prompts and model I/O, trading structural repetition for token efficiency while preserving round-trip equivalence to JSON when decoded.
For more information on TOON or any of our IT management services, contact our team at DysrupIT.
References
- TOON GitHub repository — specification, examples, SDKs, and benchmarks: https://github.com/toon-format/toon

