Background
A community .NET implementation ships four features not in the current v3.0 spec. We are proposing them for inclusion in v4.0 following the RFC process described in CONTRIBUTING.md.
Proposed Features
1. Mixed Columnar Arrays (objectArrayLayout: "columnar") — MAJOR
Object arrays with both primitive and complex fields are currently forced into the expanded list form (§9.4), losing the token efficiency of the tabular header. This proposal adds §9.3.2: a columnar header carries the primitive fields, and complex fields are emitted as spill lines after each row.
Before (v3.x, expanded list):
```
users[2]:
  - id: 1
    name: Alice
    address:
      city: NY
  - id: 2
    name: Bob
    address:
      city: LA
```
After (v4.0 columnar):
```
users[2]{id,name}:
  1,Alice
    address:
      city: NY
  2,Bob
    address:
      city: LA
```
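Token savings scale with the array length and the number of primitive columns, because each primitive field name is written once in the header instead of once per row. The existing header syntax is reused unchanged; the only new rule is that lines at depth +2 relative to the array header (one level deeper than the row itself) that follow a row belong to that row's object.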
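To make the spill-line rule concrete, here is a minimal Python sketch of an encoder for the proposed layout. It is illustrative only and is not taken from the .NET implementation or the draft spec text: the function names (`encode_columnar`, `encode_nested`, `format_scalar`, `is_primitive`) are hypothetical, two-space indentation is assumed, every row is assumed to have the same keys, complex fields are assumed to be nested objects, and TOON's quoting and escaping rules are omitted.

```python
def is_primitive(value):
    """Scalars go in the header row; dicts and lists spill below it."""
    return value is None or isinstance(value, (str, int, float, bool))


def format_scalar(value):
    # Simplified: TOON quoting and escaping rules are NOT reproduced here.
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_nested(obj, depth, indent="  "):
    """Emit a nested object as `key: value` lines starting at `depth`."""
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            lines.append(f"{indent * depth}{key}:")
            lines.extend(encode_nested(value, depth + 1, indent))
        elif isinstance(value, list):
            # Nested arrays are out of scope for this sketch.
            raise NotImplementedError("nested arrays are not handled")
        else:
            lines.append(f"{indent * depth}{key}: {format_scalar(value)}")
    return lines


def encode_columnar(name, rows, indent="  "):
    """Encode a uniform list of dicts in the proposed columnar layout."""
    # Assumption: the first row is representative of every row's shape.
    columns = [k for k, v in rows[0].items() if is_primitive(v)]
    lines = [f"{name}[{len(rows)}]{{{','.join(columns)}}}:"]
    for row in rows:
        # The row itself sits at depth 1 relative to the header.
        lines.append(indent + ",".join(format_scalar(row[c]) for c in columns))
        # Complex fields become spill lines at depth 2, directly after the row.
        spill = {k: v for k, v in row.items() if k not in columns}
        lines.extend(encode_nested(spill, 2, indent))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "address": {"city": "NY"}},
    {"id": 2, "name": "Bob", "address": {"city": "LA"}},
]
print(encode_columnar("users", users))
```

Under these assumptions the sketch prints the header `users[2]{id,name}:` followed by each row and its `address` spill lines, matching the shape of the "After" example.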
2. `ignoreNullOrEmpty` (boolean, default `true`) — MINOR standalone, bundled here
Suppress null and empty-string fields from output. In columnar arrays, suppress entire columns where every row's value is null or empty string. Lossy; documented as such in §13.6.
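The two suppression rules (per field, and per column in columnar arrays) can be sketched in Python as follows. The helper names `is_blank`, `drop_blank_fields`, and `live_columns` are hypothetical and not part of the proposal; the sketch also assumes that only `null` and the empty string count as blank, so `0` and `false` are kept.

```python
def is_blank(value):
    """True for null (None) and the empty string only."""
    return value is None or value == ""


def drop_blank_fields(obj):
    """Per-object rule: omit fields whose value is null or ''."""
    return {k: v for k, v in obj.items() if not is_blank(v)}


def live_columns(rows, columns):
    """Columnar rule: keep a column only if some row has a non-blank value."""
    return [c for c in columns if any(not is_blank(r.get(c)) for r in rows)]


record = {"id": 1, "nick": "", "email": None, "score": 0}
print(drop_blank_fields(record))

rows = [{"id": 1, "nick": ""}, {"id": 2, "nick": None}]
print(live_columns(rows, ["id", "nick"]))
```

Because a suppressed field is indistinguishable from one that was never present, a decoder cannot restore the original `null` or `""`, which is why the option is described as lossy.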
3. `excludeEmptyArrays` (boolean, default `true`) — MINOR standalone, bundled here
Suppress zero-length array fields from output. Lossy; documented as such in §13.7.
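A short Python sketch shows why this option is lossy. The function name `drop_empty_arrays` is hypothetical, and applying the rule recursively to nested objects is an assumption of this sketch rather than something stated in the proposal.

```python
def drop_empty_arrays(value):
    """Recursively remove fields whose value is a zero-length list."""
    if isinstance(value, dict):
        return {
            k: drop_empty_arrays(v)
            for k, v in value.items()
            if not (isinstance(v, list) and len(v) == 0)
        }
    if isinstance(value, list):
        return [drop_empty_arrays(item) for item in value]
    return value


with_empty = {"id": 1, "tags": [], "profile": {"name": "A", "langs": []}}
without = {"id": 1, "profile": {"name": "A"}}

# Two different inputs collapse to the same output, so the round trip is lossy.
print(drop_empty_arrays(with_empty) == drop_empty_arrays(without))
```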
4. Binary/byte array guidance (non-normative) — Appendix G.6
Guidance for typed language implementations encoding `byte[]` / binary data: Base64 string (recommended default) or numeric array. Both are valid TOON.
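The two representations can be compared with a small Python sketch. The function `encode_bytes` and its `as_numeric_array` flag are hypothetical names used for illustration; only the standard-library `base64` module is assumed.

```python
import base64


def encode_bytes(data, as_numeric_array=False):
    """Return the value an encoder would emit for binary data.

    Default: a Base64 string. Alternative: a list of integers 0-255.
    """
    if as_numeric_array:
        return list(data)
    return base64.b64encode(data).decode("ascii")


payload = b"\x00\x01\xff"
print(encode_bytes(payload))
print(encode_bytes(payload, as_numeric_array=True))
```

Base64 emits roughly four characters per three bytes, while the numeric form emits up to three digits plus a delimiter per byte, which is one likely reason for preferring Base64 as the default. Note that a decoder cannot tell a Base64 string from an ordinary string without type information from the host language.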
Draft PR
Full spec text (§9.3.2, §13.5–13.7, Appendix G.6), updated conformance checklists, and 24 new test fixtures are in the draft PR: #47
Discussion Questions
- Decoder auto-detection vs. explicit option — should v4.0 decoders always handle columnar output automatically (detect via spill-line presence after first row), or should a decoder option be required?
- Lossy defaults — should `ignoreNullOrEmpty` and `excludeEmptyArrays` default to `false` for safer round-trips by default?
- Splitting MINOR items — should `ignoreNullOrEmpty` and `excludeEmptyArrays` be extracted into a separate v3.1 PR since they function independently of the columnar layout?
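For the first question, auto-detection as described (checking for a spill line after the first row) could look like the following Python sketch. The names `indent_of` and `has_spill_after_first_row` are hypothetical, and the sketch assumes space-based indentation and that the caller has already isolated the header line and the array body.

```python
def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def has_spill_after_first_row(block_lines):
    """block_lines: the array header line followed by its body lines.

    Returns True when the line after the first row is indented deeper
    than the row itself, i.e. it looks like a spill line.
    """
    if len(block_lines) < 3:
        return False
    row_indent = indent_of(block_lines[1])
    return indent_of(block_lines[2]) > row_indent


columnar = [
    "users[2]{id,name}:",
    "  1,Alice",
    "    address:",
    "      city: NY",
    "  2,Bob",
    "    address:",
    "      city: LA",
]
tabular = ["users[2]{id,name}:", "  1,Alice", "  2,Bob"]

print(has_spill_after_first_row(columnar))
print(has_spill_after_first_row(tabular))
```

One point that may be worth weighing in the discussion: a check limited to the first row would miss an array whose first object has no complex fields left to spill (for example, after lossy suppression), so a decoder might need to accept spill lines after any row.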
Specification Principles Addressed
- Token Efficiency — columnar form recovers tabular header compactness for mixed-field arrays
- LLM-Friendly Structure — field names stay in the header; structure is explicit and length-declared
- Simplicity — reuses existing header syntax; the only new rule is spill-line depth semantics
- Backward Compatibility — `objectArrayLayout="auto"` default preserves all v3.x encoding behavior unchanged