feat(layout): expose decoder layout metadata behind cargo feature #66

Merged
shreyasbhat0 merged 2 commits into toon-format:main from apardawala:feat/layout-metadata
May 22, 2026
Conversation

@apardawala

Contributor

Summary

Adds an opt-in layout cargo feature exposing structural metadata about how a TOON document was actually written on the wire — tabular vs. list vs. inline arrays, declared [N] lengths, declared field lists, and active delimiters. Useful for downstream tools (validators, linters, formatters) that need to reason about wire shape and not just the decoded value.

API surface (additive only)

// off by default — zero impact on existing callers
pub fn decode_with_layout(input: &str, options: &DecodeOptions)
    -> ToonResult<(serde_json::Value, Layout)>;

#[non_exhaustive] pub struct Layout { /* opaque, JSON-Pointer-keyed */ }
#[non_exhaustive] pub enum NodeLayout { Tabular { .. }, List { .. }, InlineArray { .. } }
#[non_exhaustive] pub struct FieldDescriptor { name, nested }

Layout entries are keyed by JSON Pointer (RFC 6901), with "" for the document root. All public types use #[non_exhaustive] so future additions don't break semver.
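
To make the keying scheme concrete, here is a minimal sketch of how JSON Pointer keys could be built and looked up. It is an illustration only: `NodeShape`, `escape_token`, `push_segment`, and `sample_layout` are hypothetical names, the map is a plain `BTreeMap` rather than the crate's opaque `Layout`, and the only thing taken from RFC 6901 is the escaping rule (`~` becomes `~0`, `/` becomes `~1`) and the empty string as the root pointer.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the crate's `NodeLayout`; the real enum
/// carries fields that are not shown in the PR summary.
#[derive(Debug, Clone, PartialEq)]
enum NodeShape {
    Tabular,
    List,
    InlineArray,
}

/// Escape one reference token per RFC 6901.
/// `~` must be replaced first so that the `~1` produced for `/` is not re-escaped.
fn escape_token(token: &str) -> String {
    token.replace('~', "~0").replace('/', "~1")
}

/// Append one segment to a JSON Pointer. The document root is the empty string.
fn push_segment(pointer: &str, segment: &str) -> String {
    format!("{}/{}", pointer, escape_token(segment))
}

/// Build a small pointer-keyed map by hand (illustrative data, not decoder output).
fn sample_layout() -> BTreeMap<String, NodeShape> {
    let mut layout = BTreeMap::new();
    layout.insert(String::new(), NodeShape::List); // "" is the root
    layout.insert(push_segment("", "users"), NodeShape::Tabular);
    layout.insert(push_segment("", "tags"), NodeShape::InlineArray);
    // A key containing `/` is escaped so it stays a single segment.
    layout.insert(push_segment("", "a/b"), NodeShape::List);
    layout
}
```

A downstream tool would then look up a node by its pointer, for example `sample_layout().get("/users")`, and branch on the shape it finds.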

Scope

Object key-order and key-folding metadata are intentionally deferred to a follow-up to keep this PR small. FieldDescriptor::nested is reserved for forward compatibility with proposals like spec RFC #46 (nested tabular headers) and is always None today.

The interaction with expand_paths is documented on decode_with_layout: pointers reflect pre-expansion structure.

@apardawala apardawala requested a review from a team as a code owner May 13, 2026 23:44
@shreyasbhat0

Member

Nice work on this — the API design is clean and the feature-gating approach is solid.

Before we merge, could you share what your use case is for this? Are you building a formatter, linter, or some other tooling on top of toon-rust that needs the layout metadata? Just want to understand the motivation so we can make sure the API fits real-world needs.

@apardawala

Contributor Author

Thanks Shreyas!

The motivation is schema validation — specifically, I'm working toward a "TOON Schema" sidecar (a small JSON Schema 2020-12 dialect with TOON-specific keywords) so tools can validate things like:

  • declared [N] length matches the actual row/element count
  • tabular fields match a required column list (and in the right order)
  • a field that the schema says should be tabular wasn't emitted as a list, etc.

None of that is recoverable from the decoded Value alone — by the time you've got JSON in hand, you've lost the wire shape. That's why I needed the layout sidecar before I could prototype the validator.
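
As a rough sketch of the three checks listed above, assuming the validator can read a shape, a declared length, and a field list for each array: every type and function here (`Shape`, `ArrayInfo`, `Violation`, `check_array`) is hypothetical and stands in for whatever the layout feature and the future toon-schema crate actually expose.

```rust
/// Wire shapes named in the PR summary. Hypothetical stand-in,
/// not the crate's `NodeLayout`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shape {
    Tabular,
    List,
    InlineArray,
}

/// Hypothetical per-array record a validator might read from layout metadata.
struct ArrayInfo {
    shape: Shape,
    declared_len: usize,
    fields: Vec<String>, // empty unless the array was written as tabular
}

#[derive(Debug, PartialEq)]
enum Violation {
    WrongShape { expected: Shape, found: Shape },
    LengthMismatch { declared: usize, actual: usize },
    FieldMismatch { expected: Vec<String>, found: Vec<String> },
}

/// Run the three checks against one array and collect every violation.
fn check_array(
    info: &ArrayInfo,
    actual_len: usize,
    expected_shape: Shape,
    required_fields: &[&str],
) -> Vec<Violation> {
    let mut out = Vec::new();
    if info.shape != expected_shape {
        out.push(Violation::WrongShape { expected: expected_shape, found: info.shape });
    }
    if info.declared_len != actual_len {
        out.push(Violation::LengthMismatch { declared: info.declared_len, actual: actual_len });
    }
    // Order-sensitive comparison: same names in the same positions.
    let same_fields = info
        .fields
        .iter()
        .map(String::as_str)
        .eq(required_fields.iter().copied());
    if expected_shape == Shape::Tabular && !same_fields {
        out.push(Violation::FieldMismatch {
            expected: required_fields.iter().map(|s| s.to_string()).collect(),
            found: info.fields.clone(),
        });
    }
    out
}
```

The point of the sketch is that all three inputs (`shape`, `declared_len`, `fields`) describe the wire form, so none of them can be derived from a decoded `serde_json::Value`.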

The plan is roughly:

  1. This PR — layout metadata behind a feature flag (foundation)
  2. Follow-up PR — object key_order + key_folded metadata (same pattern, separate review)
  3. RFC on toon-format/spec — propose the TOON Schema dialect itself, get the keyword set right before shipping
  4. toon-schema crate — reference validator, composes with toon-rust via this feature; lives separately so the core crate stays lean

@shreyasbhat0

Member

Hey @apardawala, thanks for the detailed explanation of your use case. The schema validation direction makes sense as a concept.

Before we move forward with this, there's some important context we want to flag. @johannschopplich closed spec issues toon-format/spec#7 and toon-format/spec#17 with a clear position on keeping schema concerns separate from the core:

"I'd rather keep SPEC.md scoped to one job: how JSON values are serialized as text."

"The right home for this is a companion document."

Your plan mentions filing a spec RFC as step 3, but that hasn't happened yet and there's no community discussion around "TOON Schema" as a concept. This PR adds layout tracking into the core decoder — even behind a feature flag, that's a meaningful addition to the core crate's responsibility, and we want to make sure the direction has buy-in before merging infrastructure for it.

@johannschopplich — would appreciate your thoughts here. Specifically:

  1. Does layout metadata (tabular vs list vs inline, declared lengths, field descriptors) belong in the core decoder behind a feature flag, or should it live in a separate companion crate?
  2. Is a "TOON Schema" dialect something you'd want to see proposed as an RFC on the spec repo?

We've created an issue to track this: #69

Putting this on hold until we have clarity on the direction. The code itself is clean and well-designed — this is purely about whether it's the right home for it.

@johannschopplich

Contributor

@shreyasbhat0 Thanks for flagging, much appreciated. 🙂

Yes, TOON is a transport format at its core, which is why I closed the schema proposals in the past.

Implementations shall follow the official spec – at the same time I'm open to experimentation. The format is not set in stone. If some kind of schema does improve the usefulness of TOON, I'm happy to reconsider.

This repo can be a playground. From a SOC-first perspective, a separate crate would make sense, but that would bloat the package surface. I'm equally open to merging this behind a feature flag with a clear note that the schema feature is an independent development of this package.

@apardawala

Contributor Author

Thanks @johannschopplich — pushed f8e90b2 with the "experimental, independent of spec" note in four places: README cargo features, crate::layout module docs, decode_with_layout function docs, and the Cargo.toml feature comment.

Also drafting an RFC on toon-format/spec to frame the broader schema effort as the companion document from your #7 closing comment. Will link here when filed.

@shreyasbhat0 — does the wording in f8e90b2 address #69, or would you like changes?

@shreyasbhat0

Member

Thanks for the update @apardawala, and for adding the experimental disclaimers — that looks good.

Before we can continue reviewing, this PR has merge conflicts that need to be resolved against main. Could you rebase or merge to get it back to a clean state?

A few code-level notes from an initial pass:

  1. Unrelated import reformatting: parser.rs has several cosmetic import changes (collapsing multi-line imports to single-line) that aren't related to the layout feature. These add noise to the diff. Would you mind reverting those so the diff only contains layout-related changes? (Or split them into a separate formatting commit.)

  2. _segment naming in layout_push — The underscore prefix convention signals "intentionally unused", but the parameter is used when the layout feature is enabled. Consider dropping the underscore and using #[allow(unused_variables)] on the method, or a let _ = segment; fallback in the non-feature path, to be more explicit.

  3. Shared path-expansion logic: decode_with_layout duplicates the path-expansion block from decode(). Could you extract a small shared helper so the two don't drift apart over time?

Will do a deeper review once the conflicts are resolved. The overall design and test coverage look solid.

@apardawala apardawala force-pushed the feat/layout-metadata branch from f8e90b2 to 400a1e7 Compare May 20, 2026 16:55
@apardawala

Contributor Author

Thanks @shreyas-ks — addressed all four points (rebased onto v0.4.6):

  1. Unrelated imports in parser.rs reverted to upstream multi-line form (also caught one stray collapse in a test function). Only the new cfg-gated layout import block remains.
  2. _segment-style param names cleaned up on the layout_* helpers — params now use clean names (segment, length, fields, delimiter) with let _ = ...; in the #[cfg(not(feature = "layout"))] path to silence the unused warning explicitly rather than via prefix convention.
  3. Path-expansion duplication removed — extracted apply_path_expansion(value, options) in src/decode/mod.rs, used by both decode and decode_with_layout.
  4. Rebase: Cargo.toml keeps version = "0.4.6" + both json_stream and layout features; src/lib.rs re-exports merged cleanly.

cargo test, cargo test --features layout, and cargo clippy --all-features --tests -- -D warnings all green.
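
For readers unfamiliar with the pattern in point 2, a minimal sketch of a cfg-gated helper that keeps a clean parameter name and discards it explicitly when the feature is off. The `Parser` struct, its `path` field, and the `depth` method are hypothetical; only the `layout` feature name, the `layout_push` and `segment` names, and the `let _ = ...;` approach come from the thread.

```rust
/// Hypothetical parser state; the real parser's fields are not shown in this PR.
#[derive(Default)]
struct Parser {
    /// Only present when the `layout` feature is compiled in.
    #[cfg(feature = "layout")]
    path: Vec<String>,
}

impl Parser {
    /// Record a path segment when `layout` is enabled; otherwise do nothing.
    fn layout_push(&mut self, segment: &str) {
        #[cfg(feature = "layout")]
        self.path.push(segment.to_string());

        // Explicitly discard the argument rather than naming it `_segment`,
        // so the name stays accurate in the feature-enabled build.
        #[cfg(not(feature = "layout"))]
        let _ = segment;
    }

    /// Number of recorded segments (feature-enabled build).
    #[cfg(feature = "layout")]
    fn depth(&self) -> usize {
        self.path.len()
    }

    /// Nothing is recorded without the feature, so the depth is always zero.
    #[cfg(not(feature = "layout"))]
    fn depth(&self) -> usize {
        0
    }
}
```

Either build compiles without an unused-variable warning, and the signature reads the same in both.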

@shreyasbhat0

Member

@apardawala can you please fix the failing fmt CI check?

@apardawala apardawala force-pushed the feat/layout-metadata branch from 400a1e7 to 8e3c7cc Compare May 21, 2026 17:06
@shreyasbhat0 shreyasbhat0 merged commit 4a2cde8 into toon-format:main May 22, 2026
3 checks passed
@github-actions github-actions Bot mentioned this pull request May 22, 2026
3 participants