✨ feat(benchmarks): add KDL comparison format #301

Draft
ThatXliner wants to merge 1 commit into toon-format:main from ThatXliner:add-kdl-comparison

Conversation

@ThatXliner

Linked Issue

Closes #

Description

KDL serves a different niche (configuration files with comments, type annotations, and node-based nesting) than TOON's LLM-optimized tabular design. Including it in the benchmarks lets readers compare the two formats on token efficiency: TOON is 15% larger than KDL for nested data but 31% smaller for flat tabular structures, consistent with each format's design goals.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Changes Made

Added KDL as a comparison format in the benchmarks.

SPEC Compliance

  • This PR implements/fixes spec compliance
  • Spec section(s) affected:
  • Spec version:

Testing

  • All existing tests pass
  • Added new tests for changes
  • Tests cover edge cases and spec compliance

Pre-submission Checklist

  • My code follows the project's coding standards
  • I have run code formatting/linting tools
  • I have added tests that prove my fix/feature works
  • New and existing tests pass locally
  • I have updated documentation if needed
  • I have reviewed the TOON specification for relevant sections

Breaking Changes

  • No breaking changes
  • Breaking changes (describe migration path below)

Additional Context

This PR is currently fully LLM-generated. I will review this code myself before I mark it ready to review.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ThatXliner

Author

should be fine if the benchmarks were modified a bit, right?

@johannschopplich

Collaborator

This was generated by AI during triage.

No rush since this is a draft. One thing to pin down before you move out of draft: the regenerated token-efficiency.md shows numeric drift across all formats, not just adding KDL. JSON, for example, moves from 109,599 to 109,574 tokens.

That suggests dataset re-seeding or some other non-determinism in the bench, rather than a pure KDL addition. Worth resolving so KDL's numbers are comparable to the existing baselines.
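
One way to pin this down is to diff the per-format token counts between the committed report and the regenerated one. A minimal sketch, assuming the counts have already been extracted into plain objects; the `tokenDrift` helper and the object shape are hypothetical and not part of the benchmark code, and only the JSON figures come from this PR:

```typescript
// Hypothetical helper: list formats whose token count changed between two runs.
type TokenCounts = Record<string, number>;

interface Drift {
  format: string;
  before: number;
  after: number;
  delta: number; // after - before, in tokens
  relative: number; // delta / before
}

function tokenDrift(before: TokenCounts, after: TokenCounts): Drift[] {
  const drifts: Drift[] = [];
  for (const format of Object.keys(before)) {
    const b = before[format];
    const a = after[format];
    // Skip formats that are missing from the new run or that did not change.
    // Formats that only exist in the new run (such as KDL) have no baseline.
    if (a === undefined || a === b) continue;
    drifts.push({ format, before: b, after: a, delta: a - b, relative: (a - b) / b });
  }
  return drifts;
}

// JSON counts quoted above; other formats would be added the same way.
const baseline: TokenCounts = { JSON: 109599 };
const regenerated: TokenCounts = { JSON: 109574 };

for (const d of tokenDrift(baseline, regenerated)) {
  const percent = (d.relative * 100).toFixed(3);
  console.log(`${d.format}: ${d.before} -> ${d.after} (${d.delta} tokens, ${percent}%)`);
}
```

If every pre-existing format shows a non-zero delta, the change is not limited to the KDL addition.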

@ThatXliner

Author

I tried running benchmarks/scripts/token-efficiency-benchmark.ts but I'm getting the same result as the one in this PR...

And when I read the code, there doesn't seem to be anything indicating a change in seed?
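
A quick way to rule randomness in or out is to generate the dataset twice and compare the serialized output. A minimal sketch, assuming a seeded generator; `mulberry32`, `generateDataset`, and the record shape are illustrative stand-ins, not the benchmark's actual code:

```typescript
// Small seeded PRNG (mulberry32): the same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Row {
  id: number;
  score: number;
}

// Hypothetical generator standing in for whatever the benchmark uses.
function generateDataset(seed: number, rows: number): Row[] {
  const rand = mulberry32(seed);
  return Array.from({ length: rows }, (_, i) => ({
    id: i + 1,
    score: Math.floor(rand() * 1000),
  }));
}

// Two runs with the same seed should serialize to identical strings.
const firstRun = JSON.stringify(generateDataset(42, 100));
const secondRun = JSON.stringify(generateDataset(42, 100));

console.log(
  firstRun === secondRun
    ? "deterministic for this seed"
    : "output differs between runs",
);
```

If repeated local runs match each other but not the committed report, inputs other than the seed (for example dependency versions) may be worth comparing.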
