✨ feat(benchmarks): add KDL comparison format #301
Draft
ThatXliner wants to merge 1 commit into
KDL serves a different niche (configuration files with comments, type annotations, node-based nesting) than TOON's LLM-optimized tabular design. Including it in the benchmarks lets readers see how the two formats compare on token efficiency: TOON is 15% larger for nested data but 31% smaller for flat tabular structures, consistent with each format's design goals.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Author
should be fine if the benchmarks were modified a bit, right?
Collaborator
No rush since this is a draft. One thing to pin down before you move out of draft: the regenerated [...] That suggests dataset re-seeding or some other non-determinism in the bench, rather than a pure KDL addition. Worth resolving so KDL's numbers are comparable to the existing baselines.
Author
I tried running benchmarks/scripts/token-efficiency-benchmark.ts, but I'm getting the same result as the one in this PR. And when I read the code, there doesn't seem to be anything indicating a change in seed?
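One way to rule non-determinism in or out is to generate the dataset twice from the same seed and compare the serialized output. The following is a minimal sketch of that check, not code from this repository: `mulberry32`, `generateRows`, `isDeterministic`, and the `Row` shape are all hypothetical stand-ins for whatever the benchmark actually uses to build its datasets.

```typescript
// Hypothetical row shape; the real benchmark datasets may look different.
interface Row {
  id: number;
  score: number;
}

// Small seeded PRNG (the widely used mulberry32 routine), shown here only as
// an assumed example of a deterministic random source.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds `count` rows using only the seeded generator, so the output depends
// solely on `seed` and `count`.
function generateRows(seed: number, count: number): Row[] {
  const rand = mulberry32(seed);
  return Array.from({ length: count }, (_, i) => ({
    id: i + 1,
    score: Math.floor(rand() * 1000),
  }));
}

// Two runs with the same seed should serialize to the same string.
function isDeterministic(seed: number, count: number): boolean {
  const first = JSON.stringify(generateRows(seed, count));
  const second = JSON.stringify(generateRows(seed, count));
  return first === second;
}

const sameSeedMatches = isDeterministic(42, 100);
const rows = generateRows(42, 100);
```

If a check like this passes against the real dataset generator but the baseline numbers still move, the difference more likely comes from somewhere else, such as a changed dependency or tokenizer version.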
Linked Issue
Closes #
Description
KDL serves a different niche (configuration files with comments, type annotations, node-based nesting) than TOON's LLM-optimized tabular design. Including it in the benchmarks lets readers see how the two formats compare on token efficiency: TOON is 15% larger for nested data but 31% smaller for flat tabular structures, consistent with each format's design goals.
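For readers checking the percentages, here is a minimal sketch of how a figure like "15% larger" can be derived from two token counts. It assumes the comparison is TOON measured against KDL as the baseline, which the description does not state explicitly. The helper name `relativeTokenDiff` and the token counts are illustrative only and are not taken from the benchmark output.

```typescript
// Hypothetical helper: signed percentage by which `candidateTokens` differs
// from `baselineTokens`. Positive means the candidate uses more tokens,
// negative means it uses fewer.
function relativeTokenDiff(candidateTokens: number, baselineTokens: number): number {
  if (baselineTokens <= 0) {
    throw new Error("baselineTokens must be positive");
  }
  return ((candidateTokens - baselineTokens) * 100) / baselineTokens;
}

// Made-up token counts chosen to reproduce the two percentages quoted above.
const nestedDiff = relativeTokenDiff(1150, 1000); // 15, i.e. 15% larger
const flatDiff = relativeTokenDiff(690, 1000); // -31, i.e. 31% smaller
```

Note that the result depends on which format is used as the baseline: swapping the arguments gives a different percentage, so the direction of comparison should be stated next to the numbers.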
Type of Change
Changes Made
Added KDL comparison
SPEC Compliance
Testing
Pre-submission Checklist
Breaking Changes
Additional Context
This PR is currently fully LLM-generated. I will review this code myself before I mark it ready to review.