Add usage.jsonl read + backfill to llm-cost-attribution #4
Merged
Implements the consumer side of the Symphony Coding-Agent Cost
Telemetry Extension spec (groove/specs/symphony-cost-telemetry-extension)
so users can:
1. Read cost data from a spec-compliant usage.jsonl source instead of
the raw CLI transcripts.
2. Backfill a usage.jsonl from existing transcripts, after which the
transcripts can be safely deleted.
New library exports:
- computeIssueCostFromUsage(issueId, pathOrDir)
- backfillUsageFromTranscripts({ outFile, onProgress, ... })
- rollupUsageRecords / sessionToUsageRecords
- readUsageRecords / appendUsageRecords / validateUsageRecord
- findUsageFiles / SCHEMA_VERSION
New CLI surface:
llm-cost backfill --out <path>
llm-cost <ISSUE> --from-usage <path-or-dir>
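The `<path-or-dir>` argument can name one file or a directory of `usage*.jsonl` files whose records are concatenated. A minimal sketch of that resolution step follows. It is not the package's code: `selectUsageFiles`, `parseJsonl`, `readAllUsage`, the injected `readText` callback, and the lexicographic file ordering are all assumptions made for illustration.

```javascript
// Hypothetical helper: keep only names matching usage*.jsonl.
// Sorting lexicographically is an assumption; the source does not state an order.
function selectUsageFiles(fileNames) {
  return fileNames.filter((name) => /^usage.*\.jsonl$/.test(name)).sort();
}

// Parse a JSONL blob into records, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Concatenate the records of every matching file.
// `readText(name)` is injected so the sketch stays independent of the filesystem.
function readAllUsage(fileNames, readText) {
  return selectUsageFiles(fileNames).flatMap((name) => parseJsonl(readText(name)));
}
```

In a real reader, `readText` would wrap something like `fs.readFileSync(path, 'utf8')`.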
End-to-end verified on real data (4,309 sessions / 5 GB transcripts):
- Backfill: produces 190,481 spec-compliant records in an 83 MB file
(60x compression vs the source transcripts).
- Read-back: `llm-cost EPAC-1940 --from-usage <backfilled-file>`
returns identical 52,605,306 tokens / 341 turns / 1h53m wall clock
to the transcript-source path, in 0.3 seconds (vs ~3 minutes for
the transcript scan).
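The read-back figures (tokens, turns, wall clock) amount to a rollup over the records of one issue. A rough sketch of such a rollup is below. It assumes one record per turn and treats wall clock as the span from the earliest `startedAt` to the latest `endedAt`; both are assumptions, and `rollupIssue` is a hypothetical name rather than the package's `rollupUsageRecords`.

```javascript
// Hypothetical rollup over spec-style records for a single issue.
// Assumes one record per turn and ISO-8601 startedAt/endedAt timestamps.
function rollupIssue(records, issueIdentifier) {
  const mine = records.filter((r) => r.issueIdentifier === issueIdentifier);
  let totalTokens = 0;
  let earliest = Infinity;
  let latest = -Infinity;
  for (const r of mine) {
    totalTokens += r.totalTokens;
    earliest = Math.min(earliest, Date.parse(r.startedAt));
    latest = Math.max(latest, Date.parse(r.endedAt));
  }
  return {
    totalTokens,
    turns: mine.length,
    // Span in milliseconds; 0 when the issue has no records.
    wallClockMs: mine.length > 0 ? latest - earliest : 0,
  };
}
```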
Fidelity tradeoffs (documented in README):
- Cache-tier split (Claude 5m vs 1h cache creation) is collapsed.
- Reasoning-vs-visible output split (Codex) is collapsed.
- Per-window quota samples (Codex rate_limits) are not in the spec
schema, so they're lost on backfill.
Grand totals, turn ordinals, models, timestamps, runIDs, and
workspacePath provenance are all preserved exactly.
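To illustrate what "collapsed" means for the cache-tier split, here is a small sketch. The field names `cacheCreation5m`, `cacheCreation1h`, and `cacheCreationTokens` are hypothetical stand-ins, not names taken from the transcripts or the spec. The point is only that the two tiers are summed, so the split cannot be recovered afterwards.

```javascript
// Hypothetical field names on both sides; only the summing behaviour is the point.
function collapseCacheTiers(turnUsage) {
  const fiveMinute = turnUsage.cacheCreation5m ?? 0;
  const oneHour = turnUsage.cacheCreation1h ?? 0;
  // The total survives; the per-tier breakdown does not.
  return { cacheCreationTokens: fiveMinute + oneHour };
}
```

Two turns with different tier mixes but the same sum produce identical output, which is why transcripts should only be deleted if the per-tier breakdown is not needed later.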
New tests (16 of them, in two new files):
- test/usage-jsonl.test.mjs — validate + read + write
- test/transcript-to-usage.test.mjs — session → usage record mapping
All 27 package tests pass. CI workflow updated to run the new files.
Summary
Implements the consumer side of the Symphony Coding-Agent Cost Telemetry Extension spec (PR #3) inside the `llm-cost-attribution` package (PR #2). Lets users delete their transcripts after baking the cost-relevant projection into a much smaller `usage.jsonl` file.
Real-world numbers (this machine, 4,309 sessions)
- `llm-cost EPAC-1940` query time: ~3 minutes for the transcript scan, 0.3 seconds with `--from-usage`
Backfill emitted 190,481 spec-compliant records from 4,309 sessions; 1,841 sessions were correctly skipped (ad-hoc CLI work outside any Symphony workspace).
New library API
New CLI surface
`--from-usage` accepts either a single `.jsonl` file or a directory of `usage*.jsonl` files (per spec §4.1's "writers MAY split, readers MUST concatenate" rule).
Fidelity tradeoffs (called out in README)
The spec deliberately drops three things from the raw transcripts. After backfill you lose:
- The cache-tier split (Claude 5m vs 1h cache creation)
- The reasoning-vs-visible output split (Codex)
- `rate_limits.{primary,secondary}.used_percent` quota samples (not in the spec schema)
Grand totals, per-turn ordinals, models, timestamps, runIDs, and `workspacePath` provenance are preserved exactly.
Spec conformance (§5.1 Required fields)
Every backfilled record carries:
`schemaVersion`, `recordedAt`, `runID` (UUID; the CLI session ID), `turn` (1-based monotonic), `issueIdentifier`, `provider`, `model`, `botRole` (always `developer`; spec §5.1 says "Implementations that do not distinguish a reviewer role MUST emit `developer`"), `inputTokens`, `outputTokens`, `totalTokens`, `usageSource: "provider_reported"`, `startedAt`, `endedAt`. Plus the optional `workspacePath`, since we already have it.
Test plan
- `node --test packages/llm-cost-attribution/test/*.test.mjs`: 11 existing + 8 new in `usage-jsonl.test.mjs` + 5 new in `transcript-to-usage.test.mjs`
- [...] `validateUsageRecord`
- `llm-cost EPAC-1940` and `llm-cost EPAC-1940 --from-usage <backfill>` produce identical token totals, turn counts, and wall-clock spans
- `node --check` clean on every .mjs
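For readers who want a feel for the §5.1 required-field check, a minimal sketch follows. It uses only the field names listed under Spec conformance. `checkUsageRecord` is a hypothetical helper and does not reproduce the behaviour of the package's `validateUsageRecord`, whose exact rules are not shown here.

```javascript
// Required fields as listed under "Spec conformance (§5.1 Required fields)".
const REQUIRED_FIELDS = [
  'schemaVersion', 'recordedAt', 'runID', 'turn', 'issueIdentifier',
  'provider', 'model', 'botRole', 'inputTokens', 'outputTokens',
  'totalTokens', 'usageSource', 'startedAt', 'endedAt',
];

// Hypothetical checker: returns a list of problems, empty when the record looks fine.
function checkUsageRecord(record) {
  const problems = REQUIRED_FIELDS
    .filter((field) => record[field] === undefined || record[field] === null)
    .map((field) => `missing ${field}`);
  // `turn` is described as 1-based, so reject non-integers and values below 1.
  if (record.turn != null && (!Number.isInteger(record.turn) || record.turn < 1)) {
    problems.push('turn must be an integer >= 1');
  }
  return problems;
}
```

Returning a list of problems instead of throwing lets a backfill report every bad record in one pass. That is a design choice of this sketch only.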