Draft Symphony Coding-Agent Cost Telemetry Extension spec by riddim-developer-bot[bot] · Pull Request #3 · RiddimSoftware/groove

riddim-developer-bot · 2026-05-27T20:17:35Z

Summary

Draft v0.1 of an open-source extension to the OpenAI Symphony specification. Defines `usage.jsonl` — a persistent, vendor-neutral on-disk format for coding-agent cost telemetry — and nothing else.

Lands at `specs/symphony-cost-telemetry-extension/SPEC.md` following groove's existing `specs/` layout convention.

Framing

Provider transcripts (Claude session JSONL, Codex rollouts, etc.) are roughly 1,650× larger than `usage.jsonl` in our aggregate data and contain mostly conversation content that is irrelevant to cost. `usage.jsonl` is the projection of coding-agent activity that keeps the cost signal and discards the conversation noise — one row per turn, ~1 KB per row, vendor-neutral, joinable to the issue tracker, ships anywhere.

The spec scopes itself tightly:

Defines: record schema, file naming convention, JSONL encoding, schema versioning rule, correlation keys.
Does NOT define: writer behavior (rotation, atomicity, retention, compression), pricing models, transport, query surfaces, other Symphony observability.

This is deliberate. Operational concerns differ by deployment; defining the data is the spec's job, defining how to write it safely is the implementation's.
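The JSONL encoding named above can be sketched as follows. This is a minimal illustration assuming only the general JSON Lines convention (one JSON object per line); the helper names `encodeJsonl` and `decodeJsonl` and the sample values are hypothetical and are not taken from the spec or from any package.

```javascript
// Hypothetical helpers for illustration; names are not from the spec.

// Serialize records as JSONL: one compact JSON object per line,
// each line terminated by a newline.
function encodeJsonl(records) {
  return records.map((record) => JSON.stringify(record) + '\n').join('');
}

// Parse JSONL text back into records, skipping blank lines.
function decodeJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Made-up partial rows, one per turn (real records carry more fields).
const sample = [
  { runID: 'run-a', turn: 1, totalTokens: 1200 },
  { runID: 'run-a', turn: 2, totalTokens: 800 },
];

const text = encodeJsonl(sample);
const roundTripped = decodeJsonl(text);
```

Because each row is self-contained, a file in this shape can be appended to or concatenated without re-parsing what is already on disk, which is consistent with leaving rotation and atomicity to the writer.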

Field summary

Required (13): `schemaVersion`, `recordedAt`, `runID`, `turn`, `issueIdentifier`, `provider`, `model`, `botRole`, `inputTokens`/`outputTokens`/`totalTokens` (with null semantics), `usageSource`, `startedAt`, `endedAt`.

Optional (13): `issueID`, `pullRequest` (with `headSHA`), `mode`, `effort`, `exitReason`, `promptBytes`, `estimatedTokenMethod`, `estimatedPromptInputTokens`, `estimate`, `workspacePath`, `reviewerMode`, `experimentAssignment`, `configuredWeight`/`effectiveWeight`, `cooldownReason`.
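As a rough illustration of the required-field list above, the sketch below checks a record for missing keys. It rests on several assumptions: that "null semantics" means the three token counts may be `null` while the key is still present, that `schemaVersion` is the integer `1`, and that `claude` is a valid `provider` value. `missingRequiredFields` is a hypothetical helper, not the package's `validateUsageRecord`, and every value in the example record is made up.

```javascript
// Required field names as listed in the field summary.
const REQUIRED_FIELDS = [
  'schemaVersion', 'recordedAt', 'runID', 'turn', 'issueIdentifier',
  'provider', 'model', 'botRole',
  'inputTokens', 'outputTokens', 'totalTokens',
  'usageSource', 'startedAt', 'endedAt',
];

// ASSUMPTION: token counts may be null, but the key must still be present.
const NULLABLE_FIELDS = new Set(['inputTokens', 'outputTokens', 'totalTokens']);

// Return the names of required fields that are absent or wrongly null.
function missingRequiredFields(record) {
  return REQUIRED_FIELDS.filter((name) => {
    if (!(name in record)) return true;
    if (record[name] === null) return !NULLABLE_FIELDS.has(name);
    return false;
  });
}

// All values below are invented for illustration only.
const exampleRecord = {
  schemaVersion: 1, // ASSUMPTION: integer version
  recordedAt: '2026-05-27T20:17:35Z',
  runID: '00000000-0000-4000-8000-000000000000',
  turn: 1,
  issueIdentifier: 'EPAC-1940',
  provider: 'claude', // ASSUMPTION: illustrative enum value
  model: 'example-model',
  botRole: 'developer',
  inputTokens: 1000,
  outputTokens: 200,
  totalTokens: 1200,
  usageSource: 'provider_reported',
  startedAt: '2026-05-27T20:17:00Z',
  endedAt: '2026-05-27T20:17:30Z',
};

const missing = missingRequiredFields(exampleRecord);
```

A real validator would also check types and enum membership; this sketch only covers presence.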

Size

455 lines, about 21% of the parent spec's 2,169 — proportional to the narrower scope.

History

This spec briefly landed in RiddimSoftware/software-factory#61 before being re-routed here. A companion PR removes it from software-factory to avoid a forked copy.

Test plan

Read top-to-bottom for tone match with the parent spec.
Sanity-check that every REQUIRED field has a documented cost-attribution use.
Confirm §9 ("Out of Scope") accurately captures what's punted (writer behavior, pricing, transport, query, other Symphony observability).
Confirm the spec doesn't accidentally require autopilot-specific behavior (e.g. ISO-week rotation, atomic rename) — those were intentionally removed.

Draft spec extending the OpenAI Symphony specification with a persistent, vendor-neutral on-disk format for coding-agent cost telemetry. Defines a single stream, `usage.jsonl`, as the projection of coding-agent activity that keeps the cost signal and discards conversation noise.

Key design choices:

- One stream only (`usage.jsonl`); other Symphony observability is explicitly out of scope and left for a separate extension.
- Vendor-neutral schema decoupled from coding-agent transcript formats, so one cost reader works across Claude, Codex, Gemini, etc.
- Storage conventions cover canonical location + JSONL encoding only; writer behavior (rotation, atomicity, retention, compression) is explicitly operator-defined.
- Schema versioning rule: bump on breaking change, not on additive optional fields; readers tolerate unknown fields and enum values.
- 13 required fields + 13 optional fields, every one with a documented cost-attribution use.

455 lines (about 21% of the parent spec's 2,169 lines), proportional to the narrower scope.
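The versioning and tolerance rule in the design choices can be sketched as a reader that refuses records whose schema version it does not understand, while passing unknown fields and unknown enum values through. This is only one possible reading: the integer `schemaVersion`, the supported version `1`, and the provider values are assumptions, and `readTolerantly` is a hypothetical name.

```javascript
// ASSUMPTION: schemaVersion is an integer and this reader understands only 1.
const SUPPORTED_SCHEMA_VERSIONS = new Set([1]);

// ASSUMPTION: illustrative provider values; the spec's actual enum may differ.
const KNOWN_PROVIDERS = new Set(['claude', 'codex', 'gemini']);

function readTolerantly(records) {
  const accepted = [];
  const rejected = [];
  const unknownProviders = new Set();
  for (const record of records) {
    if (!SUPPORTED_SCHEMA_VERSIONS.has(record.schemaVersion)) {
      // A version bump signals a breaking change, so do not guess at meaning.
      rejected.push(record);
      continue;
    }
    // Unknown enum values are noted but never treated as errors.
    if (!KNOWN_PROVIDERS.has(record.provider)) {
      unknownProviders.add(record.provider);
    }
    // Unknown fields ride along untouched because the record is kept as-is.
    accepted.push(record);
  }
  return { accepted, rejected, unknownProviders };
}

// Made-up input: a known row, a row with a new vendor and a new field,
// and a row from a hypothetical future schema version.
const outcome = readTolerantly([
  { schemaVersion: 1, provider: 'claude', totalTokens: 100 },
  { schemaVersion: 1, provider: 'newvendor', totalTokens: 50, futureField: 'kept' },
  { schemaVersion: 2, provider: 'claude', totalTokens: 10 },
]);
```

Keeping unrecognized rows in the accepted set means a cost total stays complete when a writer adds an optional field or a new vendor before the reader is updated.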

## Summary

Implements the consumer side of [the Symphony Coding-Agent Cost Telemetry Extension spec](https://github.com/RiddimSoftware/groove/blob/main/specs/symphony-cost-telemetry-extension/SPEC.md) (PR #3) inside the `llm-cost-attribution` package (PR #2). Lets users delete their transcripts after baking the cost-relevant projection into a much smaller `usage.jsonl` file.

```bash
llm-cost backfill --out ~/llm-cost-history.jsonl            # bake transcripts → spec-compliant usage.jsonl
llm-cost EPAC-1940 --from-usage ~/llm-cost-history.jsonl    # query the bake instead of transcripts
rm -rf ~/.claude/projects ~/.codex/sessions                 # safe: cost data is in the bake now
```

## Real-world numbers (this machine, 4,309 sessions)

| | Before backfill | After backfill |
|---|---:|---:|
| Disk footprint | 5.0 GB | **83 MB** (60× smaller) |
| `llm-cost EPAC-1940` query time | ~3 min (full Codex scan) | **~0.3 s** (~600× faster) |
| EPAC-1940 token total | 52,605,306 | **52,605,306** ✓ |
| EPAC-1940 turn count | 341 | **341** ✓ |
| EPAC-1940 wall clock | 1h 53m 40s | **1h 53m 40s** ✓ |

Backfill emitted 190,481 spec-compliant records from 4,309 sessions; 1,841 sessions were correctly skipped (ad-hoc CLI work outside any Symphony workspace).

## New library API

```js
import {
  computeIssueCostFromUsage,
  backfillUsageFromTranscripts,
  readUsageRecords,
  appendUsageRecords,
  validateUsageRecord,
  sessionToUsageRecords,
  rollupUsageRecords,
  SCHEMA_VERSION,
} from 'llm-cost-attribution';
```

## New CLI surface

```
llm-cost backfill --out <path>            # transcripts → spec-compliant usage.jsonl
llm-cost <ISSUE> --from-usage <path>      # read from usage.jsonl/dir instead of transcripts
```

`--from-usage` accepts either a single `.jsonl` file or a directory of `usage*.jsonl` files (per spec §4.1's "writers MAY split, readers MUST concatenate" rule).

## Fidelity tradeoffs (called out in README)

The spec deliberately drops three things from the raw transcripts. After backfill you lose:

- Claude cache-tier split (5m vs 1h cache creation tokens): collapsed into the input total
- Codex reasoning-vs-visible output split: collapsed into the output total
- Codex `rate_limits.{primary,secondary}.used_percent` quota samples: not in the spec schema

Grand totals, per-turn ordinals, models, timestamps, runIDs, and `workspacePath` provenance are preserved exactly.

## Spec conformance (§5.1 Required fields)

Every backfilled record carries: `schemaVersion`, `recordedAt`, `runID` (UUID; the CLI session ID), `turn` (1-based monotonic), `issueIdentifier`, `provider`, `model`, `botRole` (always `developer`; spec §5.1 says "Implementations that do not distinguish a reviewer role MUST emit `developer`"), `inputTokens`, `outputTokens`, `totalTokens`, `usageSource: "provider_reported"`, `startedAt`, `endedAt`. Plus the optional `workspacePath` since we already have it.

## Test plan

- [x] All 27 package tests pass (`node --test packages/llm-cost-attribution/test/*.test.mjs`): 11 existing + 8 new in `usage-jsonl.test.mjs` + 5 new in `transcript-to-usage.test.mjs`
- [x] Every backfilled record produced from real transcripts passes `validateUsageRecord`
- [x] `llm-cost EPAC-1940` and `llm-cost EPAC-1940 --from-usage <backfill>` produce identical token totals, turn counts, and wall-clock spans
- [x] `node --check` clean on every .mjs
- [x] CI workflow updated to run the new test files

Co-authored-by: Sunny Purewal <sunny@riddimsoftware.com>
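The per-issue comparison of token totals, turn counts, and wall clock could be computed from usage records roughly as sketched below. This is not the package's `computeIssueCostFromUsage` or `rollupUsageRecords`, whose signatures are not shown here; `rollupByIssue` is a hypothetical helper, and it assumes that `null` token counts contribute zero and that wall clock is the sum of per-turn `endedAt - startedAt` durations.

```javascript
// Hypothetical rollup over already-parsed usage records.
function rollupByIssue(records, issueIdentifier) {
  const rows = records.filter((r) => r.issueIdentifier === issueIdentifier);
  let totalTokens = 0;
  let wallClockMs = 0;
  for (const row of rows) {
    // ASSUMPTION: a null token count contributes zero to the total.
    totalTokens += row.totalTokens ?? 0;
    // ASSUMPTION: wall clock is the sum of per-turn durations.
    wallClockMs += Date.parse(row.endedAt) - Date.parse(row.startedAt);
  }
  return { turns: rows.length, totalTokens, wallClockMs };
}

// Made-up rows: two turns for one issue, one turn for another.
const usageRows = [
  { issueIdentifier: 'EPAC-1940', turn: 1, totalTokens: 1200,
    startedAt: '2026-05-27T10:00:00Z', endedAt: '2026-05-27T10:00:30Z' },
  { issueIdentifier: 'EPAC-1940', turn: 2, totalTokens: null,
    startedAt: '2026-05-27T10:01:00Z', endedAt: '2026-05-27T10:02:00Z' },
  { issueIdentifier: 'OTHER-1', turn: 1, totalTokens: 999,
    startedAt: '2026-05-27T11:00:00Z', endedAt: '2026-05-27T11:00:10Z' },
];

const summary = rollupByIssue(usageRows, 'EPAC-1940');
```

Running the same rollup over records derived from transcripts and over records read from the baked file is one way to check that the two paths agree, as the test plan above requires.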

github-actions Bot enabled auto-merge (squash) May 27, 2026 20:17

github-actions Bot merged commit 3341f24 into main May 27, 2026
2 checks passed

riddim-developer-bot Bot mentioned this pull request May 27, 2026

Add usage.jsonl read + backfill to llm-cost-attribution #4

Merged

5 tasks

github-actions[bot] merged 1 commit into `main` from `sunny/symphony-telemetry-spec`
