Draft Symphony Coding-Agent Cost Telemetry Extension spec#3

Merged
github-actions[bot] merged 1 commit into main from sunny/symphony-telemetry-spec
May 27, 2026

Conversation

@riddim-developer-bot
Contributor

Summary

Draft v0.1 of an open-source extension to the OpenAI Symphony specification. Defines `usage.jsonl` — a persistent, vendor-neutral on-disk format for coding-agent cost telemetry — and nothing else.

Lands at `specs/symphony-cost-telemetry-extension/SPEC.md` following groove's existing `specs/` layout convention.

Framing

Provider transcripts (Claude session JSONL, Codex rollouts, etc.) are roughly 1,650× larger than `usage.jsonl` in our aggregate data and contain mostly conversation content that is irrelevant to cost. `usage.jsonl` is the projection of coding-agent activity that keeps the cost signal and discards the conversation noise — one row per turn, ~1 KB per row, vendor-neutral, joinable to the issue tracker, ships anywhere.

The spec scopes itself tightly:

  • Defines: record schema, file naming convention, JSONL encoding, schema versioning rule, correlation keys.
  • Does NOT define: writer behavior (rotation, atomicity, retention, compression), pricing models, transport, query surfaces, other Symphony observability.

This is deliberate. Operational concerns differ by deployment; defining the data is the spec's job, defining how to write it safely is the implementation's.

Field summary

Required (13): `schemaVersion`, `recordedAt`, `runID`, `turn`, `issueIdentifier`, `provider`, `model`, `botRole`, `inputTokens`/`outputTokens`/`totalTokens` (with null semantics), `usageSource`, `startedAt`, `endedAt`.

Optional (13): `issueID`, `pullRequest` (with `headSHA`), `mode`, `effort`, `exitReason`, `promptBytes`, `estimatedTokenMethod`, `estimatedPromptInputTokens`, `estimate`, `workspacePath`, `reviewerMode`, `experimentAssignment`, `configuredWeight`/`effectiveWeight`, `cooldownReason`.
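
For orientation, here is a rough sketch of what one row might look like, using the field names listed above. Every value is made up for illustration, and the value types (an integer `schemaVersion`, ISO 8601 timestamps, the `claude` provider string, the placeholder model name) are assumptions rather than anything the spec text in this PR confirms. `toJsonlLine` is a hypothetical helper, not part of any package.

```javascript
// Illustrative record only: all values are hypothetical.
const exampleRecord = {
  schemaVersion: 1,                       // assumed to be an integer
  recordedAt: '2026-05-27T20:17:05Z',     // assumed ISO 8601
  runID: '0b6f4c1e-5a3d-4e2b-9c7a-1f2e3d4c5b6a',
  turn: 1,
  issueIdentifier: 'EPAC-1940',
  provider: 'claude',                     // assumed enum value
  model: 'example-model',                 // placeholder, not a real model name
  botRole: 'developer',
  inputTokens: 1200,
  outputTokens: 300,
  totalTokens: 1500,
  usageSource: 'provider_reported',
  startedAt: '2026-05-27T20:16:00Z',
  endedAt: '2026-05-27T20:17:00Z',
  workspacePath: '/tmp/workspaces/EPAC-1940', // optional field
};

// JSONL encoding: one JSON object per line, terminated by a newline.
// JSON.stringify without an indent argument never emits raw newlines.
function toJsonlLine(record) {
  return JSON.stringify(record) + '\n';
}
```

A writer would append `toJsonlLine(exampleRecord)` to `usage.jsonl`; how it does so safely (rotation, atomicity) is left to the implementation, as noted above.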

Size

455 lines, about 21% of the parent spec's 2,169 — proportional to the narrower scope.

History

This spec briefly landed in RiddimSoftware/software-factory#61 before being re-routed here. A companion PR removes it from software-factory to avoid a forked copy.

Test plan

  • Read top-to-bottom for tone match with the parent spec.
  • Sanity-check that every REQUIRED field has a documented cost-attribution use.
  • Confirm §9 ("Out of Scope") accurately captures what's punted (writer behavior, pricing, transport, query, other Symphony observability).
  • Confirm the spec doesn't accidentally require autopilot-specific behavior (e.g. ISO-week rotation, atomic rename) — those were intentionally removed.

Draft spec extending the OpenAI Symphony specification with a
persistent, vendor-neutral on-disk format for coding-agent cost
telemetry. Defines a single stream — usage.jsonl — as the projection
of coding-agent activity that keeps the cost signal and discards
conversation noise.

Key design choices:
- One stream only (usage.jsonl); other Symphony observability is
  explicitly out of scope and left for a separate extension.
- Vendor-neutral schema decoupled from coding-agent transcript formats,
  so one cost reader works across Claude, Codex, Gemini, etc.
- Storage conventions cover canonical location + JSONL encoding only;
  writer behavior (rotation, atomicity, retention, compression) is
  explicitly operator-defined.
- Schema versioning rule: bump on breaking change, not on additive
  optional fields; readers tolerate unknown fields and enum values.
- 13 required fields + 13 optional fields, every one with a documented
  cost-attribution use.
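
A minimal sketch of how a reader might follow the versioning rule above: refuse a schema version it does not know, but pass unknown fields through and flag, rather than reject, unknown enum values. The supported version set, the `reviewer` role string, and the function name `parseUsageLine` are assumptions made for illustration.

```javascript
// Assumption: this reader understands exactly these schema versions.
const SUPPORTED_VERSIONS = new Set([1]);
// Assumption: the botRole values this reader knows about.
const KNOWN_BOT_ROLES = new Set(['developer', 'reviewer']);

function parseUsageLine(line) {
  const raw = JSON.parse(line);
  if (!SUPPORTED_VERSIONS.has(raw.schemaVersion)) {
    // A version bump signals a breaking change, so refuse rather than misread.
    return { ok: false, reason: `unsupported schemaVersion: ${raw.schemaVersion}` };
  }
  // Unknown enum values are kept verbatim and flagged, not rejected.
  const unknownBotRole = !KNOWN_BOT_ROLES.has(raw.botRole);
  // Unknown fields are carried along untouched; no key allow-list is applied.
  return { ok: true, record: raw, unknownBotRole };
}
```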

455 lines (about 21% of the parent spec's 2,169 lines), proportional
to the narrower scope.
github-actions[bot] enabled auto-merge (squash) on May 27, 2026 at 20:17
github-actions[bot] merged commit 3341f24 into main on May 27, 2026
2 checks passed
github-actions[bot] pushed a commit that referenced this pull request on May 27, 2026
## Summary
Implements the consumer side of [the Symphony Coding-Agent Cost
Telemetry Extension
spec](https://github.com/RiddimSoftware/groove/blob/main/specs/symphony-cost-telemetry-extension/SPEC.md)
(PR #3) inside the `llm-cost-attribution` package (PR #2). Lets users
delete their transcripts after baking the cost-relevant projection into
a much smaller `usage.jsonl` file.

```bash
llm-cost backfill --out ~/llm-cost-history.jsonl   # bake transcripts → spec-compliant usage.jsonl
llm-cost EPAC-1940 --from-usage ~/llm-cost-history.jsonl   # query the bake instead of transcripts
rm -rf ~/.claude/projects ~/.codex/sessions   # safe — cost data is in the bake now
```

## Real-world numbers (this machine, 4,309 sessions)

| | Before backfill | After backfill |
|---|---:|---:|
| Disk footprint | 5.0 GB | **83 MB** (60× smaller) |
| `llm-cost EPAC-1940` query time | ~3 min (full Codex scan) | **~0.3 s** (~600× faster) |
| EPAC-1940 token total | 52,605,306 | **52,605,306** ✓ |
| EPAC-1940 turn count | 341 | **341** ✓ |
| EPAC-1940 wall clock | 1h 53m 40s | **1h 53m 40s** ✓ |

Backfill emitted 190,481 spec-compliant records from 4,309 sessions;
1,841 sessions were correctly skipped (ad-hoc CLI work outside any
Symphony workspace).
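
As a rough illustration of how the three verified quantities (token total, turn count, wall clock) could be derived from usage records, here is a sketch. It is not the package's `rollupUsageRecords`; the function names are hypothetical, and two behaviours are assumptions: wall clock is taken as the sum of per-turn `endedAt - startedAt`, and a null `totalTokens` contributes zero.

```javascript
// Hypothetical roll-up over already-parsed usage records for one issue.
function rollupIssue(records, issueIdentifier) {
  let totalTokens = 0;
  let turns = 0;
  let wallClockMs = 0;
  for (const rec of records) {
    if (rec.issueIdentifier !== issueIdentifier) continue;
    turns += 1;
    totalTokens += rec.totalTokens ?? 0; // assumption: null counts as zero
    // Assumption: wall clock is the sum of per-turn durations.
    wallClockMs += Date.parse(rec.endedAt) - Date.parse(rec.startedAt);
  }
  return { totalTokens, turns, wallClockMs };
}

// Render milliseconds in the "1h 53m 40s" style used in the table.
function formatDuration(ms) {
  const s = Math.round(ms / 1000);
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  return `${h}h ${m}m ${s % 60}s`;
}
```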

## New library API

```js
import {
  computeIssueCostFromUsage,
  backfillUsageFromTranscripts,
  readUsageRecords,
  appendUsageRecords,
  validateUsageRecord,
  sessionToUsageRecords,
  rollupUsageRecords,
  SCHEMA_VERSION,
} from 'llm-cost-attribution';
```

## New CLI surface

```
llm-cost backfill --out <path>            # transcripts → spec-compliant usage.jsonl
llm-cost <ISSUE> --from-usage <path>      # read from usage.jsonl/dir instead of transcripts
```

`--from-usage` accepts either a single `.jsonl` file or a directory of
`usage*.jsonl` files (per spec §4.1's "writers MAY split, readers MUST
concatenate" rule).
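
The concatenation rule can be sketched roughly as follows. To keep the example free of filesystem calls, it takes a `Map` of file name to file contents; a real reader would build that from the directory listing. The function names and the lexicographic ordering of files are assumptions, not something the spec excerpt here prescribes.

```javascript
// Keep names matching usage*.jsonl, in a stable lexicographic order.
function selectUsageFiles(names) {
  return names
    .filter((n) => n.startsWith('usage') && n.endsWith('.jsonl'))
    .sort();
}

// Parse one file's text into records, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// files: Map of file name -> file contents (hypothetical input shape).
// Returns all records from all matching files as one flat array.
function concatUsageRecords(files) {
  return selectUsageFiles([...files.keys()])
    .flatMap((name) => parseJsonl(files.get(name)));
}
```

For a single `.jsonl` file, `parseJsonl` alone is enough; the directory case only adds file selection and flattening.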

## Fidelity tradeoffs (called out in README)

The spec deliberately drops three things from the raw transcripts. After
backfill you lose:

- Claude cache-tier split (5m vs 1h cache creation tokens) — collapsed
into the input total
- Codex reasoning-vs-visible output split — collapsed into the output
total
- Codex `rate_limits.{primary,secondary}.used_percent` quota samples —
not in the spec schema

Grand totals, per-turn ordinals, models, timestamps, runIDs, and
`workspacePath` provenance are preserved exactly.
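
To make the first two losses concrete, here is a sketch of the kind of collapse involved. The input property names are hypothetical normalized names chosen for this example, not the providers' own field names, and the sketch assumes the tiers are disjoint counts that can simply be added.

```javascript
// Hypothetical: fold cache-creation tiers into a single input total.
function collapseInputTokens({ uncachedInput = 0, cacheCreation5m = 0, cacheCreation1h = 0 }) {
  return uncachedInput + cacheCreation5m + cacheCreation1h;
}

// Hypothetical: fold reasoning and visible output into a single output total.
// Assumes the two counts do not overlap.
function collapseOutputTokens({ visibleOutput = 0, reasoningOutput = 0 }) {
  return visibleOutput + reasoningOutput;
}
```

Once only the sums are stored, the individual tiers cannot be recovered from `usage.jsonl`, which is why any pricing that depends on the split has to be computed before the transcripts are deleted.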

## Spec conformance (§5.1 Required fields)

Every backfilled record carries: `schemaVersion`, `recordedAt`, `runID`
(UUID; the CLI session ID), `turn` (1-based monotonic),
`issueIdentifier`, `provider`, `model`, `botRole` (always `developer` —
spec §5.1 says "Implementations that do not distinguish a reviewer role
MUST emit `developer`"), `inputTokens`, `outputTokens`, `totalTokens`,
`usageSource: "provider_reported"`, `startedAt`, `endedAt`. Plus the
optional `workspacePath` since we already have it.
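
A presence check for the required fields might look roughly like this. It is a sketch, not the package's `validateUsageRecord`; `missingRequiredFields` is a hypothetical name, and it assumes the token fields must be present as keys but may hold `null`, while every other required field must be non-null.

```javascript
const REQUIRED_FIELDS = [
  'schemaVersion', 'recordedAt', 'runID', 'turn', 'issueIdentifier',
  'provider', 'model', 'botRole', 'inputTokens', 'outputTokens',
  'totalTokens', 'usageSource', 'startedAt', 'endedAt',
];

// Assumption: only the token counts are allowed to be null.
const NULLABLE_TOKEN_FIELDS = new Set(['inputTokens', 'outputTokens', 'totalTokens']);

// Returns the required fields that are absent or invalidly null.
function missingRequiredFields(record) {
  return REQUIRED_FIELDS.filter((field) => {
    if (!(field in record)) return true;
    const value = record[field];
    if (value === null) return !NULLABLE_TOKEN_FIELDS.has(field);
    return value === undefined;
  });
}
```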

## Test plan
- [x] All 27 package tests pass (`node --test
packages/llm-cost-attribution/test/*.test.mjs`) — 11 existing + 8 new in
`usage-jsonl.test.mjs` + 5 new in `transcript-to-usage.test.mjs`
- [x] Every backfilled record produced from real transcripts passes
`validateUsageRecord`
- [x] `llm-cost EPAC-1940` and `llm-cost EPAC-1940 --from-usage
<backfill>` produce identical token totals, turn counts, and wall-clock
spans
- [x] `node --check` clean on every .mjs
- [x] CI workflow updated to run the new test files

Co-authored-by: Sunny Purewal <sunny@riddimsoftware.com>