Add API-equivalent pricing to llm-cost CLI #8

Merged
github-actions[bot] merged 1 commit into main from sunny/llm-cost-pricing
May 28, 2026
@riddim-developer-bot
Contributor

Summary

The OSS `llm-cost` CLI now shows dollar-equivalent cost per bucket alongside the raw token counts. Rates come from a built-in table sourced from anthropic.com/pricing and platform.openai.com/docs/pricing, verified 2026-05-22.

What `./cost EPAC-1940` shows now

```
CODEX (1 session)
Models: gpt-5.5
Turns: 341
First → last: 2026-05-18T19:56:54.104Z → 2026-05-18T21:50:34.098Z
Wall clock span: 1h 53m 40s

Tokens:
input uncached 1,517,206
cache read 51,024,768
output (visible) 44,683
output (reasoning) 18,649
─────────────────────────────────
grand total 52,605,306

API-equivalent pricing (gpt-5.5 @ rates verified 2026-05-22):
input uncached $7.59 (1.5M × $5.00/1M)
cache read $25.51 (51.0M × $0.500/1M)
output (visible) $1.34 (44.7K × $30.00/1M)
output (reasoning) $0.56 (18.6K × $30.00/1M)
───────────────────────────────────────────
total API cost $35.00 [hypothetical — your Codex Pro plan covers this]

Quota (plan_type=pro, 341 samples):
5h primary 58% → 64% used (peak 64%, this issue moved +6.0 pp)
7d secondary 56% → 57% used (peak 57%, this issue moved +1.0 pp)
```

EPAC-1940 prices out to a hypothetical $35.00 API equivalent, but it actually moved the user's Codex Pro 7-day quota by only +1.0 pp (about 1/100th of the weekly allowance). The CLI now reports both readouts in a single run.
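The per-bucket arithmetic behind the $35.00 figure can be reproduced by hand. The following is a minimal sketch, not the package's actual `calculateCost`: the function name `sketchCost` and the bucket keys are assumptions made for illustration, while the token counts and per-1M rates are copied from the gpt-5.5 output above.

```javascript
// Hypothetical sketch; not the package's actual calculateCost.
// Rates are USD per 1M tokens, copied from the gpt-5.5 output above.
const GPT_5_5_RATES = {
  inputUncached: 5.0,
  cacheRead: 0.5,
  output: 30.0, // visible and reasoning output share one rate in the output above
};

// Bucket key names are illustrative assumptions.
function sketchCost(buckets, rates) {
  const usd = (tokens, ratePerMillion) => (tokens * ratePerMillion) / 1_000_000;
  const lines = {
    inputUncached: usd(buckets.inputUncached, rates.inputUncached),
    cacheRead: usd(buckets.cacheRead, rates.cacheRead),
    outputVisible: usd(buckets.outputVisible, rates.output),
    outputReasoning: usd(buckets.outputReasoning, rates.output),
  };
  // Sum the unrounded per-bucket amounts; round only for display.
  const total = Object.values(lines).reduce((sum, amount) => sum + amount, 0);
  return { lines, total };
}

const epac1940 = sketchCost(
  {
    inputUncached: 1_517_206,
    cacheRead: 51_024_768,
    outputVisible: 44_683,
    outputReasoning: 18_649,
  },
  GPT_5_5_RATES,
);

console.log(epac1940.lines.cacheRead.toFixed(2)); // 25.51
console.log(epac1940.total.toFixed(2)); // 35.00
```

Cache reads dominate the total (about $25.51 of $35.00) even at one tenth of the uncached input rate, because they account for roughly 51M of the 52.6M tokens.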

Honest framing

The README and the in-CLI `[hypothetical — ...]` suffix make clear that the dollar number is a counterfactual: what the token volume would have cost on pay-as-you-go API pricing, not actual spend on a subscription plan. The Codex quota readout is closer to the real marginal cost on Codex Pro / Codex Business; the dollar total is for cross-plan comparison and for external audiences who think in dollars.

Implementation

New files:

  • `src/pricing-rates.mjs` — the rate table (Anthropic Opus/Sonnet/Haiku across versions, OpenAI gpt-5.5/5.4/5.4-mini/5.4-nano/5.3-codex). Each entry carries `verifiedOn` and `sourceUrl`.
  • `src/pricing.mjs` — `normalizeModelName`, `ratesForModel`, `calculateCost`, `hypotheticalNoteFor`, freshness helpers.
  • `test/pricing.test.mjs` — 18 tests, including a pin on the EPAC-1940 $35.00 total so any rate change or calc bug is caught loudly.

CLI additions:

  • Pricing block rendered by default per provider.
  • `--no-pricing` flag to suppress.
  • `--json` output gains a `pricing` field per provider.
  • Quota line now reports the per-issue `+X.X pp` delta on each window.
  • Warns when the bundled table is more than `STALE_AFTER_DAYS` (90) days old.
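The staleness warning in the last bullet amounts to a date difference compared against a threshold. A minimal sketch follows, assuming the `verifiedOn` field is an ISO date string; the helper names `daysSince` and `isStale` are hypothetical stand-ins, not the package's `daysSincePricingVerified` and `isPricingStale`.

```javascript
// Hypothetical sketch of a freshness check; the real helpers may differ.
const STALE_AFTER_DAYS = 90; // threshold stated in this PR
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// verifiedOn is assumed to be an ISO date string such as "2026-05-22".
// Date.parse treats a date-only ISO string as UTC midnight.
function daysSince(verifiedOn, now = new Date()) {
  return Math.floor((now.getTime() - Date.parse(verifiedOn)) / MS_PER_DAY);
}

// "More than 90 days old" means strictly greater than the threshold.
function isStale(verifiedOn, now = new Date()) {
  return daysSince(verifiedOn, now) > STALE_AFTER_DAYS;
}

console.log(daysSince('2026-05-22', new Date('2026-05-28T00:00:00Z'))); // 6
console.log(isStale('2026-05-22', new Date('2026-08-20T00:00:00Z'))); // false (exactly 90 days)
console.log(isStale('2026-05-22', new Date('2026-08-21T00:00:00Z'))); // true (91 days)
```

Under these assumptions, a table verified on 2026-05-22 would start warning on 2026-08-21.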

Test plan

  • `node --check` clean on every new .mjs
  • 51 of 51 tests pass via `node --test` (was 33; +18 pricing tests)
  • `node packages/llm-cost-attribution/bin/llm-cost.mjs EPAC-1940 --from-usage ` produces the exact output shown above
  • `--no-pricing` suppresses the block; `--json` includes pricing data when present
  • EPAC-1940 $35.00 pin: a unit test reproduces the exact total from the bake's actual token counts, so any rate-table or calc regression will fail loudly

Follow-ups (not in this PR)

  • Software-factory's `src/cost-attribution/` still contains the old pricing layer that nothing uses. Now that the OSS package owns this functionality, that code is dead and can be deleted in a follow-up PR.
  • The bundle's `./cost` wrapper will automatically pick up the new output once a fresh bundle is built from this branch.

Ports the rate table that lived in software-factory's cost-attribution
bounded context into the OSS package, plus the calculation logic to
turn token-bucket totals into USD per provider, per issue.

New library exports:

  - PRICING_TABLE                  the rate data (Anthropic + OpenAI)
  - calculateCost(model, buckets)  per-bucket cost breakdown + total
  - ratesForModel(model)           lookup with normalization
  - normalizeModelName(model)      strip `-YYYYMMDD`, `-latest`, dash-to-dot
  - hypotheticalNoteFor(prov, plan) UI text helper
  - daysSincePricingVerified()     freshness check
  - isPricingStale()               true after STALE_AFTER_DAYS (90)
  - STALE_AFTER_DAYS
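
The three normalization steps listed for `normalizeModelName` can be sketched with plain regular expressions. This is an illustration only: the function name is hypothetical, the example model IDs are inputs chosen for the demo, and the dash-to-dot rule (applied only between two digits) is an assumption, since the commit message does not spell out the exact rule.

```javascript
// Hypothetical sketch; the package's normalizeModelName may handle more cases.
function sketchNormalizeModelName(model) {
  return model
    .replace(/-\d{8}$/, '') // strip a trailing -YYYYMMDD snapshot date
    .replace(/-latest$/, '') // strip a trailing -latest alias
    .replace(/(\d)-(?=\d)/g, '$1.'); // assumed rule: dash between two digits becomes a dot
}

console.log(sketchNormalizeModelName('claude-sonnet-4-5-20250929')); // claude-sonnet-4.5
console.log(sketchNormalizeModelName('claude-3-5-haiku-latest')); // claude-3.5-haiku
console.log(sketchNormalizeModelName('gpt-5.5')); // gpt-5.5
```

Normalizing before lookup lets one rate-table entry serve dated snapshots and aliases of the same model, which is presumably why `ratesForModel` is described as a lookup with normalization.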

CLI changes:

  - Default output now includes an "API-equivalent pricing" block
    per provider, with per-bucket math:
      input uncached        $7.59    (1.5M × $5.00/1M)
      cache read           $25.51    (51.0M × $0.500/1M)
      output (visible)      $1.34    (44.7K × $30.00/1M)
      output (reasoning)    $0.56    (18.6K × $30.00/1M)
      total API cost       $35.00    [hypothetical — your Codex Pro plan covers this]
    The "[hypothetical — ...]" suffix is plan-aware (uses the planType
    from the Codex quota sample when present).
  - The Quota line now reports the per-issue delta:
      5h  primary  58% → 64% used  (peak 64%, this issue moved +6.0 pp)
  - --no-pricing flag suppresses the pricing block.
  - --json output gains a `pricing` field per provider when rates known.
  - Warns when the rate table is more than 90 days old.
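
The per-issue quota delta in the Quota line is consistent with subtracting the first sample's used percentage from the last. A minimal sketch under that assumption follows; the function name and the intermediate sample value are hypothetical, and the real CLI may derive the figures differently.

```javascript
// Hypothetical sketch: summarize one quota window from time-ordered samples.
// Each entry is the window's used percentage at one point in the session.
function sketchQuotaDelta(usedPercents) {
  if (usedPercents.length === 0) return null;
  const first = usedPercents[0];
  const last = usedPercents[usedPercents.length - 1];
  const peak = Math.max(...usedPercents);
  const deltaPp = last - first; // percentage points moved during this issue
  const sign = deltaPp >= 0 ? '+' : '';
  return { first, last, peak, deltaPp, label: `${sign}${deltaPp.toFixed(1)} pp` };
}

// 58 and 64 are the 5h endpoints shown above; 60 is an invented middle sample.
const fiveHour = sketchQuotaDelta([58, 60, 64]);
console.log(fiveHour.peak); // 64
console.log(fiveHour.label); // +6.0 pp
```

Reporting the delta in percentage points rather than as a relative change keeps the figure directly comparable across issues that start at different quota levels.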

Honest framing in the README: this is a *counterfactual* (what the
token volume would have cost on pay-as-you-go API), not actual spend
on a subscription plan. The Codex quota readout is closer to your
real marginal cost; the dollar number is for cross-plan comparison.

Tested against real data — the EPAC-1940 Codex run on this machine
prices out to $35.00 (1.5M uncached + 51M cached + 44.7K visible +
18.6K reasoning), which matches the hand-computed value. A test in
pricing.test.mjs pins that exact total to catch any future regression.

51 of 51 package tests pass (was 33; added 18 pricing tests).
github-actions[bot] enabled auto-merge (squash) May 28, 2026 01:34
github-actions[bot] merged commit 1c792d0 into main May 28, 2026
2 checks passed