From 3803451ceb1c9a7cb33f10f69fd501261784e96e Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Sat, 2 May 2026 18:03:48 +0000
Subject: [PATCH] docs: document pricing diagnostics, metrics endpoint, and
 DiffOutcome per-1k token fields

- operations-and-policy.md: expand the pricing/model change detection
  section to document the six new DiffOutcome per-1k token rate fields,
  the CLI 'Per-1k token prices' output line (condition: all four
  input/output rates present), the pricing.prices HTTP response object
  (always present, null for unset rates), and the web UI banner behaviour.
- sdk.md: add a 'GET /v1/metrics (no SDK wrapper)' subsection showing how
  to call the endpoint directly via httpx, with example response and a
  link to the HTTP API reference.
- web-ui.md: update the DiffPage pricing change warning bullet to mention
  the new per-1k input/output price delta line in the fd-alert--warn
  banner (rendered when all four rates are present; toFixed(6) formatting).

Co-authored-by: Gottam Sai Bharath
---
 docs/operations-and-policy.md | 55 +++++++++++++++++++++++++++++++----
 docs/sdk.md                   | 26 +++++++++++++++++
 docs/web-ui.md                |  6 +++-
 3 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/docs/operations-and-policy.md b/docs/operations-and-policy.md
index db3689e..eda5583 100644
--- a/docs/operations-and-policy.md
+++ b/docs/operations-and-policy.md
@@ -154,16 +154,61 @@ following differ between baseline and candidate:
 - `spec.pricing_reference.pricing_version` (e.g. `"openai-2026-04-30"` vs. a newer table)
 - `spec.runtime.model` (e.g. `"gpt-4.1-mini"` vs. `"gpt-4.1"`)
 
-When this flag is `True`, the CLI prints a note:
+`DiffOutcome` also carries the resolved per-1k token rates for each side directly:
+
+| Field | Description |
+|-------|-------------|
+| `baseline_input_usd_per_1k_tokens` | Input rate from the baseline pricing table entry (or `None` when not found) |
+| `baseline_output_usd_per_1k_tokens` | Output rate from the baseline pricing table entry (or `None`) |
+| `baseline_cached_input_usd_per_1k_tokens` | Cached-input rate for baseline (or `None` when not set in the table) |
+| `candidate_input_usd_per_1k_tokens` | Input rate from the candidate pricing table entry (or `None`) |
+| `candidate_output_usd_per_1k_tokens` | Output rate from the candidate pricing table entry (or `None`) |
+| `candidate_cached_input_usd_per_1k_tokens` | Cached-input rate for candidate (or `None`) |
+
+These fields are populated by `pricing_entry_for(table, model)` in `flightdeck.ledger` after
+`diff_releases` returns and before the `DiffOutcome` is constructed.
+
+**CLI output** — when `pricing_or_model_changed` is `True`, the CLI prints:
 
 ```
 NOTE: cost delta includes pricing/model assumption changes (pricing reference and/or model differ).
+Per-1k token prices: input 0.005000 -> 0.004500, output 0.015000 -> 0.013500
 ```
 
-The HTTP API's `/v1/diff` response includes `pricing.pricing_or_model_changed: true` in the
-`pricing` block, and the web UI's `DiffPage` shows an `fd-alert--warn` banner. This is an
-informational signal — the diff still computes and the policy still evaluates; cost deltas may
-reflect pricing assumption changes in addition to actual usage changes.
+The **Per-1k token prices** line is only printed when both input and output rates are present
+for both sides. If any rate is `None`, that line is omitted.
+
+**HTTP API** — `/v1/diff` includes a `pricing.prices` object alongside the existing
+`pricing_or_model_changed` flag:
+
+```json
+"pricing": {
+  "baseline_provider": "openai",
+  "baseline_version": "2024-02",
+  "baseline_model": "gpt-4o",
+  "candidate_provider": "openai",
+  "candidate_version": "2024-05",
+  "candidate_model": "gpt-4o",
+  "pricing_or_model_changed": true,
+  "prices": {
+    "baseline_input_usd_per_1k_tokens": 0.005,
+    "baseline_output_usd_per_1k_tokens": 0.015,
+    "baseline_cached_input_usd_per_1k_tokens": null,
+    "candidate_input_usd_per_1k_tokens": 0.0045,
+    "candidate_output_usd_per_1k_tokens": 0.0135,
+    "candidate_cached_input_usd_per_1k_tokens": null
+  }
+}
+```
+
+`pricing.prices` is always present in the response (not gated on `pricing_or_model_changed`).
+Fields are `null` when the rate is not set in the pricing table.
+
+**Web UI** — the `DiffPage` `fd-alert--warn` banner shows the per-1k input/output price deltas
+(baseline → candidate) when all four rates are present. See [web-ui.md § DiffPage](web-ui.md).
+
+This is an informational signal — the diff still computes and the policy still evaluates; cost
+deltas may reflect pricing assumption changes in addition to actual usage changes.
 
 Cross-provider diffs (e.g. OpenAI baseline vs. Anthropic candidate) are supported as long as
 separate pricing tables for each provider/version are imported. Each side is priced against its
diff --git a/docs/sdk.md b/docs/sdk.md
index fc4563a..370e1ad 100644
--- a/docs/sdk.md
+++ b/docs/sdk.md
@@ -91,6 +91,32 @@ See [SECURITY.md](../SECURITY.md) for the full access model.
 `GET /health` — returns `{"status": "ok", "mutation_auth": "loopback"|"bearer"}` when the
 server is up (`mutation_auth` describes promote/rollback auth; see **HTTP API**).
 
+### `GET /v1/metrics` (no SDK wrapper)
+
+The metrics endpoint has no dedicated SDK method. Call it directly via `httpx` or `requests`:
+
+```python
+import httpx
+
+resp = httpx.get("http://127.0.0.1:8765/v1/metrics")
+resp.raise_for_status()
+counters = resp.json()
+# {
+#   "counters": {
+#     "releases_total": 3,
+#     "pricing_tables_total": 1,
+#     "run_events_total": 120,
+#     "promoted_pointers_total": 1,
+#     "actions_total": 5,
+#     "actions_by_action": {"promote": 4, "rollback": 1}
+#   },
+#   "schema_version": 3,
+#   "generated_at": "2026-05-03T12:00:00+00:00"
+# }
+```
+
+`GET /v1/metrics` is read-only and requires no token. See [http-api.md § GET /v1/metrics](http-api.md#get-v1metrics) for the full response shape.
+
 ### `list_releases() -> dict`
 
 `GET /v1/releases` — returns `{"releases": [...]}`. Each entry includes `release_id`,
diff --git a/docs/web-ui.md b/docs/web-ui.md
index 5863ac7..bfbfdb0 100644
--- a/docs/web-ui.md
+++ b/docs/web-ui.md
@@ -156,7 +156,11 @@ On submit, the raw diff response is parsed and rendered as:
 - **Pricing change warning:** when the diff response includes a `pricing` block with
   `pricing_or_model_changed: true`, a `fd-alert--warn` banner is shown in the summary
   card. It names the baseline and candidate provider/version/model so the user knows the
-  cost delta includes pricing assumption changes, not just usage changes.
+  cost delta includes pricing assumption changes, not just usage changes. When the response
+  also includes a `pricing.prices` block with all four per-1k token rates present, the
+  banner additionally shows a **Per-1k token prices** line (baseline → candidate, input and
+  output separately) so the user can separate tariff moves from token volume changes in the
+  cost delta. Rates are rendered to six decimal places via `toFixed(6)`.
 - **Metric cards:** cost/run (USD), latency avg (ms), error rate — each showing baseline,
   candidate, and delta.
 - **Raw diff JSON** panel (collapsed by default via `JsonPanel`).
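
Reviewer note: the patch states the same display rule three times (CLI, HTTP API, web UI): the per-1k price line appears only when all four input/output rates are present, and each rate is shown to six decimal places. The sketch below illustrates that rule against the `pricing.prices` object from the example response. It is not taken from the flightdeck source; `format_per_1k_line` and `RATE_KEYS` are hypothetical names used only for illustration.

```python
from typing import Mapping, Optional

# The four rates the documented condition depends on.
# Cached-input rates are deliberately excluded, as in the docs above.
RATE_KEYS = (
    "baseline_input_usd_per_1k_tokens",
    "candidate_input_usd_per_1k_tokens",
    "baseline_output_usd_per_1k_tokens",
    "candidate_output_usd_per_1k_tokens",
)


def format_per_1k_line(prices: Mapping[str, Optional[float]]) -> Optional[str]:
    """Hypothetical helper: build the price line, or return None if any rate is unset."""
    rates = [prices.get(key) for key in RATE_KEYS]
    if any(rate is None for rate in rates):
        return None  # documented behaviour: the line is omitted entirely
    b_in, c_in, b_out, c_out = rates
    return (
        f"Per-1k token prices: input {b_in:.6f} -> {c_in:.6f}, "
        f"output {b_out:.6f} -> {c_out:.6f}"
    )


# Values copied from the pricing.prices example in operations-and-policy.md.
example = {
    "baseline_input_usd_per_1k_tokens": 0.005,
    "baseline_output_usd_per_1k_tokens": 0.015,
    "baseline_cached_input_usd_per_1k_tokens": None,
    "candidate_input_usd_per_1k_tokens": 0.0045,
    "candidate_output_usd_per_1k_tokens": 0.0135,
    "candidate_cached_input_usd_per_1k_tokens": None,
}

line = format_per_1k_line(example)
print(line)
```

With the example values this reproduces the CLI line shown in the docs. A missing cached-input rate does not suppress the line, while a missing input or output rate on either side does.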