From 3803451ceb1c9a7cb33f10f69fd501261784e96e Mon Sep 17 00:00:00 2001
From: Cursor Agent <cursoragent@cursor.com>
Date: Sat, 2 May 2026 18:03:48 +0000
Subject: [PATCH] docs: document pricing diagnostics, metrics endpoint, and
 DiffOutcome per-1k token fields

- operations-and-policy.md: expand the pricing/model change detection section to document
  the six new DiffOutcome per-1k token rate fields, the CLI 'Per-1k token prices' output
  line (condition: all four input/output rates present), the pricing.prices HTTP response
  object (always present, null for unset rates), and the web UI banner behaviour.

- sdk.md: add a 'GET /v1/metrics (no SDK wrapper)' subsection showing how to call the
  endpoint directly via httpx, with example response and a link to the HTTP API reference.

- web-ui.md: update the DiffPage pricing change warning bullet to mention the new per-1k
  input/output price delta line in the fd-alert--warn banner (rendered when all four rates
  are present; toFixed(6) formatting).

Co-authored-by: Gottam Sai Bharath <Gsbreddy@users.noreply.github.com>
---
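Note: the gating rule this patch documents for the "Per-1k token prices" line
can be sketched as follows. This is a minimal illustration under stated
assumptions, not the project's implementation. The function name
`per_1k_prices_line` and its parameters are hypothetical, and the six-decimal
formatting is inferred from the example CLI output and the `toFixed(6)` note.

```python
from typing import Optional


def per_1k_prices_line(
    baseline_input: Optional[float],
    baseline_output: Optional[float],
    candidate_input: Optional[float],
    candidate_output: Optional[float],
) -> Optional[str]:
    """Build the price line, or return None if any of the four rates is missing.

    Hypothetical helper: cached-input rates are deliberately not consulted,
    matching the documented "all four input/output rates" condition.
    """
    rates = (baseline_input, baseline_output, candidate_input, candidate_output)
    if any(rate is None for rate in rates):
        return None
    return (
        f"Per-1k token prices: input {baseline_input:.6f} -> {candidate_input:.6f}, "
        f"output {baseline_output:.6f} -> {candidate_output:.6f}"
    )


if __name__ == "__main__":
    # Rates taken from the example response in operations-and-policy.md.
    print(per_1k_prices_line(0.005, 0.015, 0.0045, 0.0135))
    # A single missing rate suppresses the whole line.
    print(per_1k_prices_line(0.005, None, 0.0045, 0.0135))
```

With the example rates the helper reproduces the documented CLI line; with any
of the four rates missing it returns `None`, so a caller would print nothing.
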
 docs/operations-and-policy.md | 55 +++++++++++++++++++++++++++++++----
 docs/sdk.md                   | 26 +++++++++++++++++
 docs/web-ui.md                |  6 +++-
 3 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/docs/operations-and-policy.md b/docs/operations-and-policy.md
index db3689e..eda5583 100644
--- a/docs/operations-and-policy.md
+++ b/docs/operations-and-policy.md
@@ -154,16 +154,61 @@ following differ between baseline and candidate:
 - `spec.pricing_reference.pricing_version` (e.g. `"openai-2026-04-30"` vs. a newer table)
 - `spec.runtime.model` (e.g. `"gpt-4.1-mini"` vs. `"gpt-4.1"`)
 
-When this flag is `True`, the CLI prints a note:
+`DiffOutcome` also carries the resolved per-1k token rates for each side directly:
+
+| Field | Description |
+|-------|-------------|
+| `baseline_input_usd_per_1k_tokens` | Input rate from the baseline pricing table entry (or `None` when not found) |
+| `baseline_output_usd_per_1k_tokens` | Output rate from the baseline pricing table entry (or `None`) |
+| `baseline_cached_input_usd_per_1k_tokens` | Cached-input rate for baseline (or `None` when not set in the table) |
+| `candidate_input_usd_per_1k_tokens` | Input rate from the candidate pricing table entry (or `None`) |
+| `candidate_output_usd_per_1k_tokens` | Output rate from the candidate pricing table entry (or `None`) |
+| `candidate_cached_input_usd_per_1k_tokens` | Cached-input rate for candidate (or `None`) |
+
+These fields are populated by `pricing_entry_for(table, model)` in `flightdeck.ledger` after
+`diff_releases` returns and before the `DiffOutcome` is constructed.
+
+**CLI output** — when `pricing_or_model_changed` is `True`, the CLI prints:
 
 ```
 NOTE: cost delta includes pricing/model assumption changes (pricing reference and/or model differ).
+Per-1k token prices: input 0.005000 -> 0.004500, output 0.015000 -> 0.013500
 ```
 
-The HTTP API's `/v1/diff` response includes `pricing.pricing_or_model_changed: true` in the
-`pricing` block, and the web UI's `DiffPage` shows an `fd-alert--warn` banner. This is an
-informational signal — the diff still computes and the policy still evaluates; cost deltas may
-reflect pricing assumption changes in addition to actual usage changes.
+The **Per-1k token prices** line is printed only when the input and output rates are present
+for both sides. If any of those four rates is `None`, the line is omitted.
+
+**HTTP API** — `/v1/diff` includes a `pricing.prices` object alongside the existing
+`pricing_or_model_changed` flag:
+
+```json
+"pricing": {
+  "baseline_provider": "openai",
+  "baseline_version": "2024-02",
+  "baseline_model": "gpt-4o",
+  "candidate_provider": "openai",
+  "candidate_version": "2024-05",
+  "candidate_model": "gpt-4o",
+  "pricing_or_model_changed": true,
+  "prices": {
+    "baseline_input_usd_per_1k_tokens": 0.005,
+    "baseline_output_usd_per_1k_tokens": 0.015,
+    "baseline_cached_input_usd_per_1k_tokens": null,
+    "candidate_input_usd_per_1k_tokens": 0.0045,
+    "candidate_output_usd_per_1k_tokens": 0.0135,
+    "candidate_cached_input_usd_per_1k_tokens": null
+  }
+}
+```
+
+`pricing.prices` is always present in the response (not gated on `pricing_or_model_changed`).
+Fields are `null` when the rate is not set in the pricing table.
+
+**Web UI** — the `DiffPage` `fd-alert--warn` banner shows the per-1k input/output price deltas
+(baseline → candidate) when all four rates are present. See [web-ui.md § DiffPage](web-ui.md).
+
+This is an informational signal — the diff still computes and the policy still evaluates; cost
+deltas may reflect pricing assumption changes in addition to actual usage changes.
 
 Cross-provider diffs (e.g. OpenAI baseline vs. Anthropic candidate) are supported as long as
 separate pricing tables for each provider/version are imported. Each side is priced against its
diff --git a/docs/sdk.md b/docs/sdk.md
index fc4563a..370e1ad 100644
--- a/docs/sdk.md
+++ b/docs/sdk.md
@@ -91,6 +91,32 @@ See [SECURITY.md](../SECURITY.md) for the full access model.
 
 `GET /health` — returns `{"status": "ok", "mutation_auth": "loopback"|"bearer"}` when the server is up (`mutation_auth` describes promote/rollback auth; see **HTTP API**).
 
+### `GET /v1/metrics` (no SDK wrapper)
+
+The metrics endpoint has no dedicated SDK method. Call it directly via `httpx` or `requests`:
+
+```python
+import httpx
+
+resp = httpx.get("http://127.0.0.1:8765/v1/metrics")
+resp.raise_for_status()
+metrics = resp.json()
+# {
+#   "counters": {
+#     "releases_total": 3,
+#     "pricing_tables_total": 1,
+#     "run_events_total": 120,
+#     "promoted_pointers_total": 1,
+#     "actions_total": 5,
+#     "actions_by_action": {"promote": 4, "rollback": 1}
+#   },
+#   "schema_version": 3,
+#   "generated_at": "2026-05-03T12:00:00+00:00"
+# }
+```
+
+`GET /v1/metrics` is read-only and requires no token. See [http-api.md § GET /v1/metrics](http-api.md#get-v1metrics) for the full response shape.
+
 ### `list_releases() -> dict`
 
 `GET /v1/releases` — returns `{"releases": [...]}`. Each entry includes `release_id`,
diff --git a/docs/web-ui.md b/docs/web-ui.md
index 5863ac7..bfbfdb0 100644
--- a/docs/web-ui.md
+++ b/docs/web-ui.md
@@ -156,7 +156,11 @@ On submit, the raw diff response is parsed and rendered as:
 - **Pricing change warning:** when the diff response includes a `pricing` block with
   `pricing_or_model_changed: true`, a `fd-alert--warn` banner is shown in the summary
   card. It names the baseline and candidate provider/version/model so the user knows the
-  cost delta includes pricing assumption changes, not just usage changes.
+  cost delta includes pricing assumption changes, not just usage changes. When the response
+  also includes a `pricing.prices` block with all four input/output per-1k rates present, the
+  banner additionally shows a **Per-1k token prices** line (baseline → candidate, input and
+  output separately) so the user can separate price changes from token volume changes in the
+  cost delta. Rates are rendered to six decimal places via `toFixed(6)`.
 - **Metric cards:** cost/run (USD), latency avg (ms), error rate — each showing baseline,
   candidate, and delta.
 - **Raw diff JSON** panel (collapsed by default via `JsonPanel`).