Merged
55 changes: 50 additions & 5 deletions docs/operations-and-policy.md
@@ -154,16 +154,61 @@ following differ between baseline and candidate:
- `spec.pricing_reference.pricing_version` (e.g. `"openai-2026-04-30"` vs. a newer table)
- `spec.runtime.model` (e.g. `"gpt-4.1-mini"` vs. `"gpt-4.1"`)

`DiffOutcome` also carries the resolved per-1k token rates for each side directly:

| Field | Description |
|-------|-------------|
| `baseline_input_usd_per_1k_tokens` | Input rate from the baseline pricing table entry (or `None` when not found) |
| `baseline_output_usd_per_1k_tokens` | Output rate from the baseline pricing table entry (or `None`) |
| `baseline_cached_input_usd_per_1k_tokens` | Cached-input rate for baseline (or `None` when not set in the table) |
| `candidate_input_usd_per_1k_tokens` | Input rate from the candidate pricing table entry (or `None`) |
| `candidate_output_usd_per_1k_tokens` | Output rate from the candidate pricing table entry (or `None`) |
| `candidate_cached_input_usd_per_1k_tokens` | Cached-input rate for candidate (or `None`) |

These fields are populated by `pricing_entry_for(table, model)` in `flightdeck.ledger` after
`diff_releases` returns and before the `DiffOutcome` is constructed.

**CLI output** — when `pricing_or_model_changed` is `True`, the CLI prints:

```
NOTE: cost delta includes pricing/model assumption changes (pricing reference and/or model differ).
Per-1k token prices: input 0.005000 -> 0.004500, output 0.015000 -> 0.013500
```

The **Per-1k token prices** line is only printed when both input and output rates are present
for both sides. If any rate is `None`, that line is omitted.
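
The omission rule above can be sketched as follows. This is a minimal illustration, not the CLI's actual code: `format_per_1k_line` is a hypothetical helper name, and the six-decimal formatting is inferred from the sample output shown earlier.

```python
from typing import Optional


def format_per_1k_line(
    baseline_input: Optional[float],
    baseline_output: Optional[float],
    candidate_input: Optional[float],
    candidate_output: Optional[float],
) -> Optional[str]:
    """Hypothetical helper: build the per-1k price line, or None to omit it."""
    rates = (baseline_input, baseline_output, candidate_input, candidate_output)
    # The line is only printed when all four input/output rates are present.
    if any(rate is None for rate in rates):
        return None
    return (
        f"Per-1k token prices: input {baseline_input:.6f} -> {candidate_input:.6f}, "
        f"output {baseline_output:.6f} -> {candidate_output:.6f}"
    )


line = format_per_1k_line(0.005, 0.015, 0.0045, 0.0135)
if line is not None:
    print(line)
```

Cached-input rates are deliberately not part of the check here, since the text only requires the input and output rates to be present.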

**HTTP API** — `/v1/diff` includes a `pricing.prices` object alongside the existing
`pricing_or_model_changed` flag:

```json
"pricing": {
"baseline_provider": "openai",
"baseline_version": "2024-02",
"baseline_model": "gpt-4o",
"candidate_provider": "openai",
"candidate_version": "2024-05",
"candidate_model": "gpt-4o",
"pricing_or_model_changed": true,
"prices": {
"baseline_input_usd_per_1k_tokens": 0.005,
"baseline_output_usd_per_1k_tokens": 0.015,
"baseline_cached_input_usd_per_1k_tokens": null,
"candidate_input_usd_per_1k_tokens": 0.0045,
"candidate_output_usd_per_1k_tokens": 0.0135,
"candidate_cached_input_usd_per_1k_tokens": null
}
}
```

`pricing.prices` is always present in the response (not gated on `pricing_or_model_changed`).
Fields are `null` when the rate is not set in the pricing table.
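
Because `pricing.prices` is always present but individual fields may be `null`, a client has to guard each rate before doing arithmetic on it. The sketch below shows one way to do that, assuming the `pricing` block has already been parsed from a `/v1/diff` response; `rate_change_pct` and `summarize_price_changes` are hypothetical names and not part of the SDK.

```python
import json
from typing import Optional


def rate_change_pct(baseline: Optional[float], candidate: Optional[float]) -> Optional[float]:
    """Relative change in percent, or None when either rate is missing or baseline is 0."""
    if baseline is None or candidate is None or baseline == 0:
        return None
    return (candidate - baseline) / baseline * 100.0


def summarize_price_changes(pricing: dict) -> dict:
    """Hypothetical helper: percent change per rate kind from a `pricing` block."""
    prices = pricing["prices"]
    return {
        kind: rate_change_pct(
            prices.get(f"baseline_{kind}_usd_per_1k_tokens"),
            prices.get(f"candidate_{kind}_usd_per_1k_tokens"),
        )
        for kind in ("input", "output", "cached_input")
    }


# Sample payload copied from the response shape documented above.
pricing = json.loads("""
{
  "pricing_or_model_changed": true,
  "prices": {
    "baseline_input_usd_per_1k_tokens": 0.005,
    "baseline_output_usd_per_1k_tokens": 0.015,
    "baseline_cached_input_usd_per_1k_tokens": null,
    "candidate_input_usd_per_1k_tokens": 0.0045,
    "candidate_output_usd_per_1k_tokens": 0.0135,
    "candidate_cached_input_usd_per_1k_tokens": null
  }
}
""")
changes = summarize_price_changes(pricing)
print(changes)
```

With the sample payload, the input and output rates each drop by roughly 10%, and the cached-input entry is `None` because both sides are `null`.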

**Web UI** — the `DiffPage` `fd-alert--warn` banner shows the per-1k input/output price deltas
(baseline → candidate) when all four rates are present. See [web-ui.md § DiffPage](web-ui.md).

This is an informational signal — the diff still computes and the policy still evaluates; cost
deltas may reflect pricing assumption changes in addition to actual usage changes.

Cross-provider diffs (e.g. OpenAI baseline vs. Anthropic candidate) are supported as long as
separate pricing tables for each provider/version are imported. Each side is priced against its
26 changes: 26 additions & 0 deletions docs/sdk.md
@@ -91,6 +91,32 @@ See [SECURITY.md](../SECURITY.md) for the full access model.

`GET /health` — returns `{"status": "ok", "mutation_auth": "loopback"|"bearer"}` when the server is up (`mutation_auth` describes promote/rollback auth; see **HTTP API**).

### `GET /v1/metrics` (no SDK wrapper)

The metrics endpoint has no dedicated SDK method. Call it directly via `httpx` or `requests`:

```python
import httpx

resp = httpx.get("http://127.0.0.1:8765/v1/metrics")
resp.raise_for_status()
metrics = resp.json()  # top-level keys: counters, schema_version, generated_at
# {
# "counters": {
# "releases_total": 3,
# "pricing_tables_total": 1,
# "run_events_total": 120,
# "promoted_pointers_total": 1,
# "actions_total": 5,
# "actions_by_action": {"promote": 4, "rollback": 1}
# },
# "schema_version": 3,
# "generated_at": "2026-05-03T12:00:00+00:00"
# }
```

`GET /v1/metrics` is read-only and requires no token. See [http-api.md § GET /v1/metrics](http-api.md#get-v1metrics) for the full response shape.

### `list_releases() -> dict`

`GET /v1/releases` — returns `{"releases": [...]}`. Each entry includes `release_id`,
6 changes: 5 additions & 1 deletion docs/web-ui.md
@@ -156,7 +156,11 @@ On submit, the raw diff response is parsed and rendered as:
- **Pricing change warning:** when the diff response includes a `pricing` block with
`pricing_or_model_changed: true`, a `fd-alert--warn` banner is shown in the summary
card. It names the baseline and candidate provider/version/model so the user knows the
cost delta includes pricing assumption changes, not just usage changes. When the response
also includes a `pricing.prices` block with all four per-1k token rates present, the
banner additionally shows a **Per-1k token prices** line (baseline → candidate, input and
output separately) so the user can separate tariff moves from token volume changes in the
cost delta. Rates are rendered to six decimal places via `toFixed(6)`.
- **Metric cards:** cost/run (USD), latency avg (ms), error rate — each showing baseline,
candidate, and delta.
- **Raw diff JSON** panel (collapsed by default via `JsonPanel`).