diff --git a/docs/cli.md b/docs/cli.md
index 5395127..9e4c557 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -437,7 +437,7 @@ JSONL (one event per line):
 {"api_version":"v1","type":"run_end","timestamp":"2026-05-01T12:01:00Z","agent_id":"agent_support","release_id":"rel_abc123",...}
 ```
 
-JSON array:
+JSON array (detected automatically when the file content starts with `[`):
 ```json
 [
   {"api_version":"v1","type":"run_end","timestamp":"2026-05-01T12:00:00Z",...},
@@ -445,6 +445,14 @@ JSON array:
 ]
 ```
 
+**Edge cases:**
+
+| Input | Outcome |
+|-------|---------|
+| Empty file (0 bytes or whitespace only) | Exits `0`, prints `Inserted 0 events`. Safe to ingest placeholder files. |
+| Malformed JSONL (invalid JSON on any non-blank line) | Exits `1` with a parse error. Already-inserted rows from earlier lines remain. |
+| JSON array file | Entire file parsed as a JSON array; blank lines inside the JSON are valid. |
+
 See [http-api.md § POST /v1/events](http-api.md) for the full `RunEvent` field reference.
 
 ---
diff --git a/docs/operations-and-policy.md b/docs/operations-and-policy.md
index ebc5fbe..2638501 100644
--- a/docs/operations-and-policy.md
+++ b/docs/operations-and-policy.md
@@ -106,6 +106,28 @@ cost = (input_tokens / 1000) * input_usd_per_1k
 
 Runs are averaged across all events in the window to produce `cost_per_run_usd`.
 
+### Multi-provider and cross-model diffs
+
+Baseline and candidate releases are costed **independently using their own
+`spec.pricing_reference`**. This means:
+
+- A baseline on `openai/2024-02` and a candidate on `anthropic/2026-04` is a valid diff.
+- A baseline using `gpt-4o` and a candidate using `gpt-4.1` on the same pricing table is
+  also valid, as long as both model names have entries in their respective pricing tables.
+
+When the provider, pricing version, or model differs between the two sides, the
+`DiffOutcome` sets `pricing_or_model_changed = True`.
This surfaces as:
+
+- **CLI:** `NOTE: cost delta includes pricing/model assumption changes (pricing reference
+  and/or model differ).` printed after the diff output.
+- **HTTP API (`POST /v1/diff`):** `pricing.pricing_or_model_changed: true` in the response.
+- **Web UI:** an `fd-alert--warn` callout on the Diff page naming both sides' pricing
+  references.
+
+The warning is informational: the diff still completes and policy is still evaluated.
+Treat the cost delta with caution when this flag is set, because price changes and model
+changes are conflated in the delta value.
+
 ### `compute_diff` vs. `promote_release` / `rollback_release`: filter scope
 
 `compute_diff` supports optional `tenant_id` and `task_id` filters in addition to
diff --git a/docs/web-ui.md b/docs/web-ui.md
index 568d1e1..a446fb6 100644
--- a/docs/web-ui.md
+++ b/docs/web-ui.md
@@ -153,6 +153,11 @@ On submit, the raw diff response is parsed and rendered as:
 
 - **Summary card:** policy badge (PASS / FAIL), failure reasons list, sample counts and confidence label.
+- **Pricing/model-change callout:** when the diff response has
+  `pricing.pricing_or_model_changed === true`, a `fd-alert--warn` strip is shown below the
+  summary card. It displays the baseline and candidate
+  `provider/pricing_version model` pairs and notes that the cost delta includes pricing and
+  model assumption changes. This mirrors the `NOTE:` line in the CLI `release diff` output.
 - **Metric cards:** cost/run (USD), latency avg (ms), error rate - each showing baseline, candidate, and delta.
 - **Raw diff JSON** panel (collapsed by default via `JsonPanel`).
@@ -183,8 +188,19 @@ Both **Promote** and **Rollback** buttons are disabled while any request is in f
 any network call with an inline error.
 
 After a successful mutation:
-1. The API response JSON is shown in a `JsonPanel` (open by default).
-2. `notifyTimelineMutated()` is called, refreshing `OverviewPage` automatically.
+1. 
A structured **outcome card** is rendered showing:
+   - Action type (Promotion / Rollback) and a **Policy PASS/FAIL badge**.
+   - **Pointer** badge: `Updated` (green) when the promoted pointer changed, `Unchanged`
+     (neutral) when policy blocked the update.
+   - Metric tiles for **Action ID**, **Release**, **Baseline**, and **Reason**.
+   - Any policy failure reasons as a list under the metric grid.
+2. The raw API response JSON is available in a collapsed `JsonPanel` below the outcome card.
+3. `notifyTimelineMutated()` is called, refreshing `OverviewPage` automatically.
+
+The `pickOutcome` helper in `ActionsPage.tsx` coerces the `POST /v1/promote` or
+`POST /v1/rollback` 200-response into an `ActionOutcomePayload`. If the response does not
+match the documented contract (e.g. missing required fields), `pickOutcome` returns `null`
+and only the raw JSON panel is shown.
 
 **Auth:** When `VITE_FLIGHTDECK_LOCAL_API_TOKEN` is set in the build environment (or `.env.local` during dev), `fetchJson` adds `Authorization: Bearer ` to every request.
@@ -225,6 +241,28 @@ type HealthPayload = {
   /** Present on current servers; "bearer" when FLIGHTDECK_LOCAL_API_TOKEN is set. */
   mutation_auth?: "bearer" | "loopback";
 };
+
+type PolicyResultPayload = {
+  passed: boolean;
+  reasons: string[];
+  evaluated_at?: string;
+};
+
+/**
+ * Response shape for `POST /v1/promote` and `POST /v1/rollback` (HTTP 200).
+ * On HTTP 409 (policy blocked), the server wraps this in `{ detail: { message, outcome } }`.
+ * `promoted_pointer_changed` is `false` when blocked.
+ */
+type ActionOutcomePayload = {
+  action_id: string;
+  action: "promote" | "rollback";
+  release_id: string;
+  agent_id: string;
+  environment: string;
+  baseline_release_id: string | null;
+  promoted_pointer_changed: boolean;
+  policy: PolicyResultPayload;
+};
 ```
 
 ### `fetchJson(path, init?): Promise`