From 2047b637b2410b88c076ae0ecf6973c7af50ad5b Mon Sep 17 00:00:00 2001
From: Cursor Agent <cursoragent@cursor.com>
Date: Sat, 2 May 2026 12:02:05 +0000
Subject: [PATCH] docs: document validation edge cases, promote/rollback scope,
 actor resolution, env vars
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- docs/http-api.md: clarify POST /v1/events validation rules — empty events
  array returns HTTP 422 (Pydantic); api_version values other than 'v1' (empty
  string, null, wrong case, unknown strings) return HTTP 400 with a specific
  message; document that 'inserted' counts only newly written rows; add 422 to
  Errors table; note inconsistent agent_id within one side's events as a 400
  source on POST /v1/diff.

- docs/operations-and-policy.md: add 'compute_diff vs. promote_release /
  rollback_release: filter scope' subsection explaining that promote/rollback
  query events by environment only (no tenant_id or task_id filter), whereas
  compute_diff supports all three;
  expand 'cross-agent diffs' section to document the mixed-agent_id-within-events
  error and its cause; add the new error to the common errors table.

- docs/cli.md: document that flightdeck init does not require a pre-existing
  flightdeck.yaml (it is the exception to the 'all commands require config' rule);
  add 'Actor resolution' section documenting USER / USERNAME / 'unknown' fallback
  for CLI audit records and how the HTTP API actor field differs.

- DEVELOPMENT.md: add 'Environment variables' reference table covering
  FLIGHTDECK_LOCAL_API_TOKEN, FLIGHTDECK_USE_SYSTEM_TEMP, USER/USERNAME,
  VITE_FLIGHTDECK_LOCAL_API_TOKEN, VITE_DEV_PROXY_TARGET, and TMPDIR/TEMP/TMP.

Co-authored-by: Gottam Sai Bharath <Gsbreddy@users.noreply.github.com>
---
 DEVELOPMENT.md                | 11 +++++++++++
 docs/cli.md                   | 16 +++++++++++++++-
 docs/http-api.md              | 20 +++++++++++++++++---
 docs/operations-and-policy.md | 34 ++++++++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+), 4 deletions(-)
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
index aaa1a7f..2b1134d 100644
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -187,3 +187,14 @@ virtual environment's Python executable directly:
 ```
 
 Use **`uv run python -m pytest`** from the repo root so imports like **`from tests.test_spine import …`** resolve the same way as in CI.
+
+## Environment variables
+
+| Variable | Component | Description |
+|----------|-----------|-------------|
+| `FLIGHTDECK_LOCAL_API_TOKEN` | Server | When set, `POST /v1/promote` and `POST /v1/rollback` require `Authorization: Bearer <token>`. Read endpoints and `POST /v1/events` are unaffected. See [docs/http-api.md](docs/http-api.md) and [SECURITY.md](SECURITY.md). |
+| `FLIGHTDECK_USE_SYSTEM_TEMP` | Tests | Set to `1` to force pytest to use the OS default temp directory instead of the repo-local `.tmp/` directory. Useful on developer machines where `%TEMP%` works correctly (see *Troubleshooting* above). |
+| `USER` / `USERNAME` | CLI | Used to populate the `actor` field on promote, rollback, and pricing import audit records. `USER` is checked first (Unix/macOS), then `USERNAME` (Windows); falls back to `"unknown"`. |
+| `VITE_FLIGHTDECK_LOCAL_API_TOKEN` | Web dev server | Build-time variable for the React UI dev server (Vite). Copy `web/.env.example` → `web/.env.local` to set it when testing mutations through `npm run dev` against a token-protected server. |
+| `VITE_DEV_PROXY_TARGET` | Web dev server | Overrides the Vite proxy target for `/v1` (default: `http://127.0.0.1:8765`). |
+| `TMPDIR` / `TEMP` / `TMP` | Tests / OS | Standard temp directory environment variables. Set any of these to a repo-local `.tmp/` path if the OS default is restricted or permissions cause pytest failures. |
diff --git a/docs/cli.md b/docs/cli.md
index 5ea1619..0d99ab1 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -16,7 +16,21 @@ serve` see [http-api.md](http-api.md).
 | `--help` | Print help for any command or subcommand |
 
 All commands require a `flightdeck.yaml` in the working directory (or the default path
-`./flightdeck.yaml`). Run `flightdeck init` to create one.
+`./flightdeck.yaml`). Run `flightdeck init` to create one. The only exception is
+`flightdeck init` itself — it writes the file and does not call `load_config`.
+
+## Actor resolution
+
+Several commands that write to the audit ledger (`release promote`, `release rollback`,
+`pricing import`) record an `actor` value. For CLI commands, `actor` is resolved from
+the environment at invocation time:
+
+1. `USER` environment variable (Unix / macOS)
+2. `USERNAME` environment variable (Windows)
+3. The literal `"unknown"` if neither variable is set
+
+The HTTP API's `POST /v1/promote` and `POST /v1/rollback` accept an explicit `"actor"`
+field in the request body (defaults to `"http"` when omitted).
 
 ## Exit codes
 
diff --git a/docs/http-api.md b/docs/http-api.md
index 232cbd8..6f340aa 100644
--- a/docs/http-api.md
+++ b/docs/http-api.md
@@ -176,16 +176,29 @@ Ingest `RunEvent` records (runtime evidence for diff and policy evaluation).
 }
 ```
 
-`api_version` may be omitted (defaults to `"v1"`). Any other value returns HTTP 400.
+`api_version` may be omitted (defaults to `"v1"`). Any value other than `"v1"`, including
+`""`, `null`, a wrong-case variant such as `"V1"`, or an unknown string, returns HTTP 400 with
+a message of the form `"Unsupported api_version for POST /v1/events: <value> (only 'v1' is accepted)."`.
+
 `run_id` must be unique per workspace; duplicates are silently ignored by storage.
 
+The `events` array must contain **at least one event**. An empty array (`"events": []`)
+is rejected by Pydantic validation with HTTP **422** before any event processing occurs.
+
 **Response**
 ```json
 {"inserted": 1}
 ```
 
+`inserted` is the count of **newly written** rows. Events with a `run_id` that already
+exists in storage are silently skipped; they do not increment `inserted` and do not
+produce an error.
+
 **Errors**
-- HTTP 400 — unsupported `api_version` or malformed `RunEvent` field.
+- HTTP 400 — unsupported `api_version` value, or a field in a `RunEvent` fails type/range
+  validation after the per-event `api_version` check.
+- HTTP 422 — `events` array is empty or the request body does not match the expected shape
+  (Pydantic validation error; returned as an array under `detail`).
 
 Full field reference: [`schemas/v1/run_event.schema.json`](../schemas/v1/run_event.schema.json).
 
@@ -316,7 +329,8 @@ Default thresholds (from `WorkspaceConfig.diff`): `min_candidate_runs=500`,
 `min_baseline_runs=500`, `min_low_runs=50`. Override per-workspace or via the active policy.
 
 **Errors**
-- HTTP 400 — unknown release ID, missing pricing table, cross-agent diff, or invalid
+- HTTP 400 — unknown release ID, missing pricing table, cross-agent diff (releases have
+  different `agent_id` values), inconsistent `agent_id` within one side's run events, or invalid
   `window` format. The `detail` field describes the specific problem.
 
 ---
diff --git a/docs/operations-and-policy.md b/docs/operations-and-policy.md
index eefc9d3..ebc5fbe 100644
--- a/docs/operations-and-policy.md
+++ b/docs/operations-and-policy.md
@@ -106,12 +106,45 @@ cost = (input_tokens / 1000) * input_usd_per_1k
 
 Runs are averaged across all events in the window to produce `cost_per_run_usd`.
 
+### `compute_diff` vs. `promote_release` / `rollback_release`: filter scope
+
+`compute_diff` supports optional `tenant_id` and `task_id` filters in addition to
+`environment`. These allow you to narrow the evidence window to a specific tenant or task
+type when comparing releases.
+
+`_evaluate_promotion_or_rollback` (the shared path for `promote` and `rollback`) does
+**not** accept tenant or task filters. It queries run events for the entire environment
+over the window:
+
+```python
+# promote/rollback path — no tenant_id or task_id argument passed
+storage.query_runs(release_id, since, until, environment=environment)
+```
+
+This means **policy evaluation for promote/rollback aggregates all runs in the
+environment over the window**, regardless of tenant or task. The active policy applies to
+the full population of events for that release, not a filtered slice. If you need
+tenant-scoped evaluation, use `release diff` first to inspect the filtered evidence, then
+decide whether to promote.
+
 ### Important constraint: cross-agent diffs
 
 `compute_diff` checks that both releases have the same `agent_id` in their artifact
 spec *before* querying events. This is checked again inside `diff_releases` if run events
 from both sides are non-empty.
 
+`diff_releases` also enforces that all events on a given side share a single `agent_id`.
+If events for the baseline (or candidate) release span multiple agent IDs, the diff is
+rejected with:
+
+```
+Each side of the diff must have a single consistent agent_id among run events.
+```
+
+This can happen if run events emitted by different agents were ingested under the same
+`release_id`. Ensure every `RunEvent` for a release carries the `agent_id` that matches
+`spec.agent.agent_id` in the release artifact.
+
 ### Rollup semantics
 
 `ledger.compute_rollup` aggregates a list of `RunEvent` objects into a `Rollup`:
@@ -413,6 +446,7 @@ corresponding check in `test_schemas.py` (or `test_doctor.py`).
 | `Unknown baseline release: rel_...` | Release not registered | `flightdeck release register <path>` |
 | `Missing pricing table for baseline openai/2024-02` | Pricing not imported | `flightdeck pricing import <path>` |
 | `Cross-agent diff is not allowed` | Releases belong to different agents | Use releases from the same `agent_id` |
+| `Each side of the diff must have a single consistent agent_id among run events` | Ingested events for that release contain mixed `agent_id` values | Verify all `RunEvent` records use the correct `agent_id` matching the release artifact; re-ingest corrected events |
 | `Pricing table missing model entry` | Pricing table does not list the model used in the release | Add the model to the pricing YAML and reimport with `--replace` |
 | `Reason is required for promote/rollback actions` | Empty `--reason` flag | Provide a non-empty `--reason` |
 | `No promoted release exists for this agent/environment; nothing to roll back to` | Trying to roll back with no baseline | Promote a release first |