Merged
4 changes: 2 additions & 2 deletions AGENTS.md
@@ -12,8 +12,8 @@ Extended maintainer docs (research workflow, org checklist, canonical publish) l

## Mission

FlightDeck is AI Release Governance for production agents. The core product promise is
trustworthy release safety: version releases, ingest runtime evidence, compare diffs, and gate
FlightDeck helps teams **ship AI agents safely** with **release diffs**, **runtime evidence**, and **policy gates**.
The core product promise is trustworthy release safety: version releases, ingest runtime evidence, compare diffs, and gate
promotion with policy.

## Current Wedge
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -17,6 +17,7 @@ This project follows [Semantic Versioning](https://semver.org/). From **v1.0.0**
### Changed

- **Examples / CI snippets:** **`flightdeck-ai>=1.1.1`** where version pins apply.
- **Positioning:** README, PyPI short description, CLI `--help`, and web header tagline emphasize outcome-oriented messaging (diffs, evidence, policy gates) plus README sections for stack fit and product comparisons.

## 1.1.0 - 2026-05-03

72 changes: 56 additions & 16 deletions README.md
@@ -1,27 +1,67 @@
# FlightDeck

FlightDeck is **AI Release Governance** for production agents.
**Ship AI agents safely with release diffs, runtime evidence, and policy gates.**

FlightDeck is **local-first** (CLI + SQLite + optional **`flightdeck serve`** UI). It is not an agent framework, prompt IDE, tracing dashboard, or gateway — it is where **what shipped**, **what ran**, **what it cost**, and **whether promote is allowed** are recorded and compared.

## In ~20 seconds

1. **Register** immutable agent releases (`release.yaml` + bundle checksum).
2. **Ingest** run evidence (`RunEvent` JSONL or **`POST /v1/events`**).
3. **Diff** baseline vs candidate: cost, latency, errors, and confidence (optional **pricing catalog** lines on top).
4. **Promote** only when policy passes; optional **human approval** (request → confirm) before the ledger moves.
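Step 1 pairs `release.yaml` with a bundle checksum. The following is a minimal sketch of how such a checksum could be computed, assuming SHA-256 over the sorted files of a bundle directory. The helper names `bundle_checksum` and `demo_checksums` are hypothetical, and the hash algorithm and file layout are assumptions, not FlightDeck's documented behavior.

```python
import hashlib
import tempfile
from pathlib import Path


def bundle_checksum(bundle_dir: Path) -> str:
    """Return a SHA-256 hex digest over every file under bundle_dir.

    Hypothetical helper: files are visited in sorted order, and each
    relative path is hashed together with the file bytes, so editing or
    renaming any file changes the digest.
    """
    digest = hashlib.sha256()
    files = sorted(p for p in bundle_dir.rglob("*") if p.is_file())
    for path in files:
        digest.update(path.relative_to(bundle_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between path and content
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between files
    return digest.hexdigest()


def demo_checksums() -> tuple[str, str, str]:
    """Hash a throwaway bundle twice, then once more after a prompt edit."""
    with tempfile.TemporaryDirectory() as tmp:
        bundle = Path(tmp)
        prompt = bundle / "system_prompt.txt"  # illustrative file name
        prompt.write_text("You are a helpful agent.", encoding="utf-8")
        first = bundle_checksum(bundle)
        second = bundle_checksum(bundle)
        prompt.write_text("You are a very helpful agent.", encoding="utf-8")
        drifted = bundle_checksum(bundle)
    return first, second, drifted
```

An unchanged bundle hashes to the same digest on every call, while a small prompt edit yields a different one, which is the property an immutable release record relies on.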

## Example outcome

You ship a candidate whose **system prompt drifts by a handful of tokens**; under your imported tariffs the diff shows **cost per run up ~31%** while policy caps spend. **`flightdeck release promote`** (or the HTTP promote path) **stays blocked** until you change the model, relax policy with intent, or widen evidence — not because CI is slow, but because the **governed ledger** says no.
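The arithmetic behind a blocked promote like this one can be sketched as below. `RunStats`, `cost_increase`, `promote_allowed`, and the `max_increase` cap are hypothetical names, and the cost figures are made-up inputs chosen to reproduce a ~31% increase. None of this is FlightDeck's actual policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Aggregate evidence for one release (hypothetical shape)."""

    runs: int
    total_cost_usd: float

    @property
    def cost_per_run(self) -> float:
        return self.total_cost_usd / self.runs


def cost_increase(baseline: RunStats, candidate: RunStats) -> float:
    """Relative change in cost per run, e.g. about 0.31 for +31%."""
    return (candidate.cost_per_run - baseline.cost_per_run) / baseline.cost_per_run


def promote_allowed(baseline: RunStats, candidate: RunStats, max_increase: float) -> bool:
    """Pass only when the cost increase stays within the assumed policy cap."""
    return cost_increase(baseline, candidate) <= max_increase


# Illustrative numbers only: $0.0100 per run versus $0.0131 per run.
baseline = RunStats(runs=200, total_cost_usd=2.00)
candidate = RunStats(runs=200, total_cost_usd=2.62)
```

With a cap of 10% (`max_increase=0.10`) the candidate stays blocked. Raising the cap to 50% would let it through, which corresponds to relaxing policy with intent.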

## Who should use this?

- Teams that **version agent builds** (prompts, tools, model pins) and need a **durable audit trail**.
- Engineers who want **one command** to answer “is this candidate safe to roll forward?” with **numbers**, not gut feel.
- Anyone who has outgrown **ad hoc** folder diffs or **spreadsheet** promote checklists.

## How FlightDeck fits your stack

FlightDeck sits **next to** your agent runtime (not in the inference hot path): emit evidence, run **`flightdeck`** from a laptop or CI, gate **promote** with policy (and optional approval).

```mermaid
flowchart LR
  subgraph runtime [Your agent runtime]
    agent[Agent or service]
  end
  subgraph fd [FlightDeck workspace]
    ingest[Ingest RunEvents]
    ledger[(SQLite ledger)]
    diff[release diff]
    promote[promote or rollback]
  end
  subgraph automation [Automation]
    ci[CI job or operator]
  end
  agent -->|"JSONL or HTTP events"| ingest
  ingest --> ledger
  ledger --> diff
  diff --> ci
  ci -->|"policy pass"| promote
```

It gives teams a local-first control loop for release safety: register immutable agent
releases, ingest runtime evidence, compare trusted diffs, and gate promotion with policy.
## Comparison at a glance

FlightDeck is not an agent framework, prompt IDE, tracing dashboard, or gateway. It is the
operating record for what changed, what it costs, how it behaves, and whether it is safe to
promote.
| | **FlightDeck** | **Langfuse** | **Arize Phoenix / Cloud** | **Git / CI alone** |
|--|----------------|----------------|---------------------------|---------------------|
| **Primary job** | **Release + promote governance** for agents (ledger, diff, policy) | Tracing, sessions, evals, LLM observability | ML / model observability and monitoring | Source control and generic pipelines |
| **Immutable release artifact** | Yes (`release.yaml` + checksum) | No | No | Only if you build it |
| **Evidence + cost/latency diff** | Yes (runs + pricing tables / optional catalog) | Different lens (trace-level) | Different lens | DIY |
| **Policy gate on promote** | First-class | No | No | DIY |

## Why It Exists
**Try the UI:** run **`flightdeck serve`**, then open **http://127.0.0.1:8765/** — Overview, Diff, and Actions (see [docs/web-ui.md](docs/web-ui.md)).

AI agent changes can silently alter cost, latency, failure rate, and unit economics. FlightDeck
turns those changes into explicit release decisions backed by runtime evidence.
## Why it exists

Current local spine:
Small prompt or model changes can silently move **cost**, **latency**, and **error rate**. FlightDeck turns those moves into **explicit promote decisions** backed by ingested runs — before production pointers advance.

- versioned `release.yaml` artifacts with bundle checksums
- `RunEvent` ingestion from JSONL or JSON arrays
- immutable pricing tables with explicit `--replace`
- trusted `flightdeck release diff`
- policy-gated `flightdeck release promote`
- promotion decision history
**Current local spine:** versioned **`release.yaml`** + checksums · **`RunEvent`** ingest (JSONL or arrays) · immutable **pricing** imports · **`flightdeck release diff`** · policy-gated **`release promote`** / rollback · full **audit history**.
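The spine accepts run events as JSONL or as a JSON array. Here is a minimal sketch of a reader that handles both shapes, assuming each event is a JSON object. `parse_events_text` is a hypothetical helper that returns plain dicts. It is not the project's `parse_events_file`, and the `run_id` and `cost_usd` fields in the usage note are illustrative, not the `RunEvent` schema.

```python
import json


def parse_events_text(text: str) -> list[dict]:
    """Parse run events from either a JSON array or JSONL text."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # Whole payload is a single JSON array of event objects.
        events = json.loads(stripped)
    else:
        # JSONL: one JSON object per non-empty line.
        events = [json.loads(line) for line in stripped.splitlines() if line.strip()]
    for event in events:
        if not isinstance(event, dict):
            raise ValueError(
                f"expected a JSON object per event, got {type(event).__name__}"
            )
    return events
```

Feeding it `'{"run_id": "r1", "cost_usd": 0.01}\n{"run_id": "r2", "cost_usd": 0.02}\n'` or the equivalent JSON array yields the same two-element list.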

## Status

2 changes: 1 addition & 1 deletion ROADMAP.md
@@ -1,6 +1,6 @@
# Roadmap

FlightDeck is **AI release governance** for production agents: immutable releases, runtime evidence, trusted diffs, and policy-gated promotion.
FlightDeck helps teams **ship AI agents safely** with release diffs, runtime evidence, and policy gates: immutable releases, runtime evidence, trusted diffs, and policy-gated promotion.

This roadmap is meant to be clear from **what is already shipped** to **near-term commitments** and **long-horizon possibilities**. It also calls out what still makes the product feel standalone in production settings.

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -5,7 +5,7 @@ build-backend = "hatchling.build"
[project]
name = "flightdeck-ai"
version = "1.1.1"
description = "AI Release Governance for production agents."
description = "Ship AI agents safely with release diffs, runtime evidence, and policy gates."
readme = "README.md"
license = "Apache-2.0"
requires-python = ">=3.14,<3.15"
2 changes: 1 addition & 1 deletion src/flightdeck/__init__.py
@@ -1,3 +1,3 @@
"""FlightDeck - AI Release Governance for production agents."""
"""FlightDeck — ship AI agents safely with release diffs, runtime evidence, and policy gates."""

__version__ = "1.1.1"
4 changes: 2 additions & 2 deletions src/flightdeck/cli/main.py
@@ -1,4 +1,4 @@
"""FlightDeck CLI - AI Release Governance."""
"""FlightDeck CLI — release diffs, runtime evidence, policy gates."""

from __future__ import annotations

@@ -68,7 +68,7 @@ def parse_events_file(path: Path) -> list[RunEvent]:
@click.group()
@click.version_option(version=__version__, prog_name="flightdeck")
def cli() -> None:
    """FlightDeck - AI Release Governance (release safety ledger + trustworthy diffs)."""
    """Ship AI agents safely — release diffs, runtime evidence, policy gates."""


@cli.command()

2 changes: 1 addition & 1 deletion src/flightdeck/server/static/index.html
@@ -4,7 +4,7 @@
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>FlightDeck</title>
<script type="module" crossorigin src="/assets/index-B6DTQYWv.js"></script>
<script type="module" crossorigin src="/assets/index-DjScmcgK.js"></script>
<link rel="stylesheet" crossorigin href="/assets/index-Dl91dBdu.css">
</head>
<body>
2 changes: 1 addition & 1 deletion tests/test_cli.py
@@ -12,7 +12,7 @@ def test_cli_help() -> None:

    assert result.exit_code == 0
    assert "FlightDeck" in result.output
    assert "AI Release Governance" in result.output
    assert "Ship AI agents safely" in result.output


def test_cli_version() -> None:
2 changes: 1 addition & 1 deletion web/src/components/AppShell.tsx
@@ -13,7 +13,7 @@ export function AppShell() {
<header className="fd-header">
<div className="fd-header__brand">
<h1 className="fd-header__title">FlightDeck</h1>
<p className="fd-header__tagline">Local release governance</p>
<p className="fd-header__tagline">Diffs, evidence, policy gates</p>
</div>
<nav className="fd-nav" aria-label="Primary">
<NavLink to="/" end className={navCls}>