Merged
34 commits
78cd361
feat(09-01): session-derived tool-arg injection (FOC-01, FOC-02)
aksOps May 7, 2026
c0688b7
feat(10-01): mandatory per-turn confidence (FOC-03)
aksOps May 7, 2026
ee3c453
feat(11-01): pure-policy HITL gating + interrupt-vs-error fix (FOC-04)
aksOps May 7, 2026
be5d351
feat(12-01): framework-owned retry policy + v1.2 e2e genericity test …
aksOps May 7, 2026
7bb41c6
checkpoint: pre-yolo 2026-05-07T06:28:00
aksOps May 7, 2026
3ba099f
fix(v1.2): consolidate injection-path bug fixes from manual testing
aksOps May 7, 2026
faec93a
feat(13-01): LLM provider request_timeout + remove ollama.com fallbac…
aksOps May 7, 2026
fcc9435
docs(13-01): document embeddings/chat timeout asymmetry (WR-01)
aksOps May 7, 2026
19eca7b
feat(14-01): reproducible air-gap dependency lockfile (HARD-02)
aksOps May 7, 2026
a4c6be7
feat(16-01): bundler repair + CI staleness gate (BUNDLER-01, HARD-08)
aksOps May 7, 2026
3ccbd52
feat(15-01): real-LLM tool-loop termination via langchain.agents.crea…
aksOps May 7, 2026
18a090e
feat(17-01): thread-safe singleton + clean watchdog cancellation (HAR…
aksOps May 7, 2026
f5978a3
refactor(18-01): silent-failure sweep with logging + ratchet test (HA…
aksOps May 7, 2026
e060232
feat(19-01): pyright CI gate flip to fail-on-error (HARD-03)
aksOps May 7, 2026
9dd3ad9
feat(20-01): UI test scaffolding for src/runtime/ui.py (HARD-09)
aksOps May 7, 2026
0234d41
feat(21-01): skill-prompt-vs-schema linter + CI gate (SKILL-LINTER-01)
aksOps May 7, 2026
84f52bb
fix: clear ruff baseline before per-step telemetry work
aksOps May 12, 2026
9b31b22
feat(telemetry): M1 wire EventLog into orchestrator boot
aksOps May 12, 2026
f706759
feat(telemetry): M2 add EventKind literal + record() helper
aksOps May 12, 2026
4f196f2
feat(telemetry): M3 emit per-step events at tool-call + agent boundaries
aksOps May 12, 2026
892a2e0
feat(telemetry): M4 emit status_changed in finalize path
aksOps May 12, 2026
a998217
feat(telemetry): M5 LessonStore + LessonExtractor for past-resolution…
aksOps May 12, 2026
48d7b31
feat(telemetry): M6 intake reads lessons + finalize writes them
aksOps May 12, 2026
2021e17
feat(telemetry): M7 nightly LessonRefresher via APScheduler
aksOps May 12, 2026
2f091a1
feat(telemetry): M8 Ollama-via-LangChain config + smoke
aksOps May 12, 2026
999d308
feat(telemetry): M9 end-to-end ratchet + soft-delete suppression
aksOps May 12, 2026
5c65d79
checkpoint: pre-yolo 2026-05-13T00:24:30
aksOps May 13, 2026
09c5d87
chore(coverage): omit dist/UI scaffolding from coverage gate
aksOps May 13, 2026
e7a9211
feat(api): React-readiness — generic /sessions/* + SSE + WebSocket + …
aksOps May 13, 2026
e9171f7
checkpoint: pre-yolo 2026-05-13T01:35:26
aksOps May 13, 2026
ff1133d
test(api): close gap-tests — resume + retry SSE + retry/preview happy…
aksOps May 13, 2026
688d33e
fix(security+ci): clear CodeQL high-severity + Lint dummy-env failures
aksOps May 13, 2026
a8c2f6f
fix(ci): empty API keys so live-smoke tests skip cleanly
aksOps May 13, 2026
694bbf0
test(api): cover SSE/WS error envelopes + lesson_store None paths
aksOps May 13, 2026
1 change: 0 additions & 1 deletion .claude/worktrees/agent-a5e8856c1b01a8d2f
Submodule agent-a5e8856c1b01a8d2f deleted from 7ae577
1 change: 0 additions & 1 deletion .claude/worktrees/agent-ad51a9f71a5268747
Submodule agent-ad51a9f71a5268747 deleted from ae0ee4
80 changes: 67 additions & 13 deletions .github/workflows/ci.yml
@@ -21,25 +21,79 @@
uses: actions/setup-python@v6.2.0
with:
python-version: "3.11"
cache: "pip"

- name: Install dependencies
run: pip install -e ".[dev]"
- name: Set up uv
uses: astral-sh/setup-uv@v6

# Code scanning / CodeQL warning (Medium): unpinned tag for a non-immutable Action in workflow. The 'CI' step uses 'astral-sh/setup-uv' with ref 'v6', not a pinned commit hash.
with:
# Pin uv version for reproducible CI; bump deliberately when bumping locally.
version: "0.11.7"
enable-cache: true

- name: Lockfile freshness gate (HARD-02)
# Fails the build if pyproject.toml drifts from uv.lock — no silent
# resolves on CI, no surprise transitive upgrades. Phase 14 / SC-4.
run: uv lock --check

- name: Install dependencies (from lockfile)
# `--frozen` forbids re-resolving; uv installs the exact set pinned in
# uv.lock with hash verification. Phase 14 / SC-3.
run: uv sync --frozen --extra dev

- name: Bundle staleness gate (HARD-08)
# Regenerates dist/* from src/runtime + examples/* and fails the
# build if anything in dist/ would change. Forces every PR that
# touches src/runtime, examples/, or the bundler to commit fresh
# bundles — the air-gap deploy bundle stays repaired by
# construction (Phase 16 / BUNDLER-01 + HARD-08). Contributors
# run `python scripts/build_single_file.py` before every push;
# see docs/DEVELOPMENT.md.
run: |
uv run python scripts/build_single_file.py
git diff --exit-code dist/

- name: Lint (ruff)
run: ruff check src/ tests/
run: uv run ruff check src/ tests/

- name: Type check (pyright)
# Pyright was previously pointed at src/orchestrator (a shim layer
# of star-imports) so its real coverage of the framework was nil.
# After deleting src/orchestrator, the target moved to src/runtime
# and surfaces ~41 pre-existing generic/typed-dict issues. Don't
# block the build on those; track via the follow-up cleanup plan.
continue-on-error: true
run: pyright src/runtime
- name: Type check (pyright) (HARD-03)
# Phase 19 -- the gate is now fail-on-error against ``src/runtime``.
# The earlier 54-error backlog was resolved via type-annotation
# tightening + per-line ``# pyright: ignore[<rule>] -- <rationale>``
# comments for legitimate stub gaps. ``pyproject.toml`` carries
# the ``[tool.pyright]`` block (``include = ["src"]``,
# ``extraPaths = ["src"]``, ``typeCheckingMode = "basic"``).
# Test files and ``dist/`` bundles are out of scope for this
# phase; future phases may extend coverage outward.
run: uv run pyright src/runtime

- name: Test with coverage
run: pytest --cov=src/runtime --cov-report=xml --junitxml=junit.xml
# Dummy env vars satisfy the strict ``_interpolate`` check that
# config.yaml's ``${OLLAMA_API_KEY}`` / ``${OPENROUTER_API_KEY}``
# placeholders trigger when ``load_config()`` runs. Tests don't
# call live providers; values just need to exist. Live smoke
# tests are gated separately by ``OLLAMA_LIVE=1``.
env:
# Empty API keys so live-provider smoke tests gated by
# ``if not os.environ.get(KEY)`` correctly skip. The
# ``_interpolate`` strict-mode check only requires the var
# to EXIST in the environment (any value, incl. empty).
OLLAMA_API_KEY: ""
OPENROUTER_API_KEY: ""
AZURE_OPENAI_KEY: ""
AZURE_DEPLOYMENT: ""
AZURE_ENDPOINT: https://ci-dummy.example/
EXTERNAL_MCP_URL: https://ci-dummy.example/
EXT_TOKEN: ci-dummy
run: uv run pytest --cov=src/runtime --cov-report=xml --junitxml=junit.xml

- name: Skill-prompt-vs-schema lint (SKILL-LINTER-01)
# Phase 21. Walks every examples/*/skills/*/system.md and asserts
# that every referenced tool name + arg field exists in the
# canonically discovered tool inventory (AST-extracted from
# examples/*/mcp_server*.py + mcp_servers/*.py) and the typed
# patch models (UpdateIncidentPatch). Catches LLM-emit-vs-schema
# drift like `findings_triage` vs `findings.triage`, hallucinated
# injected args, and unknown tool names. Binary-pass gate.
run: uv run python scripts/lint_skill_prompts.py

- name: SonarCloud Scan
uses: SonarSource/sonarqube-scan-action@v8.0.0
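
The comments on the `Test with coverage` step describe two checks that read the same variables differently: the strict `_interpolate` check only needs each variable to exist, while the live-smoke guard skips when the value is empty. The real `_interpolate` is not shown in this diff, so the following is only a minimal sketch of that interaction. The names `interpolate_strict` and `should_skip_live_test` are hypothetical stand-ins.

```python
import re

# Matches ${VAR_NAME} placeholders such as ${OLLAMA_API_KEY}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate_strict(text: str, env: dict[str, str]) -> str:
    """Replace ${VAR} placeholders (hypothetical stand-in for _interpolate).

    Assumption: a variable set to the empty string counts as present;
    only a missing key raises.
    """
    def _substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_substitute, text)


def should_skip_live_test(key: str, env: dict[str, str]) -> bool:
    """Mirror the `if not os.environ.get(KEY)` guard: empty or missing skips."""
    return not env.get(key)
```

With `{"OLLAMA_API_KEY": ""}` the placeholder resolves to an empty string and the live test is skipped, which is the combination the workflow's `env:` block relies on.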
13 changes: 12 additions & 1 deletion .gitignore
@@ -50,10 +50,21 @@ Thumbs.db
# --- Claude tooling artifacts ----------------------------------------
AGENTS.md
ASR.md
docs/
.claude/ralph-loop.local.md
.claude/worktrees/
.plan/
# Tracked docs are explicitly listed below; everything else under docs/
# is Claude scratch (plans, brainstorm output, etc) and stays gitignored.
# - AIRGAP_INSTALL.md: Phase 14 (HARD-02) air-gap install path.
# - DEVELOPMENT.md: Phase 16 (BUNDLER-01) contributor workflow.
docs/*
!docs/AIRGAP_INSTALL.md
!docs/DEVELOPMENT.md
REVIEW_*.md
review_*.md
.planning/
# Dev integration test driver (out-of-repo tool, runs against live UI).
scripts/integration_scenarios.py

# Coverage / CI artefacts
coverage.xml
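
The hunk above replaces `docs/` with `docs/*` followed by `!` negations. Git cannot re-include a file whose parent directory is itself excluded, which is presumably why the directory pattern was changed to a glob. The sketch below is a deliberately simplified last-match-wins evaluator, not Git's matcher. It handles only plain globs and `!` negation, and `is_ignored` and `PATTERNS` are illustrative names.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Evaluate ignore patterns in order; the last matching pattern wins.

    Simplification: real gitignore semantics (anchoring, `**`,
    directory-only patterns) are richer than this glob check.
    """
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(path, glob):
            ignored = not negated
    return ignored


# The three docs-related patterns from the hunk above.
PATTERNS = ["docs/*", "!docs/AIRGAP_INSTALL.md", "!docs/DEVELOPMENT.md"]
```

Under these patterns `docs/notes.md` is ignored, while `docs/AIRGAP_INSTALL.md` and `docs/DEVELOPMENT.md` stay tracked.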
75 changes: 75 additions & 0 deletions .planning/phases/14-reproducible-air-gap-lockfile/14-01-PLAN.md
@@ -0,0 +1,75 @@
---
phase: 14-reproducible-air-gap-lockfile
plan: 01
title: Reproducible air-gap dependency lockfile (HARD-02)
status: in_progress
date: 2026-05-07
requirement: HARD-02 (CONCERNS C2)
---

# Plan 14-01 — Reproducible Air-Gap Dependency Lockfile

## One-liner

Commit a `uv.lock` that pins every transitive dependency with hashes; CI installs from the lockfile and a freshness gate fails the build when `pyproject.toml` drifts from `uv.lock`; document the offline install path so an engineer behind a corporate firewall can reproduce the dependency graph from an internal mirror without public-internet access.

## Tool Selection — `uv` (rationale)

Considered `uv`, `pip-tools`, `poetry`. Selected **`uv`** (locally installed: `uv 0.11.7`).

| Criterion (`~/.claude/rules/dependencies.md`) | `uv` | `pip-tools` | `poetry` |
| --- | --- | --- | --- |
| License | Apache-2.0 / MIT (dual) | BSD-3-Clause | MIT |
| Active maintenance / bus factor | Astral team, daily releases | jazzband collective | python-poetry org |
| Lockfile format | `uv.lock` (TOML, hashes per platform marker) | `requirements.txt` w/ `--generate-hashes` | `poetry.lock` (TOML) |
| PEP 621 (`pyproject.toml` `[project]`) native | Yes — already what we use | Reads `pyproject.toml` direct | Requires `[tool.poetry]` rewrite of `[project]` |
| Resolver speed (171 pkgs) | ~14 ms (measured) | seconds | seconds |
| Single static binary | Yes (Rust) | No (Python pkg) | No (Python pkg) |
| Works fully offline (`--offline`, `--frozen`) | Yes (first-class) | Indirect via `pip install --no-index` | Yes |
| Drift gate (`--check`) | `uv lock --check` | `pip-compile --check` (since 7.4) | `poetry check --lock` |
| Already adopted in repo | **Yes** (`uv.lock` already present, 4430 lines, 171 pkgs) | No | No |

**Decision:** `uv`. The lockfile already exists in-repo and is in sync (`uv lock --check` exits 0 in 14 ms). `poetry` is rejected because adopting it would require rewriting `[project]` into `[tool.poetry]` — a pyproject-format migration that violates "minimal diff" scope. `pip-tools` would lose the `uv.lock` work already present and forfeit the multi-platform marker pinning that `uv.lock` gives for free.

## Tasks (8)

1. **Confirm lockfile freshness against current `pyproject.toml`** — `uv lock --check` (already passes; recorded as baseline).
2. **Add `[tool.uv]` block to `pyproject.toml` if needed** — likely no-op; defaults already satisfy our needs. Verify behaviour.
3. **Rewrite CI install step in `.github/workflows/ci.yml`** — replace `pip install -e ".[dev]"` with `uv sync --frozen --extra dev`, plus `astral-sh/setup-uv@v6` for the runner.
4. **Add CI lockfile-freshness gate** — new step `uv lock --check` runs before install; fails CI when `pyproject.toml` and `uv.lock` drift.
5. **Switch CI test/lint/type-check steps to `uv run`** — `uv run pytest …`, `uv run ruff check …`, `uv run pyright …` so tools execute against the locked virtualenv.
6. **Document the offline install path** — new `docs/AIRGAP_INSTALL.md` (≤50 lines): clone, `UV_INDEX_URL=https://internal-mirror`, `uv sync --frozen --offline`, `uv run pytest tests/ -x`.
7. **Local verification (acceptance gates)**:
- `uv lock --check` → exit 0
- `python -m pytest tests/ -x` → all collected tests pass (baseline 1047)
- `ruff check src tests` → unchanged from baseline (13 pre-existing errors — NOT regressed)
- `pyright src/runtime` → unchanged from baseline (54 pre-existing errors — NOT regressed)
- `python scripts/build_single_file.py && git diff --exit-code dist/` → clean
- `git grep -nE 'https://ollama\.com|ollama\.com/api' -- src/` → zero matches (HARD-05 ratchet)
- `python -c 'import yaml; yaml.safe_load(open(".github/workflows/ci.yml"))'` → no parse error (no local yamllint installed)
8. **Single atomic commit** on `refactor/framework-flow-control` per phase precedent.
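
The acceptance gates in task 7 are all exit-code checks, so they could be driven from one small script. The plan does not describe such a runner. The sketch below is a hypothetical helper (`run_gates`, `EXAMPLE_GATES`) that runs each command and collects the names of the gates that exit non-zero.

```python
import subprocess
from typing import Sequence

Gate = tuple[str, Sequence[str]]


def run_gates(gates: Sequence[Gate]) -> list[str]:
    """Run every gate command in order and return the names that failed."""
    failed: list[str] = []
    for name, command in gates:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


# Two of the gates from task 7, expressed as argv lists (not executed here).
EXAMPLE_GATES: list[Gate] = [
    ("lockfile freshness", ["uv", "lock", "--check"]),
    ("bundle staleness", ["git", "diff", "--exit-code", "dist/"]),
]
```

Calling `run_gates(EXAMPLE_GATES)` from the repository root would return an empty list when both gates pass. It assumes `uv` and `git` are on `PATH`.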

## Files Touched

| File | Status | Why |
| --- | --- | --- |
| `pyproject.toml` | possibly add `[tool.uv]` block (else unchanged) | UV config / extras declaration |
| `uv.lock` | **already present, unchanged** | Pre-existing; freshness re-verified at commit time |
| `.github/workflows/ci.yml` | modified | Install via `uv sync --frozen`; add lockfile-freshness gate; run tools via `uv run` |
| `docs/AIRGAP_INSTALL.md` | NEW | Offline install instructions |
| `.planning/phases/14-reproducible-air-gap-lockfile/14-01-PLAN.md` | NEW | This file |
| `.planning/phases/14-reproducible-air-gap-lockfile/14-01-SUMMARY.md` | NEW | After-action |
| `.planning/phases/14-reproducible-air-gap-lockfile/14-VERIFICATION.md` | NEW | Per-success-criterion gates |

## Out of Scope (deferred)

- **Vendored wheels tarball** for true `--no-index` install — separate phase (called out in 14-CONTEXT.md `Deferred Ideas`).
- **`Makefile` / `make bootstrap`** scaffolding — ROADMAP SC-2 wording mentions `make bootstrap` "or equivalent"; the equivalent is `uv sync --frozen [--offline]`. Documented in `docs/AIRGAP_INSTALL.md`.
- **Pyright / ruff baseline cleanup** — existing pre-Phase-14 baselines preserved exactly; not a Phase 14 concern.

## Hard-Stop Triggers (HALT, write BLOCKER.md)

- `uv lock --check` reports drift after commit → root-cause and stop.
- Any test in `tests/` newly fails with the lockfile-driven install AND root cause is the lockfile.
- CI YAML edits don't validate as YAML.
- `dist/*` regen produces a non-empty `git diff` after Phase 14 changes.
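
The last trigger relies on `git diff --exit-code dist/` to detect a changed bundle. The same idea can be illustrated without Git by hashing the directory before and after regeneration. This is a sketch of the concept only, not what the CI step runs, and `snapshot` and `changed_files` are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def snapshot(directory: Path) -> dict[str, str]:
    """Map each file's path (relative to `directory`) to its SHA-256 digest."""
    return {
        str(path.relative_to(directory)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

Taking a snapshot of `dist/`, running the bundler, and taking a second snapshot gives an empty `changed_files` result when the regeneration is a no-op.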
83 changes: 83 additions & 0 deletions .planning/phases/14-reproducible-air-gap-lockfile/14-01-SUMMARY.md
@@ -0,0 +1,83 @@
---
status: completed
phase: 14-reproducible-air-gap-lockfile
plan: 01
subsystem: build / ci / dependencies
tags: [hardening, air-gap, build, ci, lockfile]
requires: [phase-13-llm-provider-hardening]
provides: [uv.lock-CI-install, uv-lock-check-freshness-gate, docs/AIRGAP_INSTALL.md]
affects: [pyproject.toml, .github/workflows/ci.yml, .gitignore, docs/AIRGAP_INSTALL.md, uv.lock]
tech-stack:
added: [uv (Apache-2.0/MIT, single static binary, Astral)]
patterns: [pin+hash transitive lockfile, --frozen install, lockfile-drift CI gate]
key-files:
created:
- docs/AIRGAP_INSTALL.md
modified:
- .github/workflows/ci.yml
- .gitignore
unchanged-but-canonical:
- pyproject.toml # already PEP 621; no [tool.uv] needed
- uv.lock # already in sync (uv lock --check exit 0)
decisions:
- "Tool: uv 0.11.7 (Apache-2.0/MIT). Picked over pip-tools (loses uv.lock investment, no per-marker pinning) and poetry (would require [project] -> [tool.poetry] rewrite, violates minimal diff)."
- "uv.lock already exists (171 packages, 4430 lines, in sync per `uv lock --check`); Phase 14 wires CI to install from it, adds the freshness gate, and documents the offline path. No new lockfile generation required."
- "CI install: `uv sync --frozen --extra dev` (replaces `pip install -e .[dev]`). `--frozen` forbids re-resolving."
- "CI lockfile-drift gate: `uv lock --check` runs as the first step after toolchain setup (before install) so a stale uv.lock fails the build before anything is installed."
- "Tools (ruff, pyright, pytest) run via `uv run` so they execute against the locked virtualenv."
- "Pinned uv version 0.11.7 in CI (matches local) — bumps are deliberate, not silent."
- "Documented offline path in `docs/AIRGAP_INSTALL.md` (38 lines): clone -> UV_INDEX_URL=internal-mirror -> `uv sync --frozen [--offline]`. Negation rule added to .gitignore so docs/AIRGAP_INSTALL.md is the single shipped doc."
- "Single atomic commit per phase precedent (Phase 9-13)."
metrics:
duration: "~15 min"
tasks-completed: 8
files-touched: 4 # (1 new, 2 modified, 1 planning .md whitelisted)
tests-added: 0 # pure infra, no new test surface
tests-total: 1044 # (1044 passed, 3 skipped — same as Phase 13)
ratchet-status: green
bundle-determinism: deterministic (`git diff --exit-code dist/` clean after regen)
gates:
uv-lock-check: "Resolved 171 packages in 2ms — exit 0"
yaml-valid: "9 steps, parses clean"
ollama-grep-src: "0 matches (HARD-05 ratchet preserved)"
ruff: "13 errors (pre-Phase-14 baseline, unchanged)"
pyright-runtime: "54 errors (pre-Phase-14 baseline, unchanged)"
pyright-full: "329 errors (pre-Phase-14 baseline, unchanged)"
dist-regen-diff: "clean (exit 0)"
pytest: "1044 passed, 3 skipped"
---

# Phase 14 Plan 01 Summary — Reproducible Air-Gap Dependency Lockfile

## One-liner

Wired the existing in-repo `uv.lock` into CI via `uv sync --frozen`, added a `uv lock --check` lockfile-freshness gate that fails the build on `pyproject.toml`/`uv.lock` drift, and documented the offline install path in `docs/AIRGAP_INSTALL.md` so an engineer behind a corporate firewall can reproduce the exact dependency graph from an internal mirror without public-internet access. Closes HARD-02 (CONCERNS C2).

## What changed

| File | Change |
| --- | --- |
| `.github/workflows/ci.yml` | Added `astral-sh/setup-uv@v6` (uv 0.11.7); added `uv lock --check` gate as the first step after toolchain setup (before install); replaced `pip install -e ".[dev]"` with `uv sync --frozen --extra dev`; rewrote `ruff` / `pyright` / `pytest` invocations as `uv run …` so they hit the locked venv. |
| `docs/AIRGAP_INSTALL.md` (new) | 38-line offline-install recipe: clone → set `UV_INDEX_URL` → `uv sync --frozen [--offline]` → `uv run pytest tests/ -x`. |
| `.gitignore` | Added `!docs/AIRGAP_INSTALL.md` negation so the air-gap install doc ships while the rest of `docs/` (Claude artefacts) stays ignored. |
| `pyproject.toml` | Unchanged — already PEP 621; uv reads `[project]` natively, no `[tool.uv]` block required. |
| `uv.lock` | Unchanged — already present, 4430 lines, 171 packages, in sync. Verified by `uv lock --check` exit 0. |

## Acceptance gates (all green)

```
uv lock --check : EXIT 0 (171 pkgs, 2 ms)
python -c 'import yaml; yaml.safe_load(open(ci.yml))' : 9 steps, parses
git grep -nE 'https://ollama\.com|ollama\.com/api' src/ : 0 matches (HARD-05 ratchet)
ruff check src tests : 13 errors (pre-existing baseline)
pyright src/runtime : 54 errors (pre-existing baseline)
pyright : 329 errors (pre-existing baseline)
python scripts/build_single_file.py && git diff dist/ : clean (exit 0)
pytest tests/ -x : 1044 passed, 3 skipped
```
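
Several gates above pass because an error count is unchanged from a pre-existing baseline rather than zero. One way to read that rule is as a ratchet: counts may stay level or fall but never rise. The sketch below encodes that reading, with the baseline numbers copied from the listing. `ratchet_violations` is a hypothetical helper, not part of the repository.

```python
# Pre-Phase-14 baselines as recorded in the acceptance gates above.
BASELINE = {"ruff": 13, "pyright-runtime": 54, "pyright-full": 329}


def ratchet_violations(
    baseline: dict[str, int], current: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Return {gate: (baseline, current)} for every count that increased.

    Assumption: a gate missing from the baseline is treated as baseline 0.
    """
    return {
        gate: (baseline.get(gate, 0), count)
        for gate, count in current.items()
        if count > baseline.get(gate, 0)
    }
```

An empty result means no gate regressed. A lower count also passes, and the baseline could then be tightened to the new value.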

## Out of scope (deferred)

- A vendored-wheels tarball (truly `--no-index` install kit) — separate phase.
- Pyright / ruff baseline cleanup — pre-existing baselines, not Phase 14 territory.
- `Makefile` `make bootstrap` shim — `uv sync --frozen [--offline]` is the documented equivalent (ROADMAP SC-2 wording allows "or equivalent").