Fuiste · Apr 13, 2026
diff --git a/.codex/skills/qa-night-shift/SKILL.md b/.codex/skills/qa-night-shift/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: qa-night-shift
-description: Use when the user wants to QA test Night Shift against a user-specified scratch repo path, install the current worktree CLI, and run an approval-gated real-provider pass to validate init/plan/start/status/report/resolve/resume behavior.
+description: Use when the user wants to QA test Night Shift against a user-specified scratch repo path, install the current worktree CLI, and run an approval-gated real-provider pass to validate init/plan/start/status/report/provenance/doctor/resolve/resume behavior.
 ---
 
 # QA Night Shift
@@ -59,7 +59,10 @@ real inference spend.
 - If it does look like an intentional testing target, proceed.
 - Even for an obvious scratch repo, do not run `night-shift plan`,
   `night-shift start`, `night-shift resume`, or other inference-consuming QA
-  steps until the user approves the presented plan.
+  steps until the user approves the presented plan. Read-only checks such as
+  `night-shift status`, `night-shift report`, `night-shift provenance`,
+  `night-shift doctor`, or `night-shift resume --explain` are acceptable once
+  the user-approved QA pass reaches the relevant state.
 
 Do not quietly assume a normal product repo is safe to use for QA.
 
@@ -123,7 +126,10 @@ Typical flow:
 5. inspect `night-shift status`
 6. run `night-shift start`
 7. inspect `night-shift report`
-8. use `night-shift resolve` or `night-shift resume` only if the run actually
+8. inspect `night-shift provenance`
+9. use `night-shift doctor` or `night-shift resume --explain` before any real
+   resume attempt when the run was interrupted
+10. use `night-shift resolve` or `night-shift resume` only if the run actually
    requires it
 
 For review-driven investigations, replace steps 3-4 with:
@@ -167,6 +173,13 @@ In review-driven runs, pay attention to repo-state evidence:
   manual attention
 - whether `status` and `report` show payload-repair attempts, successes, and
   failures with usable artifact paths
+- whether `status`, `report`, and the dashboard agree on the confidence posture
+  and its reasons
+- whether `provenance` records the expected prompt paths, payload artifacts,
+  verification evidence, worktree paths, and PR linkage
+- whether `doctor` classifies interrupted tasks as `safe_to_resume`,
+  `resume_with_warning`, `manual_attention`, or `irrecoverable` for the actual
+  saved repo state
 
 Use small tasks that validate the requested behavior instead of inviting large
 feature work.
@@ -180,6 +193,7 @@ Collect evidence from:
 - relevant CLI output
 - the current report path printed by Night Shift
 - run journal paths under `.night-shift/runs/`
+- the `provenance.json` path and any task-specific artifact paths it surfaces
 - relevant logs for the failing or surprising step
 - PR or delivery results when they happen
 - any verification output tied to the run

diff --git a/README.md b/README.md
@@ -31,11 +31,14 @@ night-shift plan --notes notes/today.md
 night-shift start
 night-shift status
 night-shift report
+night-shift provenance
 ```
 
 Supporting commands round out the lifecycle:
 
 - `resolve` records answers for blocked planning decisions and replans the run
+- `doctor` explains whether a saved run is safe to resume and why
+- `provenance` renders a per-run evidence ledger from saved artifacts
 - `resume` recovers an interrupted run from saved state
 - `plan --from-reviews` turns open Night Shift PR feedback into a fresh
   successor stack
@@ -105,6 +108,7 @@ Inspect progress and outputs:
 ```sh
 night-shift status
 night-shift report
+night-shift provenance
 ```
 
 If planning blocked on manual decisions:
@@ -117,6 +121,8 @@ night-shift start
 If Night Shift was interrupted mid-run:
 
 ```sh
+night-shift doctor
+night-shift resume --explain
 night-shift resume
 ```
 

diff --git a/docs/README.md b/docs/README.md
@@ -14,7 +14,8 @@ If you are new to the project, start here:
 - [Getting Started](getting-started.md) for install, prerequisites, and the
   first runnable flow
 - [Run Lifecycle](run-lifecycle.md) for how `plan`, `start`, `resolve`,
-  `resume`, `plan --from-reviews`, and `reset` fit together
+  `resume`, `doctor`, `provenance`, `plan --from-reviews`, and `reset` fit
+  together
 - [Configuration](configuration.md) for `config.toml` profiles and override
   precedence
 - [Worktree Environments](worktree-environments.md) for
@@ -40,11 +41,14 @@ night-shift plan --notes notes/today.md
 night-shift start
 night-shift status
 night-shift report
+night-shift provenance
 ```
 
 Supporting flows handle the messier parts of reality:
 
 - `resolve` records answers for manual-attention tasks and replans in place
+- `doctor` explains whether an interrupted run looks safe to resume
+- `provenance` prints the run's evidence ledger
 - `resume` reattaches to an interrupted run
 - `plan --from-reviews` turns open Night Shift PR feedback into a fresh successor stack
 - `reset` removes Night Shift state and tracked task worktrees, but does not touch local branches or remote PRs

diff --git a/docs/getting-started.md b/docs/getting-started.md
@@ -121,11 +121,12 @@ Use these commands while a run is active or after it finishes:
 ```sh
 night-shift status
 night-shift report
+night-shift provenance
 ```
 
-`status` prints the current run state, planning and execution agent summaries,
-notes source, event count, and report location. `report` prints the current
-markdown report directly.
+`status` prints the current run state, confidence posture, provenance path,
+and report location. `report` prints the current markdown report directly, and
+`provenance` prints the run's evidence ledger from the saved artifact graph.
 
 ## Supporting Flows
 
@@ -140,10 +141,16 @@ night-shift start
 If a run was interrupted, resume from the saved journal:
 
 ```sh
+night-shift doctor
+night-shift resume --explain
 night-shift resume
 night-shift resume --ui
 ```
 
+`doctor` is the dry recovery pass. It classifies each task as
+`safe_to_resume`, `resume_with_warning`, `manual_attention`, or
+`irrecoverable` before you mutate any run state.
+
 If open Night Shift pull requests received feedback and you want a fresh
 replacement stack instead of in-place edits:
 

diff --git a/docs/index.md b/docs/index.md
@@ -33,9 +33,10 @@ night-shift report
 ```
 
 Use `resolve` when planning needs human decisions, `resume` when a run was
-interrupted, `plan --from-reviews` when open Night Shift PRs need a fresh
-successor stack, and `reset` when you need to eject the repo-local control
-plane and start over.
+interrupted, `doctor` or `resume --explain` when you want a dry recovery read,
+`plan --from-reviews` when open Night Shift PRs need a fresh successor stack,
+and `reset` when you need to eject the repo-local control plane and start
+over.
 
 ## Repository
 

diff --git a/docs/run-lifecycle.md b/docs/run-lifecycle.md
@@ -58,6 +58,8 @@ and the next action becomes `night-shift start`.
 `resume` is the recovery path for an interrupted run:
 
 ```sh
+night-shift doctor
+night-shift resume --explain
 night-shift resume
 night-shift resume --run run-123 --ui
 ```
@@ -66,6 +68,11 @@ Night Shift reloads the saved run, validates the saved environment, recovers
 in-flight tasks, and continues orchestration. It does not re-resolve provider
 or environment settings; it reuses what the run journal already saved.
 
+`doctor` and `resume --explain` are the read-only recovery surfaces. They
+inspect the saved run, active lock, worktrees, logs, review drift, and
+interrupted task states, then classify each task as `safe_to_resume`,
+`resume_with_warning`, `manual_attention`, or `irrecoverable`.
+
 ## Review-Driven Replanning
 
 Review feedback re-enters Night Shift through `plan --from-reviews`:
@@ -99,6 +106,21 @@ it recomputes drift against the current PR tree when the run has a stored
 review snapshot, while the on-disk `report.md` remains the stable persisted
 artifact for the run.
 
+## Provenance
+
+`provenance` is the operator-facing evidence ledger for a run:
+
+```sh
+night-shift provenance
+night-shift provenance --run run-123 --format json
+night-shift provenance --task task-1
+```
+
+Night Shift persists `./.night-shift/runs/<run-id>/provenance.json` alongside
+`report.md`. The command normalizes the run journal, prompt artifacts, logs,
+payload-repair traces, verification artifacts, worktree paths, and confidence
+posture into one inspectable view.
+
 ## Reset
 
 `reset` is the eject handle when the repo-local control plane has to go:
@@ -141,6 +163,7 @@ Night Shift binds to `127.0.0.1`, prefers port `8787`, and serves:
 - run history for the current repository
 - run summary metadata
 - repo-state summary for review-driven runs, including open PR counts and drift
+- confidence posture and provenance path
 - task status
 - event timeline
 - report content
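
Reviewer sketch for the read-only recovery flow documented above: a small helper that turns `doctor` output into a go/no-go signal before a real `resume`. The function name `doctor_verdict` is hypothetical, and it assumes the four classification labels appear verbatim somewhere in the rendered output. The diff shows the labels but not the output format, so treat this as a sketch under that assumption.

```sh
# Hypothetical helper: read `night-shift doctor` output on stdin and
# report the worst classification found.
# Assumption: the labels appear verbatim in the rendered text.
doctor_verdict() {
  output="$(cat)"
  case "$output" in
    # Worst cases first, so one bad task outweighs any healthy ones.
    *irrecoverable*|*manual_attention*)
      echo "blocked"
      return 1
      ;;
    *resume_with_warning*)
      echo "warn"
      return 0
      ;;
    *safe_to_resume*)
      echo "ok"
      return 0
      ;;
    *)
      # No known label (for example an error message): do not guess.
      echo "unknown"
      return 2
      ;;
  esac
}
```

One way to use it would be `night-shift doctor | doctor_verdict && night-shift resume`. Note that a `warn` verdict still exits zero here, so a stricter gate would need to compare the printed word instead of the exit status.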

diff --git a/docs/state-and-artifacts.md b/docs/state-and-artifacts.md
@@ -33,6 +33,7 @@ Each run directory contains durable state for one run:
 - `state.json`
 - `events.jsonl`
 - `report.md`
+- `provenance.json`
 - `logs/`
 - `worktrees/`
 
@@ -53,6 +54,11 @@ The run record itself stores:
 - task list and task states
 - timestamps and current run status
 
+`provenance.json` is the normalized evidence ledger for the run. It reuses the
+saved run state plus artifact paths under `logs/` to record planning
+provenance, prompt and payload traces, verification evidence, touched files,
+worktree paths, PR linkage, and confidence posture.
+
 ## Planning Artifacts
 
 Planning writes artifacts under `./.night-shift/planning/<timestamp>/`. Those
@@ -88,6 +94,10 @@ review-driven runs: it refreshes repo-state drift against the current open PR
 tree when a stored snapshot exists, so its live output is authoritative for
 current drift while `report.md` remains durable and offline-readable.
 
+Likewise, the persisted `provenance.json` is the stable audit artifact for the
+run, while `night-shift provenance` can render the same evidence in markdown or
+refresh live review drift in JSON output.
+
 Task-level provider logs and prompt files live under each run's `logs/`
 directory.
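
Reviewer sketch for the persisted ledger described above: locating `provenance.json` for a known run id without invoking the CLI. The helper name `provenance_path` is hypothetical. The path layout follows the `./.night-shift/runs/<run-id>/provenance.json` pattern stated in `docs/run-lifecycle.md`. Resolving the `latest` selector is not attempted, since this diff does not show how the CLI resolves it.

```sh
# Hypothetical helper: print the persisted provenance ledger path for a
# run, or fail if the file is not there.
# Arguments: <repo-root> <run-id>
provenance_path() {
  repo_root="$1"
  run_id="$2"
  path="$repo_root/.night-shift/runs/$run_id/provenance.json"
  if [ -f "$path" ]; then
    printf '%s\n' "$path"
  else
    printf 'no provenance ledger at %s\n' "$path" >&2
    return 1
  fi
}
```

This may be handy when collecting QA evidence, for example `provenance_path . run-123` to confirm the ledger exists before attaching it to a report.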
 

diff --git a/src/night_shift/app.gleam b/src/night_shift/app.gleam
@@ -25,8 +25,10 @@ import night_shift/repo_state_runtime
 import night_shift/report
 import night_shift/system
 import night_shift/types
+import night_shift/usecase/doctor as doctor_usecase
 import night_shift/usecase/init as init_usecase
 import night_shift/usecase/plan as plan_usecase
+import night_shift/usecase/provenance as provenance_usecase
 import night_shift/usecase/render as usecase_render
 import night_shift/usecase/reset as reset_usecase
 import night_shift/usecase/resolve as resolve_usecase
@@ -131,9 +133,13 @@ fn run_initialized_command(
     types.Start(run, True) -> start_with_ui(repo_root, run, config)
     types.Status(run) -> io.println(status(repo_root, run, config))
     types.Report(run) -> io.println(report(repo_root, run, config))
+    types.Provenance(run, task_id, format) ->
+      io.println(provenance(repo_root, run, task_id, format, config))
+    types.Doctor(run) -> io.println(doctor(repo_root, run, config))
     types.Resolve(run) -> io.println(resolve(repo_root, run, config))
-    types.Resume(run, False) -> io.println(resume(repo_root, run, config))
-    types.Resume(run, True) -> resume_with_ui(repo_root, run, config)
+    types.Resume(run, False, False) -> io.println(resume(repo_root, run, config))
+    types.Resume(run, True, False) -> resume_with_ui(repo_root, run, config)
+    types.Resume(run, False, True) -> io.println(doctor(repo_root, run, config))
     _ -> io.println("Unsupported command.")
   }
 }
@@ -276,6 +282,30 @@ fn resume(
   }
 }
 
+fn doctor(
+  repo_root: String,
+  run: types.RunSelector,
+  config: types.Config,
+) -> String {
+  case doctor_usecase.execute(repo_root, run, config) {
+    Ok(rendered) -> rendered
+    Error(message) -> message
+  }
+}
+
+fn provenance(
+  repo_root: String,
+  run: types.RunSelector,
+  task_id: Option(String),
+  format: types.ProvenanceFormat,
+  config: types.Config,
+) -> String {
+  case provenance_usecase.execute(repo_root, run, task_id, format, config) {
+    Ok(rendered) -> rendered
+    Error(message) -> message
+  }
+}
+
 fn stringify_notifiers(notifiers: List(types.NotifierName)) -> String {
   notifiers
   |> list.map(types.notifier_to_string)

diff --git a/src/night_shift/cli.gleam b/src/night_shift/cli.gleam
@@ -18,8 +18,10 @@ pub fn usage() -> String {
   <> "  start [--run <id>|latest] [--ui]\n"
   <> "  status [--run <id>|latest]\n"
   <> "  report [--run <id>|latest]\n"
+  <> "  provenance [--run <id>|latest] [--task <task-id>] [--format <json|md>]\n"
+  <> "  doctor [--run <id>|latest]\n"
   <> "  resolve [--run <id>|latest]\n"
-  <> "  resume [--run <id>|latest] [--ui]\n"
+  <> "  resume [--run <id>|latest] [--ui|--explain]\n"
 }
 
 /// Parse raw command-line arguments into a `Command`.
@@ -48,6 +50,8 @@ pub fn parse(args: List(String)) -> Result(types.Command, String) {
         ["start", ..rest] -> parse_start(rest)
         ["status", ..rest] -> parse_run_lookup(rest, types.Status)
         ["report", ..rest] -> parse_run_lookup(rest, types.Report)
+        ["provenance", ..rest] -> parse_provenance(rest)
+        ["doctor", ..rest] -> parse_run_lookup(rest, types.Doctor)
         ["resolve", ..rest] -> parse_run_lookup(rest, types.Resolve)
         ["resume", ..rest] -> parse_resume(rest)
         ["review", ..] ->
@@ -256,21 +260,56 @@ fn parse_start_flags(
 }
 
 fn parse_resume(args: List(String)) -> Result(types.Command, String) {
-  parse_resume_flags(args, types.LatestRun, False)
+  parse_resume_flags(args, types.LatestRun, False, False)
 }
 
 fn parse_resume_flags(
   args: List(String),
   run: types.RunSelector,
   ui_enabled: Bool,
+  explain_only: Bool,
 ) -> Result(types.Command, String) {
   case args {
-    [] -> Ok(types.Resume(run, ui_enabled))
+    [] ->
+      case ui_enabled && explain_only {
+        True -> Error("`resume --explain` cannot be combined with `--ui`.")
+        False -> Ok(types.Resume(run, ui_enabled, explain_only))
+      }
+    ["--run", "latest", ..rest] ->
+      parse_resume_flags(rest, types.LatestRun, ui_enabled, explain_only)
+    ["--run", run_id, ..rest] ->
+      parse_resume_flags(rest, types.RunId(run_id), ui_enabled, explain_only)
+    ["--ui", ..rest] -> parse_resume_flags(rest, run, True, explain_only)
+    ["--explain", ..rest] ->
+      parse_resume_flags(rest, run, ui_enabled, True)
+    [flag, ..] -> Error("Unsupported flag: " <> flag)
+  }
+}
+
+fn parse_provenance(args: List(String)) -> Result(types.Command, String) {
+  parse_provenance_flags(args, types.LatestRun, None, types.ProvenanceMarkdown)
+}
+
+fn parse_provenance_flags(
+  args: List(String),
+  run: types.RunSelector,
+  task_id: Option(String),
+  format: types.ProvenanceFormat,
+) -> Result(types.Command, String) {
+  case args {
+    [] -> Ok(types.Provenance(run, task_id, format))
     ["--run", "latest", ..rest] ->
-      parse_resume_flags(rest, types.LatestRun, ui_enabled)
+      parse_provenance_flags(rest, types.LatestRun, task_id, format)
     ["--run", run_id, ..rest] ->
-      parse_resume_flags(rest, types.RunId(run_id), ui_enabled)
-    ["--ui", ..rest] -> parse_resume_flags(rest, run, True)
+      parse_provenance_flags(rest, types.RunId(run_id), task_id, format)
+    ["--task", next_task_id, ..rest] ->
+      parse_provenance_flags(rest, run, Some(next_task_id), format)
+    ["--format", "json", ..rest] ->
+      parse_provenance_flags(rest, run, task_id, types.ProvenanceJson)
+    ["--format", "md", ..rest] ->
+      parse_provenance_flags(rest, run, task_id, types.ProvenanceMarkdown)
+    ["--format", raw_format, ..] ->
+      Error("Unsupported provenance format: " <> raw_format)
     [flag, ..] -> Error("Unsupported flag: " <> flag)
   }
 }
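
Reviewer sketch of the flag rules that `parse_resume_flags` enforces, written in shell so it can be tried without building the CLI. The function name `check_resume_flags` is hypothetical and this is only a mirror of the Gleam logic as it reads in the diff: flags may appear in any order, `--run` consumes the next argument, and the `--ui`/`--explain` conflict is only checked once all arguments are consumed.

```sh
# Hypothetical mirror of the resume flag rules shown in the diff.
# Returns 0 when the combination would be accepted, 1 otherwise.
check_resume_flags() {
  ui=0
  explain=0
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --run)
        # A bare `--run` with no value matches no `--run` pattern in the
        # Gleam parser and falls through to the unsupported-flag branch.
        if [ "$#" -lt 2 ]; then
          echo "Unsupported flag: --run" >&2
          return 1
        fi
        shift 2
        ;;
      --ui)
        ui=1
        shift
        ;;
      --explain)
        explain=1
        shift
        ;;
      *)
        echo "Unsupported flag: $1" >&2
        return 1
        ;;
    esac
  done
  # The conflict check runs last, so flag order does not matter.
  if [ "$ui" -eq 1 ] && [ "$explain" -eq 1 ]; then
    echo '`resume --explain` cannot be combined with `--ui`.' >&2
    return 1
  fi
}
```

Under these rules `check_resume_flags --run run-123 --explain` succeeds, while `check_resume_flags --explain --ui` fails regardless of ordering. This matches the dispatcher in `app.gleam`, which has no `types.Resume(run, True, True)` branch of its own.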