49 changes: 46 additions & 3 deletions .codex/skills/qa-night-shift/SKILL.md
@@ -1,6 +1,6 @@
---
name: qa-night-shift
-description: Use when the user wants to QA test Night Shift against a user-specified scratch repo path, install the current worktree CLI, and run an approval-gated real-provider pass to validate init/plan/start/status/report/resolve/resume behavior.
+description: Use when the user wants to QA test Night Shift against a user-specified scratch repo path, install the current worktree CLI, and run an approval-gated real-provider pass to validate init/plan/start/status/report/provenance/doctor/resolve/resume behavior.
---

# QA Night Shift
@@ -59,7 +59,10 @@ real inference spend.
- If it does look like an intentional testing target, proceed.
- Even for an obvious scratch repo, do not run `night-shift plan`,
`night-shift start`, `night-shift resume`, or other inference-consuming QA
-steps until the user approves the presented plan.
+steps until the user approves the presented plan. Read-only checks such as
+`night-shift status`, `night-shift report`, `night-shift provenance`,
+`night-shift doctor`, or `night-shift resume --explain` are acceptable once
+the user-approved QA pass reaches the relevant state.

Do not quietly assume a normal product repo is safe to use for QA.

@@ -123,7 +126,10 @@ Typical flow:
5. inspect `night-shift status`
6. run `night-shift start`
7. inspect `night-shift report`
-8. use `night-shift resolve` or `night-shift resume` only if the run actually
+8. inspect `night-shift provenance`
+9. use `night-shift doctor` or `night-shift resume --explain` before any real
+resume attempt when the run was interrupted
+10. use `night-shift resolve` or `night-shift resume` only if the run actually
requires it

For review-driven investigations, replace steps 3-4 with:
@@ -167,6 +173,42 @@ In review-driven runs, pay attention to repo-state evidence:
manual attention
- whether `status` and `report` show payload-repair attempts, successes, and
failures with usable artifact paths
- whether `status`, `report`, and the dashboard agree on the confidence posture
and its reasons
- whether `provenance` records the expected prompt paths, payload artifacts,
verification evidence, worktree paths, and PR linkage
- whether `doctor` classifies interrupted tasks as `safe_to_resume`,
`resume_with_warning`, `manual_attention`, or `irrecoverable` for the actual
saved repo state

In delivery-focused investigations, also validate reviewer handoff behavior
when the repo config uses `[handoff]`:

- whether the delivered PR body includes or omits the Night Shift-owned
handoff overlay according to `pr_body_mode`
- whether Night Shift preserves manual PR text outside its marked body region
across later updates
- whether configured snippet files are spliced into the PR body or managed
comment in the expected order
- whether unreadable snippet paths degrade to `pr_handoff_warning` evidence
instead of blocking PR delivery
- whether managed comments stay disabled by default and only appear when
`[handoff].managed_comment = true`
- whether the managed comment is updated in place instead of adding new comment
noise on each delivery
- whether handoff provenance labels clearly separate deterministic Night
Shift-owned evidence from provider-authored summary text

In execution-focused investigations, also validate runtime identity behavior:

- whether prepared tasks get runtime identity evidence in `status`, `report`,
and `.night-shift/runs/<run-id>/runtime/<task-id>/`
- whether `night-shift.env`, `night-shift.runtime.json`, and
`night-shift.handoff.md` exist under the run runtime directory instead of
inside the git worktree
- whether setup or maintenance commands can consume injected
`NIGHT_SHIFT_COMPOSE_PROJECT`, `NIGHT_SHIFT_PORT_BASE`, and
`NIGHT_SHIFT_RUNTIME_MANIFEST` values without extra operator wiring

Use small tasks that validate the requested behavior instead of inviting large
feature work.
@@ -180,6 +222,7 @@ Collect evidence from:
- relevant CLI output
- the current report path printed by Night Shift
- run journal paths under `.night-shift/runs/`
- the `provenance.json` path and any task-specific artifact paths it surfaces
- relevant logs for the failing or surprising step
- PR or delivery results when they happen
- any verification output tied to the run
6 changes: 6 additions & 0 deletions README.md
@@ -31,11 +31,14 @@ night-shift plan --notes notes/today.md
night-shift start
night-shift status
night-shift report
night-shift provenance
```

Supporting commands round out the lifecycle:

- `resolve` records answers for blocked planning decisions and replans the run
- `doctor` explains whether a saved run is safe to resume and why
- `provenance` renders a per-run evidence ledger from saved artifacts
- `resume` recovers an interrupted run from saved state
- `plan --from-reviews` turns open Night Shift PR feedback into a fresh
successor stack
@@ -105,6 +108,7 @@ Inspect progress and outputs:
```sh
night-shift status
night-shift report
night-shift provenance
```

If planning blocked on manual decisions:
@@ -117,6 +121,8 @@ night-shift start
If Night Shift was interrupted mid-run:

```sh
night-shift doctor
night-shift resume --explain
night-shift resume
```

6 changes: 5 additions & 1 deletion docs/README.md
@@ -14,7 +14,8 @@ If you are new to the project, start here:
- [Getting Started](getting-started.md) for install, prerequisites, and the
first runnable flow
- [Run Lifecycle](run-lifecycle.md) for how `plan`, `start`, `resolve`,
-`resume`, `plan --from-reviews`, and `reset` fit together
+`resume`, `doctor`, `provenance`, `plan --from-reviews`, and `reset` fit
+together
- [Configuration](configuration.md) for `config.toml` profiles and override
precedence
- [Worktree Environments](worktree-environments.md) for
@@ -40,11 +41,14 @@ night-shift plan --notes notes/today.md
night-shift start
night-shift status
night-shift report
night-shift provenance
```

Supporting flows handle the messier parts of reality:

- `resolve` records answers for manual-attention tasks and replans in place
- `doctor` explains whether an interrupted run looks safe to resume
- `provenance` prints the run's evidence ledger
- `resume` reattaches to an interrupted run
- `plan --from-reviews` turns open Night Shift PR feedback into a fresh successor stack
- `reset` removes Night Shift state and tracked task worktrees, but does not touch local branches or remote PRs
41 changes: 40 additions & 1 deletion docs/configuration.md
@@ -1,6 +1,6 @@
---
title: Configuration
-description: Configure profiles, phase defaults, verification commands, and provider overrides.
+description: Configure profiles, phase defaults, verification commands, handoff behavior, and provider overrides.
permalink: /configuration/
---

@@ -50,6 +50,13 @@ mode = "ask"

[verification]
commands = ["gleam test"]

[handoff]
enabled = true
pr_body_mode = "append"
managed_comment = false
provenance = "structured"
pr_body_prefix_path = ".night-shift/pr-handoff-prefix.md"
```

If `config.toml` is empty, Night Shift still works. The built-in default
@@ -117,6 +124,38 @@ These top-level settings shape how Night Shift delivers completed work:
- `notifiers`: currently `console` and `report_file`
- `[verification].commands`: commands to run locally before PR delivery

## Handoff Settings

`[handoff]` controls the optional reviewer-facing metadata that Night Shift can
overlay onto delivered pull requests.

Supported fields:

- `enabled`: master switch for Night Shift handoff output
- `pr_body_mode`: `off`, `append`, or `prepend`
- `managed_comment`: whether Night Shift owns and updates one incremental PR
comment with "Since Last Review" deltas
- `provenance`: `minimal`, `light`, or `structured`
- `include_files_touched`
- `include_acceptance`
- `include_stack_context`
- `include_verification_summary`
- `pr_body_prefix_path`, `pr_body_suffix_path`
- `comment_prefix_path`, `comment_suffix_path`

When `[handoff]` is absent, Night Shift uses the conservative default:

- handoff enabled
- PR body overlay appended
- managed comment disabled
- structured provenance
- files touched, stack context, and verification summary included

Snippet paths are repo-relative markdown fragments. Night Shift splices them
around its generated handoff sections; they augment the structured layout and
do not replace it. If a configured snippet path cannot be read, Night Shift
falls back to generated content and records a warning event.
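
The splice-and-fallback rule above can be pictured with a small Python sketch. It is illustrative only and not Night Shift's implementation: the `read_snippet` and `splice_snippets` helpers, their return shape, and the warning strings are hypothetical. Only the repo-relative snippet paths and the "fall back to generated content and record a warning" behavior come from the text.

```python
from pathlib import Path
from typing import Optional


def read_snippet(repo_root: Path, rel_path: Optional[str], warnings: list) -> str:
    """Return snippet text, or "" plus a recorded warning if it cannot be read."""
    if not rel_path:
        return ""
    try:
        return (repo_root / rel_path).read_text(encoding="utf-8").strip()
    except (OSError, UnicodeDecodeError) as err:
        # Hypothetical warning shape: an unreadable snippet is noted, not fatal.
        warnings.append(f"unreadable snippet {rel_path}: {type(err).__name__}")
        return ""


def splice_snippets(repo_root, generated, prefix_path=None, suffix_path=None):
    """Wrap the generated handoff text with optional prefix and suffix snippets."""
    root = Path(repo_root)
    warnings = []
    parts = [
        read_snippet(root, prefix_path, warnings),
        generated.strip(),
        read_snippet(root, suffix_path, warnings),
    ]
    # The generated section is always kept; snippets only augment it.
    body = "\n\n".join(part for part in parts if part)
    return body, warnings
```

With a readable prefix and a missing suffix, the sketch returns the prefix followed by the generated text and one warning entry, which mirrors the "augment, do not replace" wording.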

Example configs live in:

- `examples/config-single-profile.toml`
21 changes: 19 additions & 2 deletions docs/getting-started.md
@@ -114,18 +114,29 @@ Before it starts, Night Shift checks that the source repository is clean apart
from changes inside `./.night-shift/`. That guard exists so worktree execution
and delivery stay aligned with the source checkout.

When a task worktree is prepared, Night Shift also generates deterministic
runtime artifacts under the run directory and injects stable `NIGHT_SHIFT_*`
variables into setup, maintenance, provider execution, and verification. The
zero-config defaults are usually enough; add `runtime.named_ports` in
`worktree-setup.toml` only when you want friendly aliases like
`NIGHT_SHIFT_PORT_WEB`.
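
As a rough illustration of consuming the injected values, a setup or maintenance helper written in Python might read them as below. The variable names `NIGHT_SHIFT_COMPOSE_PROJECT`, `NIGHT_SHIFT_PORT_BASE`, and `NIGHT_SHIFT_RUNTIME_MANIFEST` are the ones named in the QA skill changes in this pull request; the `read_runtime_identity` helper, the returned dictionary shape, and the assumption that the port base is a decimal integer are hypothetical.

```python
import os
from typing import Mapping, Optional


def read_runtime_identity(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the injected NIGHT_SHIFT_* values, tolerating their absence."""
    env = os.environ if env is None else env
    port_base = env.get("NIGHT_SHIFT_PORT_BASE")
    return {
        "compose_project": env.get("NIGHT_SHIFT_COMPOSE_PROJECT"),
        # Assumption: the port base is a decimal integer string.
        "port_base": int(port_base) if port_base else None,
        "runtime_manifest": env.get("NIGHT_SHIFT_RUNTIME_MANIFEST"),
    }
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to exercise outside a prepared worktree.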

## Inspect Results

Use these commands while a run is active or after it finishes:

```sh
night-shift status
night-shift report
night-shift provenance
```

`status` prints the current run state, planning and execution agent summaries,
-notes source, event count, and report location. `report` prints the current
-markdown report directly.
+confidence posture, provenance path, notes source, event count, runtime
+identity counts, and report location. `report` prints the current markdown
+report directly, including per-task runtime manifest and handoff paths once
+worktrees have been prepared. `provenance` prints the run's evidence ledger
+from the saved artifact graph.

## Supporting Flows

@@ -140,10 +151,16 @@ night-shift start
If a run was interrupted, resume from the saved journal:

```sh
night-shift doctor
night-shift resume --explain
night-shift resume
night-shift resume --ui
```

`doctor` is the dry recovery pass. It classifies each task as
`safe_to_resume`, `resume_with_warning`, `manual_attention`, or
`irrecoverable` before you mutate any run state.

If open Night Shift pull requests received feedback and you want a fresh
replacement stack instead of in-place edits:

7 changes: 4 additions & 3 deletions docs/index.md
@@ -33,9 +33,10 @@ night-shift report
```

Use `resolve` when planning needs human decisions, `resume` when a run was
-interrupted, `plan --from-reviews` when open Night Shift PRs need a fresh
-successor stack, and `reset` when you need to eject the repo-local control
-plane and start over.
+interrupted, `doctor` or `resume --explain` when you want a dry recovery read,
+`plan --from-reviews` when open Night Shift PRs need a fresh successor stack,
+and `reset` when you need to eject the repo-local control plane and start
+over.

## Repository

27 changes: 26 additions & 1 deletion docs/providers-and-delivery.md
@@ -64,6 +64,8 @@ Night Shift's current delivery model is:
- each completed task is delivered as a pull request
- dependent tasks may be delivered as stacked pull requests
- verification runs locally before PR creation
- Night Shift can overlay a configurable reviewer handoff block onto the PR
body, with repo-local markdown snippets before or after the generated block
- the local markdown report is updated throughout the run
- `night-shift report` is the live audit view for review-driven runs and can
show current drift against the saved open-PR snapshot
@@ -82,7 +84,30 @@ Night Shift's current delivery model is:
worktree before falling back to manual attention

Delivery behavior is shaped by `base_branch`, `branch_prefix`,
-`pr_title_prefix`, and `[verification].commands` in `config.toml`.
+`pr_title_prefix`, `[verification].commands`, and `[handoff]` in
+`config.toml`.

## Reviewer Handoff

When handoff output is enabled, Night Shift can add a structured PR-body region
covering:

- context for why the PR exists
- scope such as `files_touched`, acceptance cues, and stack/supersession
metadata when configured
- model-authored summary text and known risks
- deterministic evidence such as verification output
- provenance labels that distinguish Night Shift-owned facts from inferred
provider-authored text

Night Shift encloses its PR-body overlay in stable markers and only rewrites
that marked region on later updates, so manual text outside the markers can
survive future delivery passes.

If `[handoff].managed_comment = true`, Night Shift also owns one PR comment for
incremental review deltas such as "Since Last Review", review-driven context,
and replacement-stack status. Repositories with stricter comment etiquette can
leave that disabled and still use the PR-body overlay.
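
The marker-bounded rewrite described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Night Shift's code: the marker strings, the `upsert_handoff` helper, and the choice to leave the body untouched when the mode is `off` are all hypothetical. The `off`, `append`, and `prepend` mode names come from the `pr_body_mode` setting.

```python
START = "<!-- night-shift:handoff:start -->"  # hypothetical marker text
END = "<!-- night-shift:handoff:end -->"      # hypothetical marker text


def upsert_handoff(body: str, generated: str, mode: str = "append") -> str:
    """Rewrite only the marked region, leaving manual text outside it untouched."""
    if mode == "off":
        # Assumption: "off" leaves whatever is already in the body alone.
        return body
    block = f"{START}\n{generated.strip()}\n{END}"
    start = body.find(START)
    end = body.find(END, start + len(START)) if start != -1 else -1
    if start != -1 and end != -1:
        # A marked region already exists: replace it in place.
        return body[:start] + block + body[end + len(END):]
    if not body.strip():
        return block
    if mode == "prepend":
        return f"{block}\n\n{body}"
    return f"{body}\n\n{block}"
```

Because later passes only touch the text between the two markers, anything a reviewer writes before or after the region survives repeated delivery updates.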

## Dashboard

23 changes: 23 additions & 0 deletions docs/run-lifecycle.md
@@ -58,6 +58,8 @@ and the next action becomes `night-shift start`.
`resume` is the recovery path for an interrupted run:

```sh
night-shift doctor
night-shift resume --explain
night-shift resume
night-shift resume --run run-123 --ui
```
@@ -66,6 +68,11 @@ Night Shift reloads the saved run, validates the saved environment, recovers
in-flight tasks, and continues orchestration. It does not re-resolve provider
or environment settings; it reuses what the run journal already saved.

`doctor` and `resume --explain` are the read-only recovery surfaces. They
inspect the saved run, active lock, worktrees, logs, review drift, and
interrupted task states, then classify each task as `safe_to_resume`,
`resume_with_warning`, `manual_attention`, or `irrecoverable`.
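
One way an operator script could act on those four classifications is sketched below in Python. The classification names come from the text; everything else is an assumption made for illustration, including the severity ordering, the task-id to classification mapping used as input, and the `should_resume` policy. The output format of `doctor` itself is not described here, so the sketch starts from an already-parsed mapping.

```python
# Assumed ordering, from least to most severe.
SEVERITY = [
    "safe_to_resume",
    "resume_with_warning",
    "manual_attention",
    "irrecoverable",
]


def worst_classification(task_classes: dict) -> str:
    """Return the most severe classification across all tasks."""
    if not task_classes:
        # Assumption: no interrupted tasks means nothing blocks a resume.
        return "safe_to_resume"
    # SEVERITY.index raises ValueError on an unknown label, which is intended.
    return max(task_classes.values(), key=SEVERITY.index)


def should_resume(task_classes: dict, allow_warnings: bool = False) -> bool:
    """Hypothetical policy: resume only when every task is safe enough."""
    worst = worst_classification(task_classes)
    if worst == "safe_to_resume":
        return True
    return allow_warnings and worst == "resume_with_warning"
```

Keying the decision on the worst task keeps the policy conservative: a single `manual_attention` or `irrecoverable` task stops an automated resume.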

## Review-Driven Replanning

Review feedback re-enters Night Shift through `plan --from-reviews`:
@@ -99,6 +106,21 @@ it recomputes drift against the current PR tree when the run has a stored
review snapshot, while the on-disk `report.md` remains the stable persisted
artifact for the run.

## Provenance

`provenance` is the operator-facing evidence ledger for a run:

```sh
night-shift provenance
night-shift provenance --run run-123 --format json
night-shift provenance --task task-1
```

Night Shift persists `./.night-shift/runs/<run-id>/provenance.json` alongside
`report.md`. The command normalizes the run journal, prompt artifacts, logs,
payload-repair traces, verification artifacts, worktree paths, and confidence
posture into one inspectable view.
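
For tooling that wants to read the persisted file directly, a minimal Python sketch follows. The path layout matches the one stated above; the helper names are hypothetical, and the schema of `provenance.json` is not documented in this section, so the sketch only assumes a JSON object at the top level and makes no claims about its fields.

```python
import json
from pathlib import Path


def provenance_path(repo_root, run_id: str) -> Path:
    """Build the documented location of a run's provenance file."""
    return Path(repo_root) / ".night-shift" / "runs" / run_id / "provenance.json"


def load_provenance(repo_root, run_id: str) -> dict:
    """Load provenance.json, assuming (not verified) a top-level JSON object."""
    with provenance_path(repo_root, run_id).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

Listing `sorted(load_provenance(".", "run-123"))` is a cheap way to see which top-level sections a given run actually recorded before relying on any of them.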

## Reset

`reset` is the eject handle when the repo-local control plane has to go:
@@ -141,6 +163,7 @@ Night Shift binds to `127.0.0.1`, prefers port `8787`, and serves:
- run history for the current repository
- run summary metadata
- repo-state summary for review-driven runs, including open PR counts and drift
- confidence posture and provenance path
- task status
- event timeline
- report content