Closed
18 changes: 18 additions & 0 deletions .codex/skills/qa-night-shift/SKILL.md
@@ -168,6 +168,24 @@ In review-driven runs, pay attention to repo-state evidence:
- whether `status` and `report` show payload-repair attempts, successes, and
failures with usable artifact paths

In delivery-focused investigations, also validate reviewer handoff behavior
when the repo config uses `[handoff]`:

- whether the delivered PR body includes or omits the Night Shift-owned
handoff overlay according to `pr_body_mode`
- whether Night Shift preserves manual PR text outside its marked body region
across later updates
- whether configured snippet files are spliced into the PR body or managed
comment in the expected order
- whether unreadable snippet paths degrade to `pr_handoff_warning` evidence
instead of blocking PR delivery
- whether managed comments stay disabled by default and only appear when
`[handoff].managed_comment = true`
- whether the managed comment is updated in place instead of adding new comment
noise on each delivery
- whether handoff provenance labels clearly separate deterministic Night
Shift-owned evidence from provider-authored summary text

Use small tasks that validate the requested behavior instead of inviting large
feature work.

41 changes: 40 additions & 1 deletion docs/configuration.md
@@ -1,6 +1,6 @@
---
title: Configuration
-description: Configure profiles, phase defaults, verification commands, and provider overrides.
+description: Configure profiles, phase defaults, verification commands, handoff behavior, and provider overrides.
permalink: /configuration/
---

@@ -50,6 +50,13 @@ mode = "ask"

[verification]
commands = ["gleam test"]

[handoff]
enabled = true
pr_body_mode = "append"
managed_comment = false
provenance = "structured"
pr_body_prefix_path = ".night-shift/pr-handoff-prefix.md"
```

If `config.toml` is empty, Night Shift still works. The built-in default
@@ -117,6 +124,38 @@ These top-level settings shape how Night Shift delivers completed work:
- `notifiers`: currently `console` and `report_file`
- `[verification].commands`: commands to run locally before PR delivery

## Handoff Settings

`[handoff]` controls the optional reviewer-facing metadata that Night Shift can
overlay onto delivered pull requests.

Supported fields:

- `enabled`: master switch for Night Shift handoff output
- `pr_body_mode`: `off`, `append`, or `prepend`
- `managed_comment`: whether Night Shift owns and updates one incremental PR
comment with "Since Last Review" deltas
- `provenance`: `minimal`, `light`, or `structured`
- `include_files_touched`
- `include_acceptance`
- `include_stack_context`
- `include_verification_summary`
- `pr_body_prefix_path`, `pr_body_suffix_path`
- `comment_prefix_path`, `comment_suffix_path`
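
As a rough illustration of the three `pr_body_mode` values listed above, the sketch below parses the setting and combines manual PR text with a generated overlay. The type `PrBodyMode` and the functions `parse_mode` and `apply_mode` are hypothetical names introduced for this example, and the blank-line separator is an assumption; none of this is taken from Night Shift's source.

```gleam
/// Hypothetical type mirroring the documented `pr_body_mode` values.
pub type PrBodyMode {
  Off
  Append
  Prepend
}

/// Parse the raw config string. Unknown values are rejected rather than
/// silently mapped to a default (an assumption, not documented behavior).
pub fn parse_mode(raw: String) -> Result(PrBodyMode, String) {
  case raw {
    "off" -> Ok(Off)
    "append" -> Ok(Append)
    "prepend" -> Ok(Prepend)
    other -> Error("unsupported pr_body_mode: " <> other)
  }
}

/// Combine manual PR text with the generated overlay.
/// The "\n\n" separator is an illustrative choice.
pub fn apply_mode(mode: PrBodyMode, manual: String, overlay: String) -> String {
  case mode {
    Off -> manual
    Append -> manual <> "\n\n" <> overlay
    Prepend -> overlay <> "\n\n" <> manual
  }
}
```

For example, `apply_mode(Append, "Manual notes", "Handoff")` places the overlay after the manual text, while `Off` returns the manual text unchanged.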

When `[handoff]` is absent, Night Shift uses the conservative default:

- handoff enabled
- PR body overlay appended
- managed comment disabled
- structured provenance
- files touched, stack context, and verification summary included

Snippet paths are repo-relative markdown fragments. Night Shift splices them
around its generated handoff sections; they augment the structured layout and
do not replace it. If a configured snippet path cannot be read, Night Shift
falls back to generated content and records a warning event.
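
The splice-and-fall-back behavior described above might look roughly like the following. This is a minimal sketch under stated assumptions: `SnippetRead`, `splice`, and `resolve` are hypothetical names, the file read is modeled as an already-computed `Result` so the example stays free of I/O, and the warning text format is invented for illustration.

```gleam
import gleam/list
import gleam/string

/// Hypothetical shape: the configured path plus the outcome of reading it.
pub type SnippetRead {
  SnippetRead(path: String, outcome: Result(String, String))
}

/// Splice readable snippets around the generated handoff content.
/// Returns the final text and any warnings to record.
pub fn splice(
  prefix: SnippetRead,
  generated: String,
  suffix: SnippetRead,
) -> #(String, List(String)) {
  let #(prefix_parts, prefix_warnings) = resolve(prefix)
  let #(suffix_parts, suffix_warnings) = resolve(suffix)
  let body =
    list.append(prefix_parts, [generated, ..suffix_parts])
    |> string.join("\n\n")
  #(body, list.append(prefix_warnings, suffix_warnings))
}

/// A readable snippet contributes its text; an unreadable one contributes
/// nothing to the body and one warning (format is an assumption).
fn resolve(snippet: SnippetRead) -> #(List(String), List(String)) {
  case snippet.outcome {
    Ok(text) -> #([text], [])
    Error(reason) -> #([], [
      "pr_handoff_warning: " <> snippet.path <> ": " <> reason,
    ])
  }
}
```

The generated content is always present in the result, so an unreadable snippet only removes that snippet's text and never blocks the rest of the handoff.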

Example configs live in:

- `examples/config-single-profile.toml`
27 changes: 26 additions & 1 deletion docs/providers-and-delivery.md
@@ -64,6 +64,8 @@ Night Shift's current delivery model is:
- each completed task is delivered as a pull request
- dependent tasks may be delivered as stacked pull requests
- verification runs locally before PR creation
- Night Shift can overlay a configurable reviewer handoff block onto the PR
body, with repo-local markdown snippets before or after the generated block
- the local markdown report is updated throughout the run
- `night-shift report` is the live audit view for review-driven runs and can
show current drift against the saved open-PR snapshot
@@ -82,7 +84,30 @@ Night Shift's current delivery model is:
worktree before falling back to manual attention

Delivery behavior is shaped by `base_branch`, `branch_prefix`,
-`pr_title_prefix`, and `[verification].commands` in `config.toml`.
+`pr_title_prefix`, `[verification].commands`, and `[handoff]` in
+`config.toml`.

## Reviewer Handoff

When handoff output is enabled, Night Shift can add a structured PR-body region
covering:

- context for why the PR exists
- scope such as `files_touched`, acceptance cues, and stack/supersession
metadata when configured
- model-authored summary text and known risks
- deterministic evidence such as verification output
- provenance labels that distinguish Night Shift-owned facts from inferred
provider-authored text

Night Shift encloses its PR-body overlay in stable markers and only rewrites
that marked region on later updates, so manual text outside the markers can
survive future delivery passes.
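
One way to picture the marked-region rewrite is the sketch below. The marker strings and the function `upsert_region` are hypothetical; the source does not say what the real markers look like or how unbalanced markers are handled, so both are assumptions here.

```gleam
import gleam/string

// Hypothetical marker strings, chosen only for this example.
const start_marker = "<!-- night-shift:handoff:start -->"

const end_marker = "<!-- night-shift:handoff:end -->"

/// Replace the marked region if present, otherwise append a new one.
/// Text outside the markers is carried over untouched.
pub fn upsert_region(body: String, overlay: String) -> String {
  let region = start_marker <> "\n" <> overlay <> "\n" <> end_marker
  case string.split_once(body, on: start_marker) {
    // No region yet: append one after the existing manual text.
    Error(Nil) ->
      case body {
        "" -> region
        _ -> body <> "\n\n" <> region
      }
    Ok(#(before, rest)) ->
      case string.split_once(rest, on: end_marker) {
        // Balanced markers: swap only the region between them.
        Ok(#(_old_overlay, after)) -> before <> region <> after
        // Start marker without an end marker: leave the body as-is
        // rather than risk deleting manual text (an assumption).
        Error(Nil) -> body
      }
  }
}
```

Calling `upsert_region` twice with different overlays keeps any text a reviewer added before or after the markers and replaces only the older overlay.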

If `[handoff].managed_comment = true`, Night Shift also owns one PR comment for
incremental review deltas such as "Since Last Review", review-driven context,
and replacement-stack status. Repositories with stricter comment etiquette can
leave that disabled and still use the PR-body overlay.
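
The managed-comment rule (disabled unless configured, and updated in place rather than re-posted) can be sketched as a small decision function. `CommentAction` and `plan_comment` are hypothetical names, and representing the existing comment as an optional integer ID is an assumption made for illustration.

```gleam
import gleam/option.{type Option, None, Some}

/// Hypothetical set of outcomes for one delivery pass.
pub type CommentAction {
  SkipComment
  CreateComment(body: String)
  UpdateComment(comment_id: Int, body: String)
}

/// Decide what to do with the managed comment.
/// `managed_comment` stands in for `[handoff].managed_comment`.
pub fn plan_comment(
  managed_comment: Bool,
  existing_comment_id: Option(Int),
  body: String,
) -> CommentAction {
  case managed_comment, existing_comment_id {
    // Disabled: never create or touch a comment.
    False, _ -> SkipComment
    // Enabled and already posted: edit in place to avoid comment noise.
    True, Some(id) -> UpdateComment(id, body)
    // Enabled and not yet posted: create the single managed comment.
    True, None -> CreateComment(body)
  }
}
```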

## Dashboard

14 changes: 13 additions & 1 deletion docs/state-and-artifacts.md
@@ -48,6 +48,9 @@ The run record itself stores:
- planning provenance such as `notes only` or `reviews + notes`
- an open-PR repo-state snapshot for review-driven plans
- mechanically derived supersession lineage on replacement tasks
- persisted PR handoff state per delivered task, including the last delivered
commit SHA, verification digest, files list, and whether Night Shift had
emitted a body overlay or managed comment
- recorded decisions
- `planning_dirty`
- task list and task states
@@ -76,6 +79,8 @@ vanishing into the terminal scrollback.
- worktree retention and pruning notes
- execution recovery warnings when Night Shift accepted a sanitized or
recovered provider payload
- PR handoff warnings such as unreadable snippet paths or managed-comment
update failures
- payload-repair attempt, success, and failure notes when Night Shift retried a
malformed execution result in place
- task summaries
@@ -89,7 +94,8 @@ tree when a stored snapshot exists, so its live output is authoritative for
current drift while `report.md` remains durable and offline-readable.

Task-level provider logs and prompt files live under each run's `logs/`
-directory.
+directory. PR delivery also keeps the rendered pull request body under `logs/`
+so operators can inspect the exact handoff Night Shift attempted to publish.

Task worktrees are intentionally sticky. Night Shift keeps them mounted after
completion so operators can inspect delivery state or resume later without
@@ -115,6 +121,12 @@ under distinct `.payload-repair.*` log and prompt artifacts. If that retry
still fails, manual-attention summaries include both the original malformed
payload path and the repair artifacts.

When `[handoff]` points at snippet files such as `pr_body_prefix_path` or
`comment_suffix_path`, Night Shift reads those repo-relative markdown files at
delivery time. Missing or unreadable snippets do not block PR delivery; Night
Shift records a `pr_handoff_warning` event and falls back to generated handoff
content.

## Active Lock

Night Shift keeps `./.night-shift/active.lock` so only one active run can
194 changes: 132 additions & 62 deletions src/night_shift/codec/journal.gleam
@@ -44,6 +44,10 @@ pub fn encode_run(run: types.RunRecord) -> String {
#("created_at", json.string(run.created_at)),
#("updated_at", json.string(run.updated_at)),
#("tasks", json.array(run.tasks, encode_task)),
#(
"handoff_states",
json.array(run.handoff_states, encode_task_handoff_state),
),
])
|> json.to_string
}
@@ -193,6 +197,20 @@ fn encode_task(task: types.Task) -> json.Json {
])
}

fn encode_task_handoff_state(state: types.TaskHandoffState) -> json.Json {
  json.object([
    #("task_id", json.string(state.task_id)),
    #("delivered_pr_number", json.string(state.delivered_pr_number)),
    #("last_delivered_commit_sha", json.string(state.last_delivered_commit_sha)),
    #("last_handoff_files", json.array(state.last_handoff_files, json.string)),
    #("last_verification_digest", json.string(state.last_verification_digest)),
    #("last_risks", json.array(state.last_risks, json.string)),
    #("last_handoff_updated_at", json.string(state.last_handoff_updated_at)),
    #("body_region_present", json.bool(state.body_region_present)),
    #("managed_comment_present", json.bool(state.managed_comment_present)),
  ])
}

fn encode_decision_request(request: types.DecisionRequest) -> json.Json {
json.object([
#("key", json.string(request.key)),
@@ -262,45 +280,56 @@ fn run_decoder() -> decode.Decoder(types.RunRecord) {
use created_at <- decode.field("created_at", decode.string)
use updated_at <- decode.field("updated_at", decode.string)
use tasks <- decode.field("tasks", decode.list(task_decoder()))
-  decode.success(types.RunRecord(
-    run_id: run_id,
-    repo_root: repo_root,
-    run_path: run_path,
-    brief_path: brief_path,
-    state_path: state_path,
-    events_path: events_path,
-    report_path: report_path,
-    lock_path: lock_path,
-    planning_agent: planning_agent,
-    execution_agent: execution_agent,
-    environment_name: case maybe_environment_name {
-      Some(name) -> name
-      None -> ""
-    },
-    max_workers: max_workers,
-    notes_source: notes_source,
-    planning_provenance: case planning_provenance {
-      Some(provenance) -> Some(provenance)
-      None ->
-        case notes_source {
-          Some(source) -> Some(types.NotesOnly(source))
-          None -> None
-        }
-    },
-    repo_state_snapshot: repo_state_snapshot,
-    decisions: case decisions {
-      Some(entries) -> entries
-      None -> []
-    },
-    planning_dirty: case planning_dirty {
-      Some(value) -> value
-      None -> False
-    },
-    status: status,
-    created_at: created_at,
-    updated_at: updated_at,
-    tasks: tasks,
-  ))
+  use handoff_states <- decode.optional_field(
+    "handoff_states",
+    None,
+    decode.optional(decode.list(task_handoff_state_decoder())),
+  )
+  decode.success(
+    types.RunRecord(
+      run_id: run_id,
+      repo_root: repo_root,
+      run_path: run_path,
+      brief_path: brief_path,
+      state_path: state_path,
+      events_path: events_path,
+      report_path: report_path,
+      lock_path: lock_path,
+      planning_agent: planning_agent,
+      execution_agent: execution_agent,
+      environment_name: case maybe_environment_name {
+        Some(name) -> name
+        None -> ""
+      },
+      max_workers: max_workers,
+      notes_source: notes_source,
+      planning_provenance: case planning_provenance {
+        Some(provenance) -> Some(provenance)
+        None ->
+          case notes_source {
+            Some(source) -> Some(types.NotesOnly(source))
+            None -> None
+          }
+      },
+      repo_state_snapshot: repo_state_snapshot,
+      decisions: case decisions {
+        Some(entries) -> entries
+        None -> []
+      },
+      planning_dirty: case planning_dirty {
+        Some(value) -> value
+        None -> False
+      },
+      status: status,
+      created_at: created_at,
+      updated_at: updated_at,
+      tasks: tasks,
+      handoff_states: case handoff_states {
+        Some(entries) -> entries
+        None -> []
+      },
+    ),
+  )
}
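
The decoder above tolerates run records written before `handoff_states` existed by treating the field as optional and falling back to an empty list. A self-contained sketch of the same backward-compatibility idea, reduced to two fields, is shown below. `MiniRun`, `mini_run_decoder`, and `parse_mini_run` are hypothetical names for illustration, and the handoff entries are simplified to strings. Unlike the diff, this version passes `[]` directly as the default and therefore does not accept an explicit JSON `null` for the field.

```gleam
import gleam/dynamic/decode
import gleam/json

/// Hypothetical, heavily reduced stand-in for the run record.
pub type MiniRun {
  MiniRun(run_id: String, handoff_states: List(String))
}

fn mini_run_decoder() -> decode.Decoder(MiniRun) {
  use run_id <- decode.field("run_id", decode.string)
  // A missing key yields the default, so older journals still decode.
  use handoff_states <- decode.optional_field(
    "handoff_states",
    [],
    decode.list(decode.string),
  )
  decode.success(MiniRun(run_id: run_id, handoff_states: handoff_states))
}

/// Parse a JSON string into the reduced record.
pub fn parse_mini_run(raw: String) -> Result(MiniRun, json.DecodeError) {
  json.parse(from: raw, using: mini_run_decoder())
}
```

Wrapping the inner decoder in `decode.optional` with a `None` default, as the diff does, additionally accepts `"handoff_states": null` at the cost of one extra `case` when building the record.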

fn legacy_run_decoder() -> decode.Decoder(types.RunRecord) {
@@ -319,29 +348,32 @@ fn legacy_run_decoder() -> decode.Decoder(types.RunRecord) {
use updated_at <- decode.field("updated_at", decode.string)
use tasks <- decode.field("tasks", decode.list(task_decoder()))
let resolved_agent = types.resolved_agent_from_provider(provider)
-  decode.success(types.RunRecord(
-    run_id: run_id,
-    repo_root: repo_root,
-    run_path: run_path,
-    brief_path: brief_path,
-    state_path: state_path,
-    events_path: events_path,
-    report_path: report_path,
-    lock_path: lock_path,
-    planning_agent: resolved_agent,
-    execution_agent: resolved_agent,
-    environment_name: "",
-    max_workers: max_workers,
-    notes_source: None,
-    planning_provenance: None,
-    repo_state_snapshot: None,
-    decisions: [],
-    planning_dirty: False,
-    status: status,
-    created_at: created_at,
-    updated_at: updated_at,
-    tasks: tasks,
-  ))
+  decode.success(
+    types.RunRecord(
+      run_id: run_id,
+      repo_root: repo_root,
+      run_path: run_path,
+      brief_path: brief_path,
+      state_path: state_path,
+      events_path: events_path,
+      report_path: report_path,
+      lock_path: lock_path,
+      planning_agent: resolved_agent,
+      execution_agent: resolved_agent,
+      environment_name: "",
+      max_workers: max_workers,
+      notes_source: None,
+      planning_provenance: None,
+      repo_state_snapshot: None,
+      decisions: [],
+      planning_dirty: False,
+      status: status,
+      created_at: created_at,
+      updated_at: updated_at,
+      tasks: tasks,
+      handoff_states: [],
+    ),
+  )
}

fn resolved_agent_decoder() -> decode.Decoder(types.ResolvedAgentConfig) {
@@ -432,6 +464,44 @@ fn task_decoder() -> decode.Decoder(types.Task) {
))
}

fn task_handoff_state_decoder() -> decode.Decoder(types.TaskHandoffState) {
  use task_id <- decode.field("task_id", decode.string)
  use delivered_pr_number <- decode.field("delivered_pr_number", decode.string)
  use last_delivered_commit_sha <- decode.field(
    "last_delivered_commit_sha",
    decode.string,
  )
  use last_handoff_files <- decode.field(
    "last_handoff_files",
    decode.list(decode.string),
  )
  use last_verification_digest <- decode.field(
    "last_verification_digest",
    decode.string,
  )
  use last_risks <- decode.field("last_risks", decode.list(decode.string))
  use last_handoff_updated_at <- decode.field(
    "last_handoff_updated_at",
    decode.string,
  )
  use body_region_present <- decode.field("body_region_present", decode.bool)
  use managed_comment_present <- decode.field(
    "managed_comment_present",
    decode.bool,
  )
  decode.success(types.TaskHandoffState(
    task_id: task_id,
    delivered_pr_number: delivered_pr_number,
    last_delivered_commit_sha: last_delivered_commit_sha,
    last_handoff_files: last_handoff_files,
    last_verification_digest: last_verification_digest,
    last_risks: last_risks,
    last_handoff_updated_at: last_handoff_updated_at,
    body_region_present: body_region_present,
    managed_comment_present: managed_comment_present,
  ))
}

fn decision_request_decoder() -> decode.Decoder(types.DecisionRequest) {
use key <- decode.field("key", decode.string)
use question <- decode.field("question", decode.string)