fix(pepper): mark PRs as pepper-cooking during review [DEV-237]#39
Conversation
Pepper was adding the new outcome label on a re-review but leaving the prior one in place, so PRs ended up with both pepper-approved and pepper-changes-requested simultaneously and the label-based filters lost meaning. Move label lifecycle ownership from the LLM into the workflow: - A new pre-review step (review mode only) snapshots current PR labels, clears any prior outcome labels actually present, and applies pepper-cooking as a "currently being reviewed" marker. Removes are gated on presence so a fresh repo's first review doesn't error on gh pr edit --remove-label for a label not in the repo yet. - Pepper's final action swaps pepper-cooking for the chosen outcome in a single gh pr edit call (--add-label "<outcome>" --remove-label "pepper-cooking"). The four label-application points in the prompt (auto-fail, both intent-verification halts, and the central <labels> section) all use the swap form. - A new always-on cleanup step removes pepper-cooking regardless of how the job ended (Pepper success, Pepper failure, prompt-build error, AWS-creds error, runner timeout, manual cancellation), so the marker only ever appears on a PR that is actively being reviewed. Failed runs are surfaced via the workflow's run status, not via a stale label. Net behavior: re-reviews end with exactly one outcome label on the PR; in-flight reviews carry pepper-cooking; nothing else combines. Refs: DEV-237
There was a problem hiding this comment.
Verified against DEV-237 (Linear) — aligned with the stated problem of stale outcome labels.
The prompt changes are solid — every outcome path now swaps pepper-cooking for the final label in a single atomic gh pr edit call. The workflow's pre-review step correctly snapshots and clears prior labels, and the cleanup step removes the cooking marker on any exit path.
Blocker: Line 156 in .github/workflows/pepper-pr-review.yml has an unquoted array expansion "${REMOVES[@]}" under set -euo pipefail. When a PR has no prior outcome labels (first review, or labels were manually cleared), REMOVES is empty and bash's -u flag treats the expansion as an unbound variable — the step will fail before Pepper ever runs. Use the safe form "${REMOVES[@]+"${REMOVES[@]}"}" or conditionally expand only when the array is non-empty.
Second take after you patch that, henny — this fix is too close to landing to trip on an empty-array footgun.
— Miss Pepper
When you're ready for another look, comment @pepper review.
Pepper auto-review flagged an empty-array footgun: under `set -euo pipefail`
on bash 3.2 (macOS default) or bash 4.3 and earlier, `"${REMOVES[@]}"`
on an empty array errors as "unbound variable" before gh ever runs. GitHub
runners on bash 5.x are fine, but the form is brittle. Build a single
ARGS array that always carries the --add-label pair so the expansion is
never empty.
Refs: DEV-237
|
@pepper review Addressed the empty-array footgun on the cooking step. Built a single |
There was a problem hiding this comment.
Verified against DEV-237 (Linear) — aligned. The cooking-marker lifecycle directly addresses the stale-label accumulation described in the issue. The workflow change is defensive (checks label existence before removal to handle cold-start case) and the cleanup step ensures the marker never persists past job completion. The prompt updates document the swap pattern at all four label-application points, and the instructions match the workflow's actual behavior (I checked that pepper-cooking is applied in review mode and cleaned up in the always-on step). No automated test for the label lifecycle, but the PR body explicitly documents the smoke-test plan post-merge — workflow infra of this shape typically validates in prod. That set -uo pipefail without -e in the cleanup step is right: best-effort removal shouldn't fail the job if gh pr edit errors.
— Pepper (already excited to see this in the wild after the v1 tag moves)
When you're ready for another look, just comment @pepper review.
Summary
pepper-cookingas a "currently being reviewed" marker.gh pr edit --add-label "<outcome>" --remove-label "pepper-cooking"swap. All four label-application points in the prompt (auto-fail, both intent-verification halts, and the central<labels>section) use the swap form.pepper-cookingregardless of how the job ended (Pepper success/failure, prompt-build error, AWS-creds error, runner timeout, manual cancellation), so the marker only ever appears on a PR that is actively being reviewed. Failed runs are surfaced via the workflow's run status, not via a stale label.Net behavior: re-reviews end with exactly one outcome label on the PR; in-flight reviews carry only
pepper-cooking; nothing else combines.Fixes DEV-237.
Self-review caveat
This PR's own Pepper review will run with the OLD workflow + prompt — both are fetched from
refs/tags/v1, not from the PR head. The new cooking lifecycle is not exercised here. After merge, the v1 tag will be moved to the new commit and a follow-up smoke-test PR will validate the live behavior.Test plan