Skip to content

docs: add CI triage guide (issue #219)#279

Open
mvillmow wants to merge 4 commits into
mainfrom
219-auto-impl
Open

docs: add CI triage guide (issue #219)#279
mvillmow wants to merge 4 commits into
mainfrom
219-auto-impl

Conversation

@mvillmow

@mvillmow mvillmow commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Summary

Investigation of issue #219 reveals the three reported 'Set up job' failures (runs 24940322404, 24940042318, 24935454744) were caused by an invalid pixi action SHA that was corrected ~1 month ago in commit 87dc30e. Recent CI runs (last 20) show the pattern is now resolved: 18 passing, 2 recent failures on a different cause (Install pixi step, unrelated to "Set up job").

This is Route A: No repo-side defect found, failures transient.

Investigation Evidence

Pattern Analysis

  • Last 20 ci.yml runs: 2 failures, 18 successes
  • Threshold: ≤3 failures = isolated pattern ✓
  • Failure conclusion: Isolated, not persistent

Workflow Audit (Three Checks)

  1. Permissions block: Not present in .github/workflows/ci.yml — standard GitHub-hosted runners don't require one ✓
  2. Runner labels: Both jobs use ubuntu-latest (standard label) ✓
  3. Secrets references: No ${{ secrets.* }} references in workflow ✓

Root Cause of Original 3 Runs

  • Run 24940322404 (2026-04-25): Lint & Test failed with "Unable to resolve action prefix-dev/setup-pixi@b4fa1ac3900b5ac04b5b07ec9a7a7fe9d90c5258"
  • Run 24940042318 (2026-04-25): Same pixi action resolution error
  • Run 24935454744 (2026-04-21): Same pixi action resolution error

Fix applied: Commit 87dc30e ("fix(ci): correct setup-pixi SHA pin to valid v0.8.1 commit (#221)") corrected the invalid SHA. Subsequent Dependabot PRs (#244, #258) bumped to v0.9.6 with valid SHAs.

Current main branch: prefix-dev/setup-pixi@5185adfbffb4bd703da3010310260805d89ebb11 (v0.9.6) — valid and passing ✓

Deliverable

This PR adds docs/ci-triage.md — a triage runbook for future "Set up job" and pre-checkout failures. The guide covers:

  • Three-check audit (permissions, runner labels, secrets)
  • Re-run / workflow_dispatch fallback strategy
  • Escalation path (GitHub Support + needs-github-support label)
  • Common root causes with examples

Added link from CLAUDE.md to the triage guide.

Closes #219

@mvillmow mvillmow left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc-only Route A is correct (ci.yml has no permissions/secrets refs), but triage doc cites a fabricated PR #244 / v0.9.6 setup-pixi bump; actual is #235 / v0.9.5.

Comment thread docs/ci-triage.md
Comment thread docs/ci-triage.md

@mvillmow mvillmow left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc-only deliverable correct; both prior review threads resolved; src/tests changes are formatter-forced churn. One minor: unverified commit SHA in runbook footnote.

Comment thread docs/ci-triage.md
mvillmow and others added 2 commits June 28, 2026 10:12
Investigation of issue #219 found three 'Set up job' failures (runs 24940322404,
24940042318, 24935454744) were due to an invalid pixi action SHA that was
corrected ~1 month ago in commit 87dc30e. Recent runs show all passing.

This is Route A (no repo-side defect found, failures transient). The triage
guide documents the three-check audit (permissions, runner labels, secrets)
for future 'Set up job' infrastructure failures and the escalation path for
GitHub Support.

Investigation summary:
- Pattern analysis: 2 failures in last 20 runs (≤3 = isolated) ✓
- Permissions audit: no permissions block (OK for standard GitHub runners) ✓
- Runner labels: both ubuntu-latest (standard) ✓
- Secrets references: none in workflow ✓
- Root cause of original 3 failures: invalid prefix-dev/setup-pixi SHA (fixed)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
- Replace `\s*` with `[[:space:]]*` in the secrets grep/sed commands
  so the runbook works on macOS/BSD grep, not just GNU grep
- Fix fabricated citation: PR #244 / v0.9.6 → PR #235 / v0.9.5
  (the actual setup-pixi bump per git history)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
Remove temporary investigation file for CI 'Set up job' failure.

The investigation concluded this is a runner infrastructure issue
(not a code defect), documented in docs/ci-triage.md.

Closes #219

Implemented-By: claude-sonnet-4-6
Co-Authored-By: Claude Code <noreply@anthropic.com>
Signed-off-by: mvillmow <4211002+mvillmow@users.noreply.github.com>
@mvillmow mvillmow enabled auto-merge (squash) June 28, 2026 18:45
GHSA-4xgf-cpjx-pc3j)

Signed-off-by: Micah Villmow <4211002+mvillmow@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

investigate: CI workflow 'Set up job' failure - runner infrastructure issue

1 participant