fix(ci): repair linked-client-pr gate — status-cap spam + dead PR-event trigger #163

Merged
WiktorStarczewski merged 2 commits into main from fix/linked-client-pr-status-cap
Jun 1, 2026

Conversation

WiktorStarczewski (Collaborator) commented Jun 1, 2026

Fixes two bugs in the Linked client PR ready gate: (1) it spammed the PR author with a failure email every 15 minutes once a PR's head SHA hit GitHub's per-(SHA, context) status cap, and (2) its PR-triggered path never actually ran.

Problem 1: status-cap spam

The Linked client PR ready gate (.github/workflows/check-linked-client-pr.yml) re-runs every 15 minutes on a schedule cron and fans out across every open PR carrying a Client PR: marker. On each run it POSTs a fresh linked-client-pr-ready commit status on the PR's head SHA.

GitHub caps each (commit SHA, context) pair at 1000 statuses. Any PR whose head SHA sits unpushed long enough accumulates 1000 statuses; the POST then returns:

422 Validation Failed: "This SHA and context has reached the maximum number of statuses."

The workflow deliberately doesn't swallow API errors (to surface a 403 from a missing statuses: write scope), so the 422 propagates to exit 1 → job failure → an email to the PR author every 15 minutes. This is currently firing on the long-lived PRs (#22, #23, #25, #27); newer PRs (#30, #31) pass only because they haven't hit the cap yet.
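For a rough sense of how long a head SHA can sit unpushed before the cap bites, here is a back-of-the-envelope sketch. It assumes exactly one status is posted per scheduled run and that no run is skipped or delayed; real scheduled runs can be delayed, so the actual time is likely somewhat longer. The variable names are illustrative only.

```shell
# Back-of-the-envelope estimate; assumes one status per scheduled run.
cap=1000          # statuses allowed per (commit SHA, context) pair
interval_min=15   # cron interval in minutes

runs_per_day=$(( 24 * 60 / interval_min ))     # 96
hours_to_cap=$(( cap * interval_min / 60 ))    # 250
days=$(( hours_to_cap / 24 ))                  # 10
rem_hours=$(( hours_to_cap % 24 ))             # 10

echo "runs per day: ${runs_per_day}"
echo "cap reached after ~${days}d ${rem_hours}h without a push"
```

Under those assumptions any PR left untouched for roughly ten and a half days starts failing, which is consistent with only the long-lived PRs being affected.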

Fix

Two changes to set_status:

  1. Idempotency guard — read the latest existing linked-client-pr-ready status for the head SHA and skip the POST when state + description are unchanged. This both stops the email spam and stops accumulating statuses toward the cap in the first place.
  2. Tolerate the 422 — when a SHA has already exhausted the 1000-status cap (e.g. from before this guard existed), treat the "maximum number of statuses" response as non-fatal: the verdict is unchanged and re-POSTing is impossible anyway. Any other failure (notably a 403 from a missing statuses: write scope) still surfaces as a job failure, preserving the original safety intent.

Also routes the no-marker success path through set_status so it gets the same guards; target_url is now attached only when a linked PR is known.
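The two guards can be sketched roughly as follows. This is not the workflow's actual code: the function body, the `ss_*` variable names, the jq filter, and the log messages are assumptions for illustration. It relies on `gh api` with its built-in `--jq` flag, on the `GITHUB_REPOSITORY` variable that Actions sets, and on the statuses list being returned newest first.

```shell
# Sketch only; not the real set_status from check-linked-client-pr.yml.
CONTEXT="linked-client-pr-ready"

# set_status <sha> <state> <description> [target_url]
set_status() {
  ss_sha="$1"; ss_state="$2"; ss_desc="$3"; ss_url="${4:-}"

  # Idempotency guard: the first entry with our context is the latest
  # status. A failed read is ignored here; the POST below still surfaces
  # real permission problems.
  ss_latest="$(gh api "repos/${GITHUB_REPOSITORY}/commits/${ss_sha}/statuses" \
    --jq '[.[] | select(.context == "linked-client-pr-ready")][0] | "\(.state)|\(.description)"' \
    2>/dev/null || true)"
  if [ "$ss_latest" = "${ss_state}|${ss_desc}" ]; then
    echo "unchanged: skipping POST for ${ss_sha}"
    return 0
  fi

  # target_url is attached only when the caller knows the linked PR.
  set -- -f "state=${ss_state}" -f "context=${CONTEXT}" -f "description=${ss_desc}"
  if [ -n "$ss_url" ]; then
    set -- "$@" -f "target_url=${ss_url}"
  fi

  if ss_out="$(gh api -X POST "repos/${GITHUB_REPOSITORY}/statuses/${ss_sha}" "$@" 2>&1)"; then
    echo "posted ${ss_state} for ${ss_sha}"
    return 0
  fi

  # Tolerate only the status-cap 422; anything else (e.g. a 403) fails the job.
  case "$ss_out" in
    *"maximum number of statuses"*)
      echo "::warning::status cap reached for ${ss_sha}; verdict unchanged"
      return 0 ;;
  esac
  echo "$ss_out" >&2
  return 1
}
```

A call such as `set_status "$HEAD_SHA" success "client PR merged" "$CLIENT_PR_URL"` (both variables hypothetical) would then post at most once per verdict change instead of once per cron tick.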

Why target main

Scheduled (schedule:) workflow runs always execute the default-branch copy of the workflow file. The failing runs are all event: schedule on main, so the fix only takes effect once it's on main — hence this PR targets main directly rather than next. The file is currently identical on both branches.

Verification

  • python3 yaml.safe_load — workflow parses.
  • bash -n on the extracted run: script — no syntax errors.
  • Unit-tested the new set_status against a mocked gh for all five paths: unchanged→skip, changed→post, no-prior→post, 422→warn-and-continue, 403→fail. All pass.
  • Validated the idempotency jq against the real GET /commits/{sha}/statuses shape on the head SHA of PR #22 (fix(web-client): align NoteType enum discriminants with miden_client::note::NoteType).

Second fix: gate never ran on PR events

While verifying, I found the gate job skipped on PR events entirely: the workflow triggers on pull_request_target but the job's if: guarded on github.event_name == 'pull_request', which never matches under that trigger. So the PR-triggered path — the whole reason pull_request_target was chosen, per the safety comment — was dead, and the gate only ever ran via the cron.

Fixed here by guarding on pull_request_target instead, so the gate now evaluates and posts linked-client-pr-ready on PR open/edit/synchronize as intended. The matrix's non-schedule branch already reads github.event.pull_request.number (populated on pull_request_target), so it needed no change.
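To make the mismatch concrete, the job-level condition can be mirrored as a shell predicate over the event name. Both helper names are hypothetical, and the `schedule` branch is an assumption about the rest of the `if:` expression; only the `pull_request` versus `pull_request_target` difference comes from this PR.

```shell
# Hypothetical mirror of the job-level `if:`; not code from the workflow.

# Old guard: matches an event name this workflow never receives.
old_gate_should_run() {
  case "$1" in
    schedule|pull_request) return 0 ;;
    *) return 1 ;;
  esac
}

# Fixed guard: matches the event name the trigger actually produces.
gate_should_run() {
  case "$1" in
    schedule|pull_request_target) return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside a step, `gate_should_run "$GITHUB_EVENT_NAME"` would succeed on both cron and PR events, whereas the old predicate succeeds only on cron runs.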

Note on verifying both fixes: pull_request_target (like schedule) runs the workflow file from the base branch (main), not the PR's copy. So neither fix takes effect on this PR's own runs — the gate job here still skips using main's current definition. Both fixes go live only once merged to main; that's also why this PR targets main directly.

… cap

The "Linked client PR ready" gate re-runs every 15 min on a cron and
POSTed a fresh `linked-client-pr-ready` commit status on every run, even
when the verdict was unchanged. GitHub caps each (commit SHA, context)
pair at 1000 statuses, so any PR whose head SHA sat unpushed eventually
hit the cap; the POST then returned 422 and the job failed, emailing the
PR author every 15 minutes.

Two changes to set_status:

1. Idempotency guard: read the latest existing `linked-client-pr-ready`
   status for the head SHA and skip the POST when state + description are
   unchanged. This stops the email spam and stops accumulating statuses.

2. Tolerate the 422 "maximum number of statuses" response as non-fatal
   (the verdict is unchanged and re-POSTing is impossible anyway) for
   SHAs that already exhausted the cap, while still surfacing any other
   failure such as a 403 from a missing `statuses: write` scope.

Also routes the no-marker success path through set_status so it benefits
from the same guards; target_url is now attached only when a linked PR
is known.
WiktorStarczewski added the "no changelog" label (PR doesn't need a CHANGELOG entry: trivial / non-user-visible) on Jun 1, 2026
The workflow triggers on `pull_request_target`, but the `gate` job's
`if:` guarded on `github.event_name == 'pull_request'` — a value that
never occurs under this trigger. The PR-triggered path was therefore
dead: the gate only ever ran via the schedule cron, never when a PR was
opened, edited, or synchronized. That defeats the reason the workflow
uses `pull_request_target` (so it can post a status on fork PRs).

Guard on `pull_request_target` instead so the gate evaluates and posts
`linked-client-pr-ready` on PR events as intended. The matrix's
non-schedule branch already reads `github.event.pull_request.number`,
which is populated on `pull_request_target`, so it needs no change.
WiktorStarczewski changed the title from "fix(ci): stop linked-client-pr gate from spamming failures via status cap" to "fix(ci): repair linked-client-pr gate — status-cap spam + dead PR-event trigger" on Jun 1, 2026
WiktorStarczewski merged commit 2f7e9ad into main on Jun 1, 2026
38 checks passed
WiktorStarczewski deleted the fix/linked-client-pr-status-cap branch on June 1, 2026 10:42
VAIBHAVJINDAL3012 pushed a commit to inicio-labs/web-sdk that referenced this pull request Jun 16, 2026
Brings the 43 main-only commits (0.14.6–0.14.11) onto next, keeping
next's 0.15 API surface. Key features restored on next:

- useWorker opt-out (ClientOptions + MidenConfig) [0xMiden#149]
- TransactionProver.newCallbackProver + JsCallbackTransactionProver [0xMiden#149]
- crates/mobile-prover (Cargo deps still 0.14 — ported to the 0.15
  workspace deps in a follow-up commit on this branch) [0xMiden#149]
- _withInnerWebClient + depth-tracked re-entrancy fix [0xMiden#152]
- Multi-threaded WASM proving: ST/MT dual build, dist/st + dist/mt
  layout, /mt + /mt/lazy subpaths, nightly toolchain pin [0xMiden#134]
- inspectable dropped from JsAccountUpdate/JsStorageMapEntry/
  JsStorageSlot/JsVaultAsset [0xMiden#150]
- react-sdk fixes: useConsume ownership [0xMiden#138], TransactionId.toHex [0xMiden#140]
- linked-client-pr gate status-cap + dead-trigger fixes [0xMiden#163]
- docs-skip 'changes' job gating in test.yml

Conflict-resolution policy: next's side for everything 0.15-shaped
(version 0.15.0-alpha.4, git-pinned miden-client, napi/node build,
js_export structure, MIDEN_CLIENT_REF pin); main's side for the MT
build system and CI hardening; unions for CHANGELOG/CLAUDE/knip and
the exports maps (node condition + dist/st layout + /mt subpaths).
auto-release-main.yml deleted (main-branch-only workflow).
Locks regenerated, not hand-merged.