[NO-JIRA] Surface Percy silent failures as CI errors #4541

Open
Richard-Shen (RichardSyq) wants to merge 5 commits into main from RichardSyq/percy-silent-failure-analysis

@RichardSyq
Contributor

Summary

  • Percy CLI exits 0 even when it fails internally (missing token, API errors, build not created), causing the PercyTests job to show success while actually failing silently
  • Added pre-validation of PERCY_TOKEN and post-run log inspection to exit 1 on known failure patterns
  • This ensures CI correctly shows a red cross when Percy cannot complete its work

What changed

The Percy Test step now:

  1. Checks PERCY_TOKEN is set before running — fails immediately if empty
  2. Captures Percy output and scans for known failure messages (Build not created, Failed to create build, Error:)
  3. Exits with code 1 if any failure is detected, surfacing it as a proper CI failure
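
As an illustration of step 1, a minimal sketch of the token pre-check could look like the following. The function name `require_percy_token` is hypothetical and not taken from the PR; only the `::error::` message text appears in the diff context quoted in the review.

```shell
# Hypothetical helper (name is an assumption, not from the PR).
# Returns 0 when PERCY_TOKEN is non-empty, 1 otherwise.
require_percy_token() {
  if [ -z "${PERCY_TOKEN:-}" ]; then
    # "::error::" is the GitHub Actions workflow command that creates
    # an error annotation on the job.
    echo "::error::PERCY_TOKEN is not set or empty"
    return 1
  fi
  return 0
}

# In a workflow step this would typically be used as:
#   require_percy_token || exit 1
```

Keeping the check in a function that returns a status, rather than calling `exit` directly, makes it easy to try out locally without killing the shell.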

Test plan

  • Verify normal Percy runs (with valid token) still pass
  • Verify that an empty/invalid PERCY_TOKEN causes the job to fail with a clear error message

🤖 Generated with Claude Code

Percy CLI exits 0 even when it fails internally (missing token, API
errors, build not created). This causes the PercyTests job to show
green while actually failing. Add pre-validation of PERCY_TOKEN and
post-run log inspection to exit 1 on known failure patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 22, 2026 08:40
@skyscanner-backpack-bot

Visit https://backpack.github.io/storybook-prs/4541 to see this build running in a browser.

@RichardSyq Richard-Shen (RichardSyq) added the "patch" (Patch production bug) label on May 22, 2026
@RichardSyq Richard-Shen (RichardSyq) changed the title from "[NO-JIRA][CI] Surface Percy silent failures as CI errors" to "[NO-JIRA] Surface Percy silent failures as CI errors" on May 22, 2026

Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


Replace shell-based log parsing with Percy's built-in PERCY_RAISE_ERROR
env var, which causes Percy CLI to exit non-zero on internal errors
rather than silently succeeding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

PERCY_RAISE_ERROR does not work for token/build-creation failures in
Percy CLI 1.31.x — the process still exits 0 even when it prints
"Build not created". Instead:

1. Pre-check that PERCY_TOKEN is non-empty
2. Use set -o pipefail to catch npm script failures
3. Grep for the definitive "[percy] Build not created" message

This is narrowly targeted at the exact failure pattern observed in CI
(Percy starts, processes snapshots, but cannot create the build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
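
The capture-and-grep approach described in this commit can be sketched as below. This is an illustration under stated assumptions, not the PR's actual script: `fake_percy_run` is a hypothetical stand-in for `npm run percy-test` that prints the failure marker and still exits 0, and `run_and_check` is a hypothetical wrapper name.

```shell
# Hypothetical stand-in for `npm run percy-test`: prints the failure
# marker quoted in the PR but exits 0, mimicking the silent failure.
fake_percy_run() {
  echo "[percy] Build not created"
  return 0
}

# Hypothetical wrapper: runs a command, mirrors its combined output into
# a log file, then fails if the marker line appears in that log.
run_and_check() {
  log_file=$1
  shift
  "$@" 2>&1 | tee "$log_file"
  if grep -q '\[percy\] Build not created' "$log_file"; then
    echo "::error::Percy build was not created"
    return 1
  fi
  return 0
}

# Example: run_and_check /tmp/percy.log fake_percy_run returns 1 even
# though the wrapped command itself exited 0.
```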

Contributor

@xiaogliu Vincent Liu (xiaogliu) left a comment

Thanks for tackling this — turning the silent pass into a hard fail is the right call. A few things to consider before merging:

  1. Description vs. implementation mismatch. The PR description says the script scans for three patterns (Build not created, Failed to create build, Error:), but the grep only matches Build not created. Either update the description or extend the grep — see inline comment.
  2. The fix is reactive, not root-cause. The original Missing Percy token was almost certainly caused by a fork PR (secrets.PERCY_TOKEN is always empty for pull_request events from forks). This PR converts that silent pass into a red cross, which is good, but every fork contributor's PR will now go red on the Percy step. Worth a follow-up to tighten the if: condition so the step is skipped (not failed) on forks.
  3. A couple of inline nits below.

env:
  PERCY_TOKEN: ${{ secrets.PERCY_TOKEN }}
run: |
  set -o pipefail

set -o pipefail doesn't actually buy us anything here. The whole premise of this PR is that npm run percy-test exits 0 even on failure, so the left side of the pipe never returns non-zero from Percy. pipefail only catches tee itself failing, which won't happen on a hosted runner. Not harmful, but it can mislead future readers into thinking it provides real protection — consider removing it or adding a brief comment explaining what it's actually for.
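
To make the behaviour under discussion concrete, here is a small sketch comparing pipeline exit statuses with and without `pipefail`. It uses `true` and `false` as stand-ins for a command that exits 0 or non-zero, and assumes `bash` is available; the helper name `pipeline_status` is hypothetical.

```shell
# Hypothetical helper: runs a snippet in a fresh bash (so shell options
# do not leak) and prints the pipeline's exit status.
pipeline_status() {
  status=0
  bash -c "$1" >/dev/null 2>&1 || status=$?
  echo "$status"
}

# Left side fails, no pipefail: the status is tee's, so the failure is masked.
without_pipefail=$(pipeline_status 'false | tee /dev/null')
# Left side fails, pipefail on: the failure is propagated.
with_pipefail=$(pipeline_status 'set -o pipefail; false | tee /dev/null')
# Left side exits 0 (the silent-failure case): pipefail changes nothing.
exit_zero_left=$(pipeline_status 'set -o pipefail; true | tee /dev/null')

echo "failing left side, no pipefail: $without_pipefail"   # prints 0
echo "failing left side, pipefail:    $with_pipefail"      # prints 1
echo "exit-0 left side, pipefail:     $exit_zero_left"     # prints 0
```

So `pipefail` does nothing for the case where the left side exits 0 despite an internal error, but once the output is piped through `tee` it does keep any genuine non-zero exit from the left side from being masked.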

  echo "::error::PERCY_TOKEN is not set or empty"
  exit 1
fi
npm run percy-test 2>&1 | tee /tmp/percy.log

Minor: ${{ runner.temp }}/percy.log is more portable if this workflow ever runs on Windows or self-hosted runners. Not a blocker on the current Linux setup.
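
Inside a `run:` script the same location is also exposed as the `RUNNER_TEMP` environment variable. A minimal sketch of a log path that prefers it and falls back elsewhere (the function name `percy_log_path` is hypothetical):

```shell
# Hypothetical helper: prefers RUNNER_TEMP (set by GitHub Actions on the
# runner), then TMPDIR, then /tmp, so the script also works locally.
percy_log_path() {
  echo "${RUNNER_TEMP:-${TMPDIR:-/tmp}}/percy.log"
}

# Example use in the step:
#   npm run percy-test 2>&1 | tee "$(percy_log_path)"
```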

  exit 1
fi
npm run percy-test 2>&1 | tee /tmp/percy.log
if grep -q "\[percy\] Build not created" /tmp/percy.log; then

Two things here:

  1. The PR description promises Failed to create build and Error: are also caught, but only Build not created is checked. Looking at the failing run, Failed to create build actually appears before Build not created, so it's the earlier and more useful signal. Suggest:
if grep -qE '\[percy\] (Build not created|Failed to create build|Error:)' /tmp/percy.log; then
  2. This string-matching approach is inherently fragile: any wording change in a future Percy CLI release will silently break the check, putting us right back in the silent-failure hole this PR is meant to fix. A short comment in the workflow noting this brittleness (and pointing to where the strings come from) would help the next maintainer. Even better long-term would be opening an upstream issue on @percy/cli asking for a non-zero exit on build creation failure.
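
One way to combine both suggestions is to keep the patterns in a single commented variable, so the brittleness is documented next to the strings it depends on. This is a sketch only: the variable and function names are hypothetical, and whether every failure line carries the `[percy] ` prefix is an assumption carried over from the suggested regex.

```shell
# Failure markers this check depends on. The wording comes from Percy CLI
# output as quoted in this review; it may change in a future release, in
# which case this check would stop matching without any warning.
PERCY_FAILURE_PATTERN='\[percy\] (Build not created|Failed to create build|Error:)'

# Hypothetical helper: succeeds (returns 0) when the log contains a marker.
log_has_percy_failure() {
  grep -qE "$PERCY_FAILURE_PATTERN" "$1"
}

# Example use in the step:
#   if log_has_percy_failure /tmp/percy.log; then exit 1; fi
```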


- name: Percy Test
  run: npm run percy-test
  if: ( github.ref == 'refs/heads/main' || github.repository == github.event.pull_request.head.repo.full_name) && github.actor != 'dependabot[bot]'

Out of scope for this PR, but worth flagging: this condition is the real reason we hit Missing Percy token in the first place. For pull_request events from a fork, secrets.PERCY_TOKEN is always empty regardless of repo settings, and the github.repository == github.event.pull_request.head.repo.full_name check is meant to skip those — but it's brittle (for push events github.event.pull_request is null, and the comparison only works by coincidence). After this PR merges, fork PRs will go from silent green to hard red on every push. Probably worth a follow-up to make the skip explicit, e.g.:

if: >
  github.actor != 'dependabot[bot]' &&
  ((github.event_name == 'push' && github.ref == 'refs/heads/main') ||
   (github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository))
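
To sanity-check the intent of the proposed condition, its logic can be mirrored as a small shell truth function. This is only a model of the expression above, not something GitHub Actions evaluates; the function name and the repository names in the examples are hypothetical.

```shell
# Hypothetical model of the proposed `if:` expression.
# Args: event_name ref head_repo_full_name repository actor
should_run_percy() {
  event_name=$1
  ref=$2
  head_repo=$3
  repo=$4
  actor=$5
  # Never run for Dependabot.
  [ "$actor" != 'dependabot[bot]' ] || return 1
  # Pushes to main always run.
  if [ "$event_name" = push ] && [ "$ref" = refs/heads/main ]; then
    return 0
  fi
  # Pull requests run only when the head branch lives in the same repository.
  if [ "$event_name" = pull_request ] && [ "$head_repo" = "$repo" ]; then
    return 0
  fi
  return 1
}

# A fork PR is skipped rather than failed:
#   should_run_percy pull_request refs/pull/1/merge fork-user/repo example-org/repo alice
#   returns 1
```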

