
test(e2e): add Playwright foundation and PR testing/description skills #30

Merged
nojibe merged 11 commits into main from claude/agent-skills-pr-testing-cd7abs
Jun 30, 2026

nojibe commented Jun 30, 2026


Summary

Stands up a deterministic, version-controlled end-to-end testing layer (@playwright/test) and three Claude Code skills that help test and document changes as PRs are created. The e2e suite is the high-value backbone; the skills drive it.

Motivation / context

There was no e2e layer and no Claude Code skills in the repo. Agentic browser automation alone isn't a substitute for a regression suite, so this adds committed Playwright specs that run in CI, plus skills that orchestrate the existing pnpm checks and author PR descriptions.

Changes

  • E2E foundation
    • playwright.config.ts — boots pnpm dev, pre-warms /about as the readiness probe, retries + trace + video on CI. Auto-detects the sandbox Chromium at /opt/pw-browsers and falls back to the bundled browser locally/in CI.
    • tests/e2e/smoke.spec.ts (+ README.md) — smoke tests on /about, a static, secret-free route, so they pass in CI without storage/API keys.
    • .github/workflows/e2e.yml — runs the suite on PRs/pushes (safe pull_request trigger), uploads the HTML report.
    • scripts/pr-screenshots.mjs — fail-soft full-page screenshotter reused by pr-describe.
    • package.json — adds test:e2e, test:e2e:ui, test:e2e:report, pr:screenshots; @playwright/test@1.55.0 devDep.
  • Skills (.claude/skills/<name>/SKILL.md)
    • pr-check — full quality gate (typecheck, lint, web/CLI tests, blueprint validation) → pass/fail summary.
    • e2e-pr — runs the Playwright suite scoped to a PR's changed routes.
    • pr-describe — structured PR description from the diff, with static-gated, security-guarded before/after screenshots.
  • .gitignore — track .claude/skills, ignore Playwright output dirs.
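
As a rough illustration of the config behaviour described in the E2E foundation bullets, the sketch below builds a plain settings object rather than importing @playwright/test, so it stays self-contained. The field names retries, use.trace, use.video, and webServer mirror real Playwright config options. Everything else is an assumption and not taken from the actual playwright.config.ts: the buildE2eSettings helper, the browsersDir field, port 3000, and the retry count of 2.

```typescript
// Hypothetical helper: the real playwright.config.ts is not shown on this page.
// `exists` is injected so the sketch has no filesystem dependency.
function buildE2eSettings(isCI: boolean, exists: (path: string) => boolean) {
  const sandboxDir = "/opt/pw-browsers"; // path named in the PR description
  const origin = "http://localhost:3000"; // assumed dev-server port

  return {
    retries: isCI ? 2 : 0, // assumed retry count; the PR only says "retries on CI"
    use: {
      baseURL: origin,
      trace: isCI ? "on-first-retry" : "off",
      video: isCI ? "retain-on-failure" : "off",
    },
    webServer: {
      command: "pnpm dev",
      url: `${origin}/about`, // readiness probe, which also pre-warms /about
      reuseExistingServer: !isCI,
    },
    // Not a Playwright option: the directory to look for browsers in, or
    // undefined to let Playwright fall back to its bundled browser.
    browsersDir: exists(sandboxDir) ? sandboxDir : undefined,
  };
}
```

In a real config, existsSync from node:fs could be passed as exists, and the retries, use, and webServer fields handed to defineConfig.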

Test plan

  • pnpm test:e2e ✅ — 2 smoke tests pass (verified locally; sandbox Chromium, dev-server boot, /about render).
  • scripts/pr-screenshots.mjs ✅ — captured a full-page PNG of /about.
  • CI workflow installs Chromium and runs the suite on this PR.

Risks / rollback

Low-risk and additive — no application/runtime code is changed; this only adds test infra, a CI workflow, and skill docs. The new workflow uses pull_request (no secrets, read-only token for forks), not pull_request_target. Revert this PR to undo.

Related Issues

None.


🤖 Generated with Claude Code

claude added 9 commits June 30, 2026 18:10
Five Claude Code skills for quality-gating PRs in this Next.js/TypeScript project:

- test-pr: run Jest tests scoped to files changed vs main
- typecheck-pr: tsc --noEmit, surfaces errors new to the branch
- lint-pr: Next.js ESLint on changed files, with optional --fix
- blueprint-validate: validates Weval blueprint YAML structure and staging limits
- pr-check: orchestrates all checks and produces a structured summary table

Also updates .gitignore to use .claude/* + !.claude/skills so skill files are tracked while the rest of .claude/ stays ignored.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
Change .claude → .claude/* so the !.claude/skills negation takes effect.
Git cannot negate inside an ignored directory; switching to a glob on
the directory's contents allows the exception to work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
Stand up a deterministic, version-controlled e2e layer (best-practice
@playwright/test specs in CI) and two skills that drive it as PRs are created.

Foundation:
- playwright.config.ts: boots `pnpm dev`, pre-warms /about as the readiness
  probe, retries+trace+video on CI. Auto-detects the sandbox Chromium at
  /opt/pw-browsers and falls back to bundled browser locally/in CI.
- tests/e2e/smoke.spec.ts: smoke tests on /about (static, secret-free → green
  in CI without storage/API keys). README documents env caveats + conventions.
- .github/workflows/e2e.yml: runs the suite on PRs/pushes, uploads HTML report.
- scripts/pr-screenshots.mjs: fail-soft full-page screenshotter reused by the
  PR-description skill.
- package.json scripts: test:e2e, test:e2e:ui, test:e2e:report, pr:screenshots.

Skills:
- e2e-pr: run the suite scoped to a PR's changed routes and report.
- pr-describe: generate a PR description from the diff and, for visual changes,
  attach before/after screenshots (base captured in an isolated git worktree).
  Strictly time-boxed and fail-soft — falls back to a text-only description.

Verified locally: both smoke tests pass and the screenshot script captures a
full-page PNG via the sandbox Chromium.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
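
The "strictly time-boxed and fail-soft" behaviour in the commit above can be pictured as a small wrapper that turns any error or timeout into a null result, so the caller can fall back to a text-only description. This is only a sketch: failSoft is a hypothetical name, and scripts/pr-screenshots.mjs may implement the idea quite differently.

```typescript
// Hypothetical wrapper: resolves to the task's value, or to null if the task
// throws, rejects, or takes longer than timeoutMs.
async function failSoft<T>(
  task: () => Promise<T>,
  timeoutMs: number,
): Promise<T | null> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });
  try {
    // Whichever settles first wins; a slow task is simply abandoned.
    return await Promise.race([task(), timeout]);
  } catch {
    return null; // any failure degrades to "no screenshot"
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A caller would treat null as "skip the screenshots and write the description without them". Note that the abandoned task is not cancelled; a real script would also need to close the browser it started.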
The repo is public, so committed screenshots are permanent and world-readable.
Bake in non-negotiable rules: route denylist (/admin*, /api*, auth routes),
secret-free capture env only, no false "delete before merge" comfort, CI-artifact
hosting for anything potentially sensitive, and human review before pushing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
Screenshots only pay off on static, secret-free routes; data-driven pages render
empty/mock against the local dev env. Default to a static allowlist (/about,
/what-is-an-eval) and skip everything else with a note, overridable via --routes.
Filters run in order: route mapping -> security denylist (non-overridable) ->
static gate (overridable) -> dynamic-route skip. If nothing survives, fall back
to a text-only description without booting servers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
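
The filter order in the commit above (route mapping, then security denylist, then static gate, then dynamic-route skip) might look roughly like the following. This is a sketch under several assumptions: the src/app/<segments>/page.tsx layout, the "/auth" and "/login" prefixes standing in for the unspecified auth routes, and the names fileToRoute and planScreenshots are all hypothetical. Only /admin*, /api*, /about, and /what-is-an-eval come from the commit message.

```typescript
type Decision = { route: string; capture: boolean; reason: string };

// /admin* and /api* are from the commit; "/auth" and "/login" are assumed
// placeholders for "auth routes".
const DENY_PREFIXES = ["/admin", "/api", "/auth", "/login"];
const STATIC_ALLOWLIST = ["/about", "/what-is-an-eval"];

// Hypothetical mapping: src/app/<segments>/page.tsx -> /<segments>.
function fileToRoute(file: string): string | null {
  const match = /^src\/app\/(?:(.+)\/)?page\.tsx$/.exec(file);
  if (!match) return null;
  return "/" + (match[1] ?? "");
}

// routesOverride models the --routes flag: it can bypass the static gate,
// but never the security denylist or the dynamic-route skip.
function planScreenshots(
  changedFiles: string[],
  routesOverride: string[] = [],
): Decision[] {
  const mapped = changedFiles
    .map(fileToRoute)
    .filter((route): route is string => route !== null);
  const candidates = Array.from(new Set([...mapped, ...routesOverride]));

  return candidates.map((route): Decision => {
    if (DENY_PREFIXES.some((prefix) => route.startsWith(prefix))) {
      return { route, capture: false, reason: "security denylist" };
    }
    const allowed =
      STATIC_ALLOWLIST.includes(route) || routesOverride.includes(route);
    if (!allowed) {
      return { route, capture: false, reason: "not on static allowlist" };
    }
    if (route.includes("[")) {
      return { route, capture: false, reason: "dynamic route" };
    }
    return { route, capture: true, reason: "ok" };
  });
}
```

If no decision has capture set to true, the caller would fall back to a text-only description without booting any servers.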
Drop test-pr, typecheck-pr, lint-pr, and blueprint-validate. pr-check already
orchestrates typecheck/lint/tests directly via their pnpm commands, so the
standalone variants were redundant. Remaining skills: pr-check, e2e-pr, pr-describe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
Borrowed from common PR-description skills: an explicit blast-radius + how-to-undo
section so reviewers see what to scrutinize. Kept honest — one line for low-risk,
self-contained changes rather than padding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
Related Issues supports Closes/Refs linking (auto-closes issues on merge). The
worked example anchors the model to consistent output and shows all sections
filled in for a small UI change. Test plan + Risks/rollback retained.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
Claude Code only auto-discovers skills at .claude/skills/<name>/SKILL.md with
YAML frontmatter (name + description). The flat .md files were not discoverable.
Move pr-check, e2e-pr, and pr-describe into that structure and add trigger-
oriented descriptions so the harness surfaces them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
railway-app bot temporarily deployed to weval / app-pr-30 on June 30, 2026 20:29 (destroyed)
pnpm/action-setup@v4 errors when both `version` and package.json's
`packageManager` specify a pnpm version. Drop the explicit version: 9 and let
the action read pnpm@9.6.0 from packageManager.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
railway-app bot temporarily deployed to weval / app-pr-30 on June 30, 2026 20:52 (destroyed)
nojibe changed the title from "Add Playwright e2e foundation and PR testing/description skills" to "test(e2e): add Playwright foundation and PR testing/description skills" on Jun 30, 2026
Produce Conventional Commits titles (type(scope): summary), inferring type/scope
from the diff. The skill now sets/updates the PR title as well as the body.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011rZDvWQFXH3m42kYMzEY2g
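
The type(scope): summary shape mentioned in the commit above can be checked and produced with a few lines of code. The sketch below is illustrative only: formatTitle and parseTitle are hypothetical names, the pattern is a simplified reading of the Conventional Commits format, and the skill's inference of type and scope from the diff is not modelled.

```typescript
interface ParsedTitle {
  type: string;
  scope?: string;
  summary: string;
}

// Simplified pattern: type, optional (scope), optional "!", then ": summary".
const TITLE_PATTERN = /^(\w+)(?:\(([^)]+)\))?!?: (.+)$/;

// Build a title; the scope is omitted when it is empty or undefined.
function formatTitle(
  type: string,
  scope: string | undefined,
  summary: string,
): string {
  return scope ? `${type}(${scope}): ${summary}` : `${type}: ${summary}`;
}

// Return the parts of a conforming title, or null if it does not match.
function parseTitle(title: string): ParsedTitle | null {
  const match = TITLE_PATTERN.exec(title);
  if (!match) return null;
  return { type: match[1], scope: match[2], summary: match[3] };
}
```

Under this pattern, the original title of this PR ("Add Playwright e2e foundation and PR testing/description skills") does not parse, while the renamed one does.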
@railway-app railway-app Bot temporarily deployed to weval / app-pr-30 June 30, 2026 20:57 Destroyed
nojibe merged commit 461550b into main on Jun 30, 2026
1 of 2 checks passed