Add config validation and a test suite#39
Merged
Conversation
Addresses #19 by validating viewports (positive-integer dimensions, a required name) and domain format in src/config.mjs, surfacing actionable errors before they fail deep inside a Playwright capture. Addresses #18 by adding a node:test suite covering config normalization and validation, the control move semantics, and the compare diff math (padImage/compareSlug, now exported), plus a Lint-and-test GitHub Actions workflow on Node 20 and 22. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`--concurrency=0` previously flowed into the capture batching loop as the step size, so `i += 0` never advanced and capture hung forever with no error; negatives ran the loop backward to the same effect, and `parseInt` silently truncated `8x` to 8. Extract flag parsing into `src/args.mjs` as a testable `toInt` that uses `Number` (rejecting trailing garbage) and enforces a minimum — 1 for concurrency, 0 for stagger (which legitimately disables staggering). It throws an actionable error caught by the CLI's top-level handler instead of calling process.exit, which makes it unit-testable. Also clamp the batch step to a positive value as defense in depth. Fixes HIGH-004. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`capture` already guarded against a `--only` filter that matched nothing,
but `control` and `compare` applied the same filter with no guard: a typo'd
key (`reglance control blgo`) silently filtered the target list to empty and
printed a success-looking summary ("Promoted 0 capture(s)" / "Nothing to
compare").
Factor the filter into a shared `filterTargets` helper in config.mjs that
throws — listing the known keys to help spot the typo — and use it in all
three commands. capture's behavior is now consistent (a non-zero exit via
the CLI's error handler instead of a swallowed exit 0).
Fixes MED-004.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When the retry loop exhausted its attempts it set `success = true`, logged
"continuing anyway", and screenshotted whatever was on screen — an error
page, a blank page, a half-loaded page — with no signal. The empty
`catch {}` discarded the navigation error, and `capture` printed
"✅ Capture complete" and exited 0 regardless. A silently-captured error
page then becomes a baseline via `control`, defeating the tool's core
promise, with CI staying green.
Per the D-004 decision (permissive + loud):
- The retry loop now records the real error and pushes the slug to a
`failures` list instead of faking success; best-effort screenshotting is
kept but the slug is marked degraded.
- The outer catch records which slug failed rather than swallowing it.
- `capture` prints a loud summary of every degraded slug and tells the
operator to review before running `control`.
- A new `--fail-on-degraded` flag opts into a non-zero exit so CI can gate
on it; the default still exits 0 to preserve partial/best-effort runs.
The exit-policy decision is factored into a pure, unit-tested
`shouldFailRun` helper; the browser-driven path is covered by manual
verification (capture against an unreachable domain).
Fixes HIGH-002.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
control moved whatever captures existed and printed "Promoted N", never
revealing that the controls for missing slugs were left untouched — so a
partial capture (a failed target, or a --only run) silently produced a
baseline mixing captures from different eras, which compare then trusted.
Per the D-004 decision (permissive + loud, with a manifest):
- control now reports "Promoted N of M" and warns when N < M that the
untouched controls are now stale relative to the freshly promoted ones,
and returns { moved, expected }.
- Each promotion is recorded in controls/manifest.json (new src/manifest.mjs)
with a shared per-run timestamp.
- compare reads the manifest and warns when the baseline mixes controls from
more than one control run, naming the oldest slug.
Also fixes LOW-015: an orphan HTML snapshot (HTML present, PNG absent from a
failed prior run) is now removed rather than left to mispair with a future
PNG.
Fixes HIGH-003, LOW-015.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When the control and capture differed in width, compareSlug warned and returned null, omitting the slug entirely — no row, no diff image, no entry in the comparison count. A width change is itself a visual regression (a collapsed layout, an appearing/disappearing scrollbar), so the report showed "all clear" for exactly the pages that changed most, with only a console warning that scrolls past. Generalize padImage to pad onto a common width-and-height canvas (returning the image untouched when it already fits, which also avoids a needless copy), and pad both images that way before pixelmatch. A width change now surfaces as a large diff and the slug appears in the report. The test that asserted a width mismatch returns null is rewritten to assert the new contract. Fixes MED-005. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
compare processed every (target × viewport) pair in a single serial loop, each iteration decoding two PNGs and running pixelmatch synchronously on the main thread — so the whole run was single-core and the event loop was blocked start to finish, while capture (next to it) already ran in parallel. - Run each comparison in a worker_threads pool (src/compare-worker.mjs) that reuses workers off a shared queue, so diffs spread across cores, the main thread stays responsive, and a slow slug doesn't gate the others (HIGH-005). - Bound peak memory by capping the pool: each in-flight comparison holds several large RGBA buffers, so concurrency defaults to CPU-count − 1 and is tunable via --compare-concurrency for very tall pages / constrained runners. padImage now returns the image untouched when no padding is needed, dropping a redundant full-buffer copy (HIGH-006). - Guard the HTML line-diff: skip snapshots over 2MB (which otherwise grind through diffLines for no useful result) and build the diff via array-join instead of repeated string `+=` (MED-008). A full dimension-aware memory budget was left out as a deliberate simplification; the configurable cap plus reduced allocation is the pragmatic bound. An end-to-end compare() test exercises the pool. Fixes HIGH-005, HIGH-006, MED-008. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture ran targets in fixed batches joined by `Promise.all`, with two costs: the stagger delay was `j * staggerDelay` (an absolute delay from batch start), so the last context in each batch waited `(concurrency-1) * stagger` before starting — raising concurrency to go faster grew that tail and re-burst the herd at every batch boundary; and `Promise.all` waited for the whole batch, so one slow target (a page hitting the timeout × retries) blocked the finished contexts from picking up new work. Use a queue-based pool instead: `concurrency` workers each pull the next target as soon as they finish, so throughput is bounded by total work over concurrency rather than by the slowest target per batch, and the stagger is applied once per worker at startup. Failure collection is unchanged. Fixes MED-007. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Each capture stacked three overlapping waits: goto with waitUntil
'networkidle' (already idle-bounded), a fixed-cadence auto-scroll (100ms per
500px step — ~3s to scroll a 15000px page regardless of load speed), and a
hand-rolled busy-poll waitForNetworkIdle that re-confirmed idle with a 3s cap
shorter than the 15s goto timeout (so lazy assets could be cut off).
Reconcile MED-006 (drop the redundant double wait, speed the scroll) and
LOW-013 (the post-scroll cap is too short and not configurable) into one
change:
- autoScroll now scrolls to the bottom and repeats only until the page height
stabilizes (50ms between checks, capped at 100 iterations), so short pages
settle almost instantly and tall ones keep going as content loads.
- Replace the busy-poll with a single event-driven
page.waitForLoadState('networkidle') after scrolling, bounded by a
configurable timeout and best-effort (a never-idle page still screenshots).
- goto and settle timeouts are now configurable via a `timeouts` block in
reglance.json (defaults: goto 15000, settle 8000 — up from the old 3000 cap).
Fixes MED-006, LOW-013.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Capture launched Chromium with --ignore-certificate-errors unconditionally, so TLS validation was off for every run — including when pointed at a public HTTPS host, where a swapped or expired cert would be captured silently. (The ignoreHTTPSErrors flag was also passed to launch(), where it has no effect; it belongs on the browser context.) Per the D-002 decision: ignore certificate errors only for local development hosts (localhost, 127.0.0.1, ::1, and .test/.local/.localhost domains), where self-signed certs are normal. For any other host, validation stays on unless the new --insecure flag is passed, and capture warns when it disables checks for a non-local host. ignoreHTTPSErrors is now correctly applied on newContext. Fixes LOW-002. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
LOW-012: capture and compare wrote artifacts with no handling for write failures, so a full disk or read-only output surfaced as an opaque error (or, in capture, was swallowed). ensureOutputDir now fails fast with a clear message when the output directory is not writable, and compare wraps its diff and html-diff writes to report the path and error code (EACCES/ENOSPC/...) instead of a bare stack. INFO-001: a `paths` value that is an absolute URL bypasses the configured domain and is navigated as-is. Per the D-001 decision (treat config as potentially untrusted) this stays supported but is no longer silent — capture now warns which path keys point off the configured domain, so off-domain (and potentially internal-network) capture is a conscious choice. The hosts are not blocked, since local development legitimately uses loopback/.test addresses; the documentation note covers the remaining surface. Fixes LOW-012, INFO-001. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The postinstall ran `playwright install chromium` directly. Because reglance is a dev dependency, that hook runs in every consuming install, and a failed download (air-gapped runner, blocked CDN, proxy) aborted the consumer's entire `npm ci`. The PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD env var set in CI did nothing, because the explicit `playwright install` command ignores it — so CI also re-downloaded Chromium on every leg. - Move the hook to scripts/postinstall.mjs (added to the published `files`): it returns early when PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD is set, otherwise runs the install and, on failure, prints an actionable message but always exits 0 so it can never break a consumer's install. - capture now catches a missing-browser launch error and tells the user to run `npx playwright install chromium`, instead of a raw Playwright stack. - Pin playwright to an exact version so the installed browser revision is deterministic (Dependabot already bumps it). The CI skip flag in test.yml is now meaningful (the comment is finally true). Fixes HIGH-001. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Update the README and example config for the changes in this branch: new --fail-on-degraded / --insecure / --compare-concurrency flags, the configurable `timeouts` block, the off-domain path warning, and the degraded-capture and partial-promotion (manifest) safeguards. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Node 20 is approaching end-of-life and Node 24 is the current line; test against 22 (active LTS) and 24. engines stays at >=20 — nothing here requires a newer minimum. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The multi-persona code review writes its artifacts under .reviews/; keep them out of git like the other generated output directories. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix audit findings: 6 high + 5 medium + 4 low/info
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #18. Closes #19.
#19 — Configuration validation
Validates config up front in
src/config.mjsso problems surface with an actionable message instead of failing deep inside a Playwright capture:validateViewports): each must be an object with a non-emptynameand positive-integerwidth/height. A typo'd viewport previously flowed untouched intopage.setViewportSize()and threw an opaque error mid-capture.normalizeDomain): rejects empty/non-string domains and validates the result withnew URL()so a malformed domain is caught before navigation.Scoped intentionally: the original issue's JSON-Schema machinery is overkill for this config shape, and network/permission errors are already handled (capture retries and logs per-target;
mkdirSyncthrows clearly). Those are noted on the issue rather than built.#18 — Test suite (docs were already done)
Adds a
node:testsuite (zero new deps, matches the all-ESM setup) —npm testrunsnode --test:test/config.test.mjs— domain normalization, URL building, viewport validation, andloadConfig(missing file, bad JSON, empty paths, domain override, viewport defaults, pixelmatch merge).test/control.test.mjs— thecontrolmove semantics (captures are moved, not copied), missing-capture skips, and the--onlyfilter.test/compare.test.mjs—padImageandcompareSlug(now exported) against in-memory PNGs: identical → 0%, opposite → 100%, height-mismatch padding, and the missing-control / width-mismatch skips.Tier 3 (Playwright
capture()) is intentionally not automated — it needs a live server + browser and would be flaky. CI runs lint + tests on Node 20 and 22 via.github/workflows/test.yml, withPLAYWRIGHT_SKIP_BROWSER_DOWNLOADset since the unit tests don't need Chromium.All 33 tests pass; lint is clean (the only remaining warnings are pre-existing
consolestatements).🤖 Generated with Claude Code