Discussion: cross-browser visual testing — Playwright expansion + BrowserStack as upgrade path #311

@mmcky

Context

QuantEcon's Sphinx/Jupyter Book sites are now under Playwright-based visual regression testing — see QuantEcon/quantecon-book-theme#390 for the migration to a fixtures-based setup. The current configuration covers two viewports on Chromium only (Desktop Chrome 1280×720 + Pixel 5 emulation 393×851), running headless on Ubuntu.

Issue quantecon-book-theme#387 and DrDrij's comment with GA4 data raised a broader question: what cross-browser / cross-resolution coverage do we actually need across QuantEcon properties? Real visitor data shows a much wider distribution of browsers and resolutions than our two CI viewports cover, and the bug that motivated #387 (RHS TOC overflow on long lectures) was originally observed on real desktop Safari.

This issue is for cross-org alignment on testing strategy. It affects quantecon-book-theme first, but every QuantEcon site renders through the theme, so the decision matters for all of them.

Two paths, sequenced

Near-term: expand Playwright (free, low effort)

Playwright already bundles WebKit and Firefox engines and supports any viewport + ~100 device emulation profiles via its devices registry. The current setup uses only Chromium on two viewports — leaving most of Playwright's free capability on the table.

Concrete additions worth making in quantecon-book-theme/playwright.config.ts after #390 lands:

| Addition | What it catches | Cost |
|---|---|---|
| webkit project | ~80% of Safari-specific CSS bugs (scroll quirks, sticky positioning, some font rendering); not identical to real iOS Safari but close | +1 project in config, ~2× CI runtime, more snapshots to maintain |
| firefox project | Gecko-specific rendering differences (text wrap, table layout edge cases) | Same |
| 1920×1080 desktop project | Largest single resolution per GA4; currently untested entirely | +1 project, more snapshots |
| 1440×900 desktop project | Common laptop resolution | Same |
| iPhone 12 / iPad device profiles | Mobile iOS coverage (emulated) | +2 projects |

This would take CI coverage from 2 viewport/engine combinations to 7–8. Snapshot regeneration is a one-time cost; ongoing maintenance is the regenerate-when-styling-changes loop we already have via /update-snapshots.
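As a rough illustration of the table above, the expanded project list might look something like the sketch below. This is not the actual quantecon-book-theme config: the project names are hypothetical, the choice of `iPad (gen 7)` is an assumption (the table only says "iPad"), and existing settings such as the test directory and snapshot paths are omitted.

```typescript
// playwright.config.ts (sketch only; project names are hypothetical)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Current coverage: Chromium on two viewports.
    { name: 'desktop-chrome', use: { ...devices['Desktop Chrome'] } },
    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },

    // Additional engines bundled with Playwright.
    { name: 'desktop-webkit', use: { ...devices['Desktop Safari'] } },
    { name: 'desktop-firefox', use: { ...devices['Desktop Firefox'] } },

    // Additional desktop resolutions: override the profile's default viewport.
    {
      name: 'desktop-chrome-1920',
      use: { ...devices['Desktop Chrome'], viewport: { width: 1920, height: 1080 } },
    },
    {
      name: 'desktop-chrome-1440',
      use: { ...devices['Desktop Chrome'], viewport: { width: 1440, height: 900 } },
    },

    // Emulated iOS devices (WebKit engine, not real iOS Safari).
    { name: 'iphone-12', use: { ...devices['iPhone 12'] } },
    { name: 'ipad', use: { ...devices['iPad (gen 7)'] } }, // assumed iPad profile
  ],
});
```

With the two existing projects plus the six additions, this gives eight projects. Each one keeps its own snapshot set, which is where the extra maintenance cost in the table comes from.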

Long-term: BrowserStack (paid, when justified)

What Playwright fundamentally can't give us, no matter the config:

  • Real iOS Safari on actual iPhones (Playwright WebKit-on-Linux ≠ real iOS — different font rasterization, real touch behavior, real iOS-only CSS quirks)
  • Real Edge on Windows
  • Real Samsung Internet on Samsung hardware
  • OS-level font rendering differences — macOS, Windows, iOS, Android each rasterize text differently

BrowserStack (or an equivalent such as Sauce Labs or LambdaTest) plugs into Playwright via its Node SDK (the browserstack-node-sdk npm package) and lets us add real-device projects to CI.

When this becomes worth the cost:

  • Playwright WebKit consistently fails to catch a Safari regression that real iOS users hit
  • A UX-impacting PR needs high-fidelity sign-off across a known target matrix
  • Lecture pages render visibly differently on Samsung Internet (used by a large fraction of our African audience per GA4)

When it's not worth it:

  • Most CSS spacing / colour / typography regressions: Playwright Chromium already catches these
  • Theme-level structural bugs: already covered by synthetic fixtures + region-level snapshots
  • Once-per-PR automated gating: cost grows fast (per-minute parallel sessions)

Suggested decision sequence

  1. Land near-term Playwright expansion in quantecon-book-theme (separate PR after #390). Cost: one config PR + one snapshot regeneration cycle.
  2. Track for ~2 months — how often does WebKit/Firefox catch something Chromium missed? How often does a real-device bug still slip through?
  3. Then decide on BrowserStack with actual data on the gap. If the gap is frequent, adopt it for an on-demand smoke workflow (manual trigger, not a CI gate). If it is rare, manual real-device testing on the Netlify preview for high-stakes PRs is cheaper.
  4. Document the chosen setup here on meta so other QuantEcon repos (lecture sites that consume the theme) can adopt the same pattern.
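For the tracking period in step 2, one possible way to count how often WebKit/Firefox catch something Chromium missed is sketched below. The `SnapshotResult` record shape and the `engineOnlyFailures` helper are hypothetical; in practice the input would need to be derived from whatever reporter output the CI run produces.

```typescript
type Engine = 'chromium' | 'firefox' | 'webkit';

// Hypothetical record: one visual test outcome on one engine.
interface SnapshotResult {
  testName: string;
  engine: Engine;
  passed: boolean;
}

// Returns (sorted) names of tests that passed on every Chromium run
// but failed on at least one non-Chromium engine.
function engineOnlyFailures(results: SnapshotResult[]): string[] {
  const byTest = new Map<string, SnapshotResult[]>();
  for (const r of results) {
    const bucket = byTest.get(r.testName) ?? [];
    bucket.push(r);
    byTest.set(r.testName, bucket);
  }

  const caught: string[] = [];
  byTest.forEach((runs, testName) => {
    const chromiumRuns = runs.filter((r) => r.engine === 'chromium');
    const chromiumPassed =
      chromiumRuns.length > 0 && chromiumRuns.every((r) => r.passed);
    const otherFailed = runs.some((r) => r.engine !== 'chromium' && !r.passed);
    if (chromiumPassed && otherFailed) caught.push(testName);
  });
  return caught.sort();
}
```

Tallying this per PR over the two months would give a simple number for the first question in step 2. Real-device bugs that slip through entirely would still need to be tracked by hand, since no emulated engine reports them.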

Repos affected

  • QuantEcon/quantecon-book-theme — primary; visual regression already running
  • All lecture sites that consume the theme (QuantEcon/lecture-python-programming.myst, lecture-python.myst, lecture-jax, lecture-python-intro, etc.): these would benefit indirectly from theme-level regression coverage, but could also adopt their own per-site Playwright suites

Cross-references

cc @DrDrij @jstac @mmcky
