fix(web): scroll guard setItem unwrap race (#716) by heavygee · Pull Request #717 · tiann/hapi

heavygee · 2026-05-27T14:40:04Z

Summary

#707 regression: writeScrollRestorationCache temporarily set sessionStorage.setItem = originalSetItem while syncing TanStack's in-memory scroll map. Concurrent scrollRestorationCache.set calls (throttled scroll handler, router onRendered) could hit the unwrapped native setItem during that window and throw an uncaught QuotaExceededError (#716).
Fix: Remove the unwrap entirely. At bootstrap, patch scrollRestorationCache.set to update local cache state and persist only through the guarded storage.setItem wrapper. Hard-reset and post-prune sync both use the patched setter; a recoveryDepth guard prevents nested hard-reset loops (nested fallback uses originalSetItem, not the wrapper).
Tests: 17 unit tests (15 guard + 2 legacy positive repro).
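
The fix described above can be sketched roughly as follows. This is a minimal illustration, not the code in web/src/lib/scrollStorageGuard.ts: `FakeStorage`, `installGuard`, the `'scroll-cache'` key and the flat `ScrollCache` shape are all hypothetical stand-ins. It only shows the idea that the wrapper is installed once and that the patched setter persists exclusively through it.

```typescript
type ScrollCache = Record<string, number>

const STORAGE_KEY = 'scroll-cache' // hypothetical key, not TanStack's real one

// Hypothetical stand-in for sessionStorage: rejects any value over `quota` chars.
class FakeStorage {
    private data: Record<string, string> = {}
    private quota: number
    constructor(quota: number) {
        this.quota = quota
    }
    setItem(key: string, value: string): void {
        if (value.length > this.quota) {
            const err = new Error('quota exceeded')
            err.name = 'QuotaExceededError'
            throw err
        }
        this.data[key] = value
    }
    getItem(key: string): string | null {
        const value = this.data[key]
        return value === undefined ? null : value
    }
}

function installGuard(storage: FakeStorage) {
    const nativeSetItem = storage.setItem.bind(storage)
    let cache: ScrollCache = {}

    // Guarded wrapper: installed once and never swapped back to the native setter.
    storage.setItem = (key: string, value: string): void => {
        try {
            nativeSetItem(key, value)
        } catch {
            if (key !== STORAGE_KEY) return
            cache = {} // hard reset: memory and persisted state both become empty
            try {
                nativeSetItem(key, '{}')
            } catch {
                // storage is unusable; give up without throwing
            }
        }
    }

    // Patched setter: update local state, then persist only via the wrapper.
    const set = (update: (prev: ScrollCache) => ScrollCache): void => {
        cache = update(cache)
        storage.setItem(STORAGE_KEY, JSON.stringify(cache))
    }

    return { set, get: (): ScrollCache => cache }
}
```

Because there is no point at which `storage.setItem` refers to the native function, a write that arrives while a sync is in progress still goes through the guarded path.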

Verification

Full write-up: #716 (comment)

Positive unit repro (deterministic): scrollStorageGuard.legacy-regression.test.ts re-implements the #707 unwrap window and proves it throws QuotaExceededError through native setItem; the #717 patched path survives the same quota pressure.
Live upstream/main stress (intermittent): Playwright against parallel local hubs filled sessionStorage to quota and stress-scrolled. An uncaught pageerror did not occur on every run, because the production bug is a timing race. The production stack traces in #716 remain the real-world proof that upstream was broken.
Dev harnesses: scripts/dev/scroll-quota-repro-playwright.mjs, scroll-quota-prove-positive.mjs, scroll-quota-ab-compare.mjs. Same scripts/dev/ pattern as read-hapi-web.mjs (#544); defaults to http://127.0.0.1:3006. Not wired to CI.
Operator deploy: fix built and tested on a private stack before merge; no white-screen recurrence under normal session chat use.
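
For readers who want the shape of the deterministic repro without opening the test file, here is a minimal sketch of the unwrap window. It is not the contents of scrollStorageGuard.legacy-regression.test.ts; `makeFullStorage`, `installWrapper`, `legacySync` and `attempt` are hypothetical names, and the "concurrent" write is simulated as a callback that runs while the native setter is restored.

```typescript
type SetItem = (key: string, value: string) => void

interface StorageLike {
    setItem: SetItem
}

// Hypothetical storage that is always at quota, so every native write throws.
function makeFullStorage(): StorageLike {
    return {
        setItem: (): void => {
            const err = new Error('quota exceeded')
            err.name = 'QuotaExceededError'
            throw err
        },
    }
}

// Install a guard that swallows quota errors; return the native setter.
function installWrapper(storage: StorageLike): SetItem {
    const native = storage.setItem
    storage.setItem = (key, value) => {
        try {
            native(key, value)
        } catch {
            // guarded path: the error never reaches the caller
        }
    }
    return native
}

// Legacy-style sync: the native setter is restored for the duration of `during`.
function legacySync(storage: StorageLike, native: SetItem, during: () => void): void {
    const wrapped = storage.setItem
    storage.setItem = native
    try {
        during()
    } finally {
        storage.setItem = wrapped
    }
}

// Returns the name of the error that escaped, or null if nothing escaped.
function attempt(mode: 'legacy' | 'patched'): string | null {
    const storage = makeFullStorage()
    const native = installWrapper(storage)
    // Stands in for a throttled scroll handler writing during the sync.
    const concurrentWrite = (): void => storage.setItem('scroll', '{"y":1}')
    try {
        if (mode === 'legacy') legacySync(storage, native, concurrentWrite)
        else concurrentWrite()
        return null
    } catch (err) {
        return err instanceof Error ? err.name : 'unknown'
    }
}
```

In this sketch `attempt('legacy')` lets the error escape because the write lands while the wrapper is removed, while `attempt('patched')` keeps the wrapper in place throughout.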

Test plan

cd web && bun run test src/lib/scrollStorageGuard.test.ts src/lib/scrollStorageGuard.legacy-regression.test.ts (17/17 pass)
bun typecheck (repo root)
Playwright dev scripts committed (scripts/dev/scroll-quota-*.mjs)
Manual: long-lived browser profile, scroll + navigate near quota

Issues

Fixes #716

Patch scrollRestorationCache.set to persist only through the guarded sessionStorage wrapper instead of temporarily restoring native setItem. Adds concurrent-write tests for the tiann#707 regression. Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions

Findings

[Major] Nested recovery can recursively re-enter the guarded setter when the fallback {} write also fails, risking a stack overflow/freezing the UI on storage-disabled or still-full sessionStorage. Evidence: web/src/lib/scrollStorageGuard.ts:71.
Suggested fix:
```
if (recoveryDepth.current > 0) {
    try {
        originalSetItem.call(storage, STORAGE_KEY, '{}')
    } catch {
        // nested recovery already in progress
    }
    return
}
```

Summary
Review mode: initial
One issue found in the new nested recovery path. The focused tests/typecheck were not run successfully here because bun is not installed in this runner.

Testing
Not run (automation: bun command unavailable in this environment).

HAPI Bot

github-actions · 2026-05-27T14:46:39Z

    // write throws again. Clear the library cache so persisted size matches.
+    if (recoveryDepth.current > 0) {
+        try {
+            storage.setItem(STORAGE_KEY, '{}')


[Major] Nested recovery can re-enter the guarded setter

When scrollRestorationCache.set(() => ({})) is already inside recovery, this branch sends the fallback write through storage.setItem again. If sessionStorage still rejects the tiny {} write, wrappedSetItem catches it, retries, calls hardResetScrollRestorationPersistedState, and lands back in this same recoveryDepth.current > 0 branch. That can recurse until the stack overflows instead of cleanly giving up.

Suggested fix:

    if (recoveryDepth.current > 0) {
        try {
            originalSetItem.call(storage, STORAGE_KEY, '{}')
        } catch {
            // nested recovery already in progress
        }
        return
    }
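
To make the re-entry concrete, the sketch below models the two variants of the nested fallback against a storage that rejects every write. It is an illustration under assumptions, not the PR's code: `buildGuard`, `MAX_CALLS` and the call counter are hypothetical, and the cap exists only so the demo stops instead of overflowing the stack.

```typescript
const KEY = 'scroll-cache' // hypothetical key
const MAX_CALLS = 100 // demo-only cap so the buggy variant terminates

function buildGuard(useOriginalInNestedFallback: boolean) {
    let wrapperCalls = 0
    const recoveryDepth = { current: 0 }

    // Storage that is disabled or still full: every native write is rejected.
    function originalSetItem(_key: string, _value: string): void {
        throw new Error('QuotaExceededError')
    }

    function hardReset(): void {
        if (recoveryDepth.current > 0) {
            try {
                if (useOriginalInNestedFallback) originalSetItem(KEY, '{}')
                else wrappedSetItem(KEY, '{}') // re-enters the guard
            } catch {
                // nested recovery already in progress
            }
            return
        }
        recoveryDepth.current += 1
        try {
            wrappedSetItem(KEY, '{}')
        } finally {
            recoveryDepth.current -= 1
        }
    }

    function wrappedSetItem(key: string, value: string): void {
        wrapperCalls += 1
        if (wrapperCalls > MAX_CALLS) throw new Error('runaway recursion (demo cap)')
        try {
            originalSetItem(key, value)
        } catch {
            hardReset()
        }
    }

    return { write: wrappedSetItem, calls: (): number => wrapperCalls }
}
```

With the fallback routed through the wrapper, each rejected `{}` write triggers another hard reset and the call count climbs until the cap; routed through the original setter, the nested branch fails once and returns.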

Document how tiann#716 was verified: Vitest positive repro of the tiann#707 unwrap race, plus Playwright dev harnesses for live A/B stress (same pattern as scripts/dev/read-hapi-web.mjs from tiann#544). Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions

Findings

[Major] Nested recovery still re-enters the guarded setter — this remains on the current PR head. If the fallback {} write is rejected while recoveryDepth.current > 0, storage.setItem calls the wrapper again and can recurse until stack overflow/UI freeze. Evidence web/src/lib/scrollStorageGuard.ts:71.
Suggested fix:
```
if (recoveryDepth.current > 0) {
    try {
        originalSetItem.call(storage, STORAGE_KEY, '{}')
    } catch {
        // nested recovery already in progress
    }
    return
}
```
[Major] Hard-coded non-local default can expose HAPI_ACCESS_TOKEN — when the env token is present, the script stores it for BASE_URL and then navigates there; without HAPI_URL, that target is a committed external origin. Evidence scripts/dev/scroll-quota-repro-playwright.mjs:14.
Suggested fix:
```
const BASE_URL = process.env.HAPI_URL
if (!BASE_URL) {
    console.error('HAPI_URL is required')
    process.exit(2)
}
```
[Minor] A/B parser selects an inner JSON object — the repro prints pretty JSON with nested objects, so lastIndexOf('{') starts inside the payload and JSON.parse sees trailing braces. The harness then reports parseError instead of the child result. Evidence scripts/dev/scroll-quota-ab-compare.mjs:57.
Suggested fix:
```
const stdout = proc.stdout.trim()
parsed = stdout
    ? JSON.parse(stdout)
    : { ok: false, parseError: true, stdout: '', stderr: proc.stderr }
```

Summary
Review mode: follow-up after new commits
The previous nested recovery issue is still present, and the added dev repro tooling has two correctness/security problems that can make verification misleading or leak a token to the wrong origin.

Testing
Not run (automation: bun is unavailable in this runner). git diff --check HEAD^ HEAD -- ... passed.

HAPI Bot

github-actions · 2026-05-28T03:04:41Z

    // write throws again. Clear the library cache so persisted size matches.
+    if (recoveryDepth.current > 0) {
+        try {
+            storage.setItem(STORAGE_KEY, '{}')


[MAJOR] Nested recovery still re-enters the guarded setter

This fallback runs only while recovery is already active, but it calls storage.setItem, which is the wrapped setter on real sessionStorage. If the tiny {} fallback is still rejected, the wrapper catches it, calls hardResetScrollRestorationPersistedState, returns to this same branch, and can recurse until stack overflow/UI freeze.

Suggested fix:

    if (recoveryDepth.current > 0) {
        try {
            originalSetItem.call(storage, STORAGE_KEY, '{}')
        } catch {
            // nested recovery already in progress
        }
        return
    }

github-actions · 2026-05-28T03:04:41Z

+import { mkdirSync, writeFileSync } from 'node:fs'
+import { resolve } from 'node:path'
+
+const BASE_URL = process.env.HAPI_URL ?? 'https://hapi.tail9944ee.ts.net'


[MAJOR] Do not default a token-bearing repro to a non-local hub

When HAPI_ACCESS_TOKEN is set, the script stores it for BASE_URL before navigating there. With this hard-coded non-local default, running the repro without HAPI_URL sends whatever token is in the environment to that origin. Make the target explicit, or default to localhost only.

Suggested fix:

    const BASE_URL = process.env.HAPI_URL
    if (!BASE_URL) {
        console.error('HAPI_URL is required')
        process.exit(2)
    }
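
An alternative to requiring HAPI_URL is to default to a loopback address, which is the behaviour the PR description mentions (http://127.0.0.1:3006). The sketch below illustrates that option; `resolveBaseUrl`, `isLoopback` and the `Env` shape are hypothetical helpers, not code from the repro scripts.

```typescript
interface Env {
    HAPI_URL?: string
    HAPI_ACCESS_TOKEN?: string
}

const LOCAL_DEFAULT = 'http://127.0.0.1:3006'

// True only for http(s) URLs whose host is a loopback name or address.
function isLoopback(url: string): boolean {
    return /^https?:\/\/(127\.0\.0\.1|localhost|\[::1\])(:\d+)?(\/|$)/.test(url)
}

// A non-local target is used only when the operator names it explicitly.
function resolveBaseUrl(env: Env): string {
    const explicit = env.HAPI_URL
    if (explicit !== undefined && explicit !== '') return explicit
    return LOCAL_DEFAULT
}
```

With this shape, a token found in the environment can only reach a remote origin if the operator also set HAPI_URL to that origin.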

github-actions · 2026-05-28T03:04:41Z

+    )
+    let parsed = null
+    try {
+        const jsonStart = proc.stdout.lastIndexOf('{')


[MINOR] Parse the child result from stdout, not the last {

The repro emits pretty-printed JSON containing nested objects. lastIndexOf('{') starts at an inner object, so JSON.parse(proc.stdout.slice(jsonStart)) sees extra closing braces and falls into parseError, making the A/B harness report no verification even when the child script succeeded.

Suggested fix:

    const stdout = proc.stdout.trim()
    parsed = stdout
        ? JSON.parse(stdout)
        : { ok: false, parseError: true, stdout: '', stderr: proc.stderr }
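
The failure is easy to reproduce in isolation. In the sketch below, `parseFromLastBrace` mirrors the lastIndexOf('{') approach and `parseWholeStdout` mirrors the suggested fix; both function names and the sample payload are hypothetical, and the only assumption is that the child prints a single pretty-printed JSON document.

```typescript
// Slice from the last '{' and parse: this lands inside the payload whenever
// the JSON contains a nested object.
function parseFromLastBrace(stdout: string): unknown {
    const jsonStart = stdout.lastIndexOf('{')
    return JSON.parse(stdout.slice(jsonStart))
}

// Parse the whole trimmed stdout; report a parse error instead of throwing.
function parseWholeStdout(stdout: string, stderr: string): unknown {
    const trimmed = stdout.trim()
    if (trimmed === '') return { ok: false, parseError: true, stdout: '', stderr }
    try {
        return JSON.parse(trimmed)
    } catch {
        return { ok: false, parseError: true, stdout: trimmed, stderr }
    }
}

// Sample child output: pretty-printed JSON with one nested object.
const sampleStdout = JSON.stringify({ ok: true, counts: { pageErrors: 0 } }, null, 2) + '\n'
```

On `sampleStdout`, the last `{` belongs to `counts`, so the slice ends with an extra closing brace and JSON.parse rejects it, while parsing the whole string returns the child result.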

Use originalSetItem in nested recovery to avoid re-entering the guard, fix legacy-regression test types, default repro script to localhost, and parse A/B harness stdout correctly. Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions

Findings
No findings at >=80% confidence on the fresh head. Prior HAPI Bot findings for nested recovery recursion, non-local token-bearing repro default, and A/B JSON parsing appear addressed in 6984aca.

Summary
Review mode: follow-up after new commits
Full latest diff reviewed. Residual risk: the scroll guard relies on TanStack Router internals (scrollRestorationCache, functionalUpdate, and tsr-scroll-restoration-v1_3), so future router upgrades need this guard and tests revisited.

Testing
Not run (automation: bun is unavailable in this runner). Source review only; PR reports targeted web tests and root typecheck passing.

HAPI Bot

swear01 · 2026-05-28T08:56:41Z

Hi @heavygee, thanks for the thorough fix and the regression repros — the recoveryDepth guard is a nice touch.

Just a heads-up: there's an overlapping PR #722 that takes a different approach — upgrading @tanstack/react-router to ^1.170.8. In @tanstack/router-core >=1.145.6, scrollRestorationCache is no longer part of the public exports, so the in-memory cache sync path (and the unwrap race it introduced) goes away entirely. Upstream also ships its own try-catch around setItem from that version, so crashes are covered at the library level too.

The tradeoff with #722 is that after a prune the in-memory cache may be momentarily stale (scroll position could be wrong on the next nav), whereas #717 keeps it in sync. If the maintainer prefers the sync guarantee, #717 is the right call — but it would need to account for scrollRestorationCache being missing in newer versions.

Happy to coordinate whichever direction gets picked.

heavygee · 2026-05-28T09:32:41Z

Hi @swear01, thanks for the heads-up and for laying out the tradeoffs so clearly. Really helpful.

I wasn't aware of #722 when I opened this; good to know there's another path in flight.

My read (could be wrong): #717 was mostly trying to close the #707 regression, the temporary setItem unwrap that let concurrent TanStack writes hit native storage and throw uncaught (#716). The recoveryDepth / cache.set patch was my attempt to keep RAM and disk in sync after a prune, which I think was the gap behind #708, but I'm not deeply familiar with TanStack's internals and may be overfitting to what I saw in prod.

If #722's upgrade removes the public scrollRestorationCache export and wraps persist in try/catch, that sounds like it eliminates the failure mode I patched at the source, which I'd honestly prefer to fighting library internals I don't fully understand. I suspect my cache.set patch wouldn't survive that bump anyway, so I'm not attached to that specific mechanism.

The bit I'm less sure about: whether prune-only (without clearing whatever the library keeps in memory on newer versions) is enough to stop repeat quota pressure, or if we'd mostly be trading crashes for silent persist failures / occasional wrong scroll. I haven't tested against 1.170+ myself. That's speculation on my part.

Happy to go whichever way the maintainer prefers:

#717 as a short-term fix on the current pin if they want something targeted now, or
rebase/coordinate toward #722 and trim this down to proactive pruning + tests on the new API (storageKey etc.), dropping the cache patch.

I can help with either direction or close #717 if #722 covers it. Just didn't want #716 to sit open while I figured out overlap.

Thanks again for offering to coordinate.

swear01 · 2026-05-28T12:15:18Z

Thanks for the detailed breakdown — really appreciate the openness.

To answer your question about repeat quota pressure: in >=1.145.6, TanStack only calls setItem once per page lifetime — in the pagehide event listener (L237–L244), via persistScrollRestorationCache (L42–L52). Navigations only update the in-memory scrollRestorationCache object — no repeated write pressure during a session.

At pagehide, the in-memory cache may have grown large, so our guard intercepts that single write, prunes to 50 entries, and retries — sessionStorage ends up with the 50 most recent positions instead of silently losing everything (which is what upstream's own try-catch does).

The scroll tradeoff is narrow: during a session onRendered reads directly from the in-memory cache (L320), not from sessionStorage, so scroll restoration within a session is unaffected. The pruned entries only matter after a full page reload, where a pruned route scrolls to 0 — same as pre-#707 behaviour.
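
The prune-and-retry behaviour described above might look roughly like this. It is a sketch under assumptions rather than the guard in #722: the flat `ScrollCache` shape, `pruneToMostRecent`, `persistWithPrune` and the use of key insertion order as a proxy for recency are all hypothetical.

```typescript
type ScrollEntry = { scrollX: number; scrollY: number }
type ScrollCache = Record<string, ScrollEntry>

const MAX_ENTRIES = 50

// Keep the last `keep` keys. Assumes non-numeric string keys, for which
// Object.keys preserves insertion order.
function pruneToMostRecent(cache: ScrollCache, keep: number): ScrollCache {
    const next: ScrollCache = {}
    for (const key of Object.keys(cache).slice(-keep)) {
        const entry = cache[key]
        if (entry !== undefined) next[key] = entry
    }
    return next
}

// Try the full write once; on failure prune and retry once.
// Returns what was persisted, or null if both attempts were rejected.
function persistWithPrune(
    cache: ScrollCache,
    setItem: (value: string) => void,
): ScrollCache | null {
    try {
        setItem(JSON.stringify(cache))
        return cache
    } catch {
        const pruned = pruneToMostRecent(cache, MAX_ENTRIES)
        try {
            setItem(JSON.stringify(pruned))
            return pruned
        } catch {
            return null
        }
    }
}
```

The in-memory object passed in is left untouched, which matches the point that restoration within a session reads from memory and only a full reload sees the pruned set.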

I think #722 covers the failure mode you were targeting. Feel free to close #717 — and thanks again for the regression tests and repro scripts, they're useful reference regardless of which approach lands.

heavygee · 2026-05-28T16:01:08Z

Closing in favour of #722 — @swear01's breakdown convinced me the router upgrade is the better long-term fix, and my cache.set patch would not survive that bump anyway.

Thanks again for the thoughtful review and for answering the repeat-quota question; really appreciated.

What I am doing locally: keeping fix/web-scroll-guard-unwrap-race as a driver-manifest branch layer until #722 lands on upstream/main, then dropping it and rebuilding soup from main. Daily driver hit the full white-screen QuotaExceededError again without that layer (expected on current upstream #707 code).

Happy to help if anything from the regression tests or repro scripts is useful for #722.

github-actions Bot reviewed May 27, 2026

heavygee mentioned this pull request May 28, 2026

fix(web): #707 scroll guard still throws uncaught QuotaExceededError (setItem unwrap race) #716

Open

github-actions Bot reviewed May 28, 2026

swear01 mentioned this pull request May 28, 2026

chore(web): upgrade @tanstack/react-router to ^1.170.8 #722

Merged

heavygee mentioned this pull request May 28, 2026

Hub: background session notes without CLI dispatch + web normalize fix heavygee/hapi#4

Open

3 tasks

heavygee closed this May 28, 2026

heavygee wants to merge 3 commits into tiann:main from heavygee:fix/web-scroll-guard-unwrap-race

2 participants