
fix(web): scroll guard setItem unwrap race (#716) #717

Closed
heavygee wants to merge 3 commits into tiann:main from heavygee:fix/web-scroll-guard-unwrap-race

Conversation

@heavygee
Contributor

@heavygee heavygee commented May 27, 2026

Summary

  • Regression from #707 (fix(web): reset scroll restoration when sessionStorage quota hits): writeScrollRestorationCache temporarily set sessionStorage.setItem = originalSetItem while syncing TanStack's in-memory scroll map. Concurrent scrollRestorationCache.set calls (throttled scroll handler, router onRendered) could hit the unwrapped native setItem during that window and throw an uncaught QuotaExceededError (#716).
  • Fix: Remove the unwrap entirely. At bootstrap, patch scrollRestorationCache.set to update local cache state and persist only through the guarded storage.setItem wrapper. Hard-reset and post-prune sync both use the patched setter; a recoveryDepth guard prevents nested hard-reset loops (nested fallback uses originalSetItem, not the wrapper).
  • Tests: 17 unit tests (15 guard + 2 legacy positive repro).
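
To make the unwrap window concrete, here is a minimal sketch of the failure mode described above. It is not the code from scrollStorageGuard.ts: createTinyStore, installGuard, and syncWithUnwrap are hypothetical names, and the store is a small stand-in for sessionStorage that rejects oversized values.

```typescript
type SetItem = (key: string, value: string) => void

// Hypothetical stand-in for sessionStorage: rejects any value longer than maxChars.
function createTinyStore(maxChars: number) {
    const saved = new Map<string, string>()
    const store = {
        setItem(key: string, value: string): void {
            if (value.length > maxChars) {
                const error = new Error('quota exceeded')
                error.name = 'QuotaExceededError'
                throw error
            }
            saved.set(key, value)
        },
    }
    return { store, saved }
}

// Install a guard that swallows quota errors and falls back to an empty map.
function installGuard(store: { setItem: SetItem }) {
    const originalSetItem: SetItem = store.setItem.bind(store)
    const wrappedSetItem: SetItem = (key, value) => {
        try {
            originalSetItem(key, value)
        } catch {
            try {
                originalSetItem(key, '{}')
            } catch {
                // Storage is unusable; give up quietly.
            }
        }
    }
    store.setItem = wrappedSetItem
    return { originalSetItem, wrappedSetItem }
}

// Legacy pattern: unwrap, run the sync, rewrap.
// Any write issued while `sync` runs reaches the unguarded setter.
function syncWithUnwrap(
    store: { setItem: SetItem },
    guard: { originalSetItem: SetItem; wrappedSetItem: SetItem },
    sync: () => void,
): void {
    store.setItem = guard.originalSetItem
    try {
        sync()
    } finally {
        store.setItem = guard.wrappedSetItem
    }
}
```

In this model, an oversized write issued inside syncWithUnwrap throws to the caller, while the same write issued with the wrapper left in place is absorbed and an empty map is persisted instead.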

Verification

Full write-up: #716 (comment)

Test plan

  • cd web && bun run test src/lib/scrollStorageGuard.test.ts src/lib/scrollStorageGuard.legacy-regression.test.ts (17/17 pass)
  • bun typecheck (repo root)
  • Playwright dev scripts committed (scripts/dev/scroll-quota-*.mjs)
  • Manual: long-lived browser profile, scroll + navigate near quota

Issues

Fixes #716

Patch scrollRestorationCache.set to persist only through the guarded
sessionStorage wrapper instead of temporarily restoring native setItem.
Adds concurrent-write tests for the tiann#707 regression.

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment


Findings

  • [Major] Nested recovery can recursively re-enter the guarded setter when the fallback {} write also fails, risking a stack overflow/freezing the UI on storage-disabled or still-full sessionStorage. Evidence: web/src/lib/scrollStorageGuard.ts:71.
    Suggested fix:
    if (recoveryDepth.current > 0) {
        try {
            originalSetItem.call(storage, STORAGE_KEY, '{}')
        } catch {
            // nested recovery already in progress
        }
        return
    }

Summary
Review mode: initial
One issue found in the new nested recovery path. The focused tests/typecheck were not run successfully here because bun is not installed in this runner.

Testing
Not run (automation: bun command unavailable in this environment).

HAPI Bot

Comment thread web/src/lib/scrollStorageGuard.ts Outdated
// write throws again. Clear the library cache so persisted size matches.
if (recoveryDepth.current > 0) {
    try {
        storage.setItem(STORAGE_KEY, '{}')

[Major] Nested recovery can re-enter the guarded setter

When scrollRestorationCache.set(() => ({})) is already inside recovery, this branch sends the fallback write through storage.setItem again. If sessionStorage still rejects the tiny {} write, wrappedSetItem catches it, retries, calls hardResetScrollRestorationPersistedState, and lands back in this same recoveryDepth.current > 0 branch. That can recurse until the stack overflows instead of cleanly giving up.

Suggested fix:

if (recoveryDepth.current > 0) {
    try {
        originalSetItem.call(storage, STORAGE_KEY, '{}')
    } catch {
        // nested recovery already in progress
    }
    return
}
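
The recursion described in this finding can be reproduced with a simplified model. The sketch below is an illustration under assumptions, not the PR's code: simulate, hardReset, and MAX_CALLS are hypothetical, the storage rejects every write, and a call cap stands in for the stack overflow so that the demo terminates.

```typescript
type SetItem = (key: string, value: string) => void

const STORAGE_KEY_DEMO = 'demo-scroll-cache' // hypothetical key, for illustration only
const MAX_CALLS = 200 // safety cap so the model stops instead of overflowing the stack

// Returns how many times the wrapper was entered for a single failing write.
function simulate(nestedFallback: 'wrapped' | 'original'): number {
    const recoveryDepth = { current: 0 }
    let wrapperCalls = 0

    // Models storage that is disabled or still full: every write is rejected.
    const nativeSetItem: SetItem = () => {
        throw new Error('QuotaExceededError')
    }

    function wrappedSetItem(key: string, value: string): void {
        wrapperCalls += 1
        if (wrapperCalls >= MAX_CALLS) return
        try {
            nativeSetItem(key, value)
        } catch {
            hardReset()
        }
    }

    function hardReset(): void {
        if (recoveryDepth.current > 0) {
            // Already recovering: this is the branch under review.
            try {
                const fallback = nestedFallback === 'wrapped' ? wrappedSetItem : nativeSetItem
                fallback(STORAGE_KEY_DEMO, '{}')
            } catch {
                // nested recovery already in progress
            }
            return
        }
        recoveryDepth.current += 1
        try {
            wrappedSetItem(STORAGE_KEY_DEMO, '{}')
        } finally {
            recoveryDepth.current -= 1
        }
    }

    wrappedSetItem(STORAGE_KEY_DEMO, 'payload')
    return wrapperCalls
}
```

In this model, simulate('original') enters the wrapper twice and then gives up, while simulate('wrapped') keeps re-entering until it reaches the cap.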

Document how tiann#716 was verified: Vitest positive repro of the tiann#707 unwrap
race, plus Playwright dev harnesses for live A/B stress (same pattern as
scripts/dev/read-hapi-web.mjs from tiann#544).

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment


Findings

  • [Major] Nested recovery still re-enters the guarded setter — this remains on the current PR head. If the fallback {} write is rejected while recoveryDepth.current > 0, storage.setItem calls the wrapper again and can recurse until stack overflow/UI freeze. Evidence web/src/lib/scrollStorageGuard.ts:71.
    Suggested fix:
    if (recoveryDepth.current > 0) {
        try {
            originalSetItem.call(storage, STORAGE_KEY, '{}')
        } catch {
            // nested recovery already in progress
        }
        return
    }
  • [Major] Hard-coded non-local default can expose HAPI_ACCESS_TOKEN — when the env token is present, the script stores it for BASE_URL and then navigates there; without HAPI_URL, that target is a committed external origin. Evidence scripts/dev/scroll-quota-repro-playwright.mjs:14.
    Suggested fix:
    const BASE_URL = process.env.HAPI_URL
    if (!BASE_URL) {
        console.error('HAPI_URL is required')
        process.exit(2)
    }
  • [Minor] A/B parser selects an inner JSON object — the repro prints pretty JSON with nested objects, so lastIndexOf('{') starts inside the payload and JSON.parse sees trailing braces. The harness then reports parseError instead of the child result. Evidence scripts/dev/scroll-quota-ab-compare.mjs:57.
    Suggested fix:
    const stdout = proc.stdout.trim()
    parsed = stdout
        ? JSON.parse(stdout)
        : { ok: false, parseError: true, stdout: '', stderr: proc.stderr }

Summary
Review mode: follow-up after new commits
The previous nested recovery issue is still present, and the added dev repro tooling has two correctness/security problems that can make verification misleading or leak a token to the wrong origin.

Testing
Not run (automation: bun is unavailable in this runner). git diff --check HEAD^ HEAD -- ... passed.

HAPI Bot

Comment thread web/src/lib/scrollStorageGuard.ts Outdated
// write throws again. Clear the library cache so persisted size matches.
if (recoveryDepth.current > 0) {
    try {
        storage.setItem(STORAGE_KEY, '{}')

[MAJOR] Nested recovery still re-enters the guarded setter

This fallback runs only while recovery is already active, but it calls storage.setItem, which is the wrapped setter on real sessionStorage. If the tiny {} fallback is still rejected, the wrapper catches it, calls hardResetScrollRestorationPersistedState, returns to this same branch, and can recurse until stack overflow/UI freeze.

Suggested fix:

if (recoveryDepth.current > 0) {
    try {
        originalSetItem.call(storage, STORAGE_KEY, '{}')
    } catch {
        // nested recovery already in progress
    }
    return
}

Comment thread scripts/dev/scroll-quota-repro-playwright.mjs
import { mkdirSync, writeFileSync } from 'node:fs'
import { resolve } from 'node:path'

const BASE_URL = process.env.HAPI_URL ?? 'https://hapi.tail9944ee.ts.net'

[MAJOR] Do not default a token-bearing repro to a non-local hub

When HAPI_ACCESS_TOKEN is set, the script stores it for BASE_URL before navigating there. With this hard-coded non-local default, running the repro without HAPI_URL sends whatever token is in the environment to that origin. Make the target explicit, or default to localhost only.

Suggested fix:

const BASE_URL = process.env.HAPI_URL
if (!BASE_URL) {
    console.error('HAPI_URL is required')
    process.exit(2)
}
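
The other option mentioned above, defaulting to localhost only, could look roughly like the following. This is a sketch under assumptions: resolveTarget is a hypothetical helper, the environment is passed in as a plain object, and the port in the local default is invented for illustration.

```typescript
type Env = Record<string, string | undefined>

// Hypothetical local default; the real dev port is not stated in this thread.
const LOCAL_DEFAULT_URL = 'http://localhost:3000'

interface Target {
    baseUrl: string
    token: string | undefined
    usedDefault: boolean
}

// Resolve the repro target so that a token can only reach a non-local origin
// when the caller has named that origin explicitly via HAPI_URL.
function resolveTarget(env: Env): Target {
    const explicit = (env['HAPI_URL'] ?? '').trim()
    const token = env['HAPI_ACCESS_TOKEN']
    if (explicit) {
        return { baseUrl: explicit, token, usedDefault: false }
    }
    return { baseUrl: LOCAL_DEFAULT_URL, token, usedDefault: true }
}
```

A script would call resolveTarget with its process environment once at startup and use the returned baseUrl everywhere, so no committed external origin ever receives the token by accident.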

Comment thread scripts/dev/scroll-quota-ab-compare.mjs Outdated
)
let parsed = null
try {
    const jsonStart = proc.stdout.lastIndexOf('{')

[MINOR] Parse the child result from stdout, not the last {

The repro emits pretty-printed JSON containing nested objects. lastIndexOf('{') starts at an inner object, so JSON.parse(proc.stdout.slice(jsonStart)) sees extra closing braces and falls into parseError, making the A/B harness report no verification even when the child script succeeded.

Suggested fix:

const stdout = proc.stdout.trim()
parsed = stdout
    ? JSON.parse(stdout)
    : { ok: false, parseError: true, stdout: '', stderr: proc.stderr }
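
A small standalone example shows why slicing from the last `{` breaks on pretty-printed nested JSON. The payload shape here is invented for illustration, and the sketch assumes the child prints nothing but the JSON document on stdout.

```typescript
// Hypothetical child output: pretty-printed JSON with one nested object.
const childStdout = JSON.stringify({ ok: true, result: { errors: 0 } }, null, 2) + '\n'

// Original approach: start parsing at the last opening brace.
// With nested objects this begins inside the payload, so trailing braces remain.
function parseFromLastBrace(stdout: string): unknown {
    const jsonStart = stdout.lastIndexOf('{')
    return JSON.parse(stdout.slice(jsonStart))
}

// Suggested approach: parse the whole trimmed stdout, with an explicit
// error object when the child printed nothing.
function parseWholeStdout(stdout: string, stderr: string): unknown {
    const trimmed = stdout.trim()
    return trimmed
        ? JSON.parse(trimmed)
        : { ok: false, parseError: true, stdout: '', stderr }
}
```

With this payload, parseFromLastBrace throws a SyntaxError because the slice starts at the inner `result` object, while parseWholeStdout returns the full object.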

Use originalSetItem in nested recovery to avoid re-entering the guard,
fix legacy-regression test types, default repro script to localhost,
and parse A/B harness stdout correctly.

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment


Findings
No findings at >=80% confidence on the fresh head. Prior HAPI Bot findings for nested recovery recursion, non-local token-bearing repro default, and A/B JSON parsing appear addressed in 6984aca.

Summary
Review mode: follow-up after new commits
Full latest diff reviewed. Residual risk: the scroll guard relies on TanStack Router internals (scrollRestorationCache, functionalUpdate, and tsr-scroll-restoration-v1_3), so future router upgrades need this guard and tests revisited.

Testing
Not run (automation: bun is unavailable in this runner). Source review only; PR reports targeted web tests and root typecheck passing.

HAPI Bot

@swear01
Contributor

swear01 commented May 28, 2026

Hi @heavygee, thanks for the thorough fix and the regression repros — the recoveryDepth guard is a nice touch.

Just a heads-up: there's an overlapping PR #722 that takes a different approach — upgrading @tanstack/react-router to ^1.170.8. In @tanstack/router-core >=1.145.6, scrollRestorationCache is no longer part of the public exports, so the in-memory cache sync path (and the unwrap race it introduced) goes away entirely. Upstream also ships its own try-catch around setItem from that version, so crashes are covered at the library level too.

The tradeoff with #722 is that after a prune the in-memory cache may be momentarily stale (scroll position could be wrong on the next nav), whereas #717 keeps it in sync. If the maintainer prefers the sync guarantee, #717 is the right call — but it would need to account for scrollRestorationCache being missing in newer versions.

Happy to coordinate whichever direction gets picked.

@heavygee
Contributor Author

heavygee commented May 28, 2026

Hi @swear01, thanks for the heads-up and for laying out the tradeoffs so clearly. Really helpful.

I wasn't aware of #722 when I opened this; good to know there's another path in flight.

My read (could be wrong): #717 was mostly trying to close the #707 regression, the temporary setItem unwrap that let concurrent TanStack writes hit native storage and throw uncaught (#716). The recoveryDepth / cache.set patch was my attempt to keep RAM and disk in sync after a prune, which I think was the gap behind #708, but I'm not deeply familiar with TanStack's internals and may be overfitting to what I saw in prod.

If #722's upgrade removes the public scrollRestorationCache export and wraps persist in try/catch, that sounds like it eliminates the failure mode I patched at the source, which I'd honestly prefer to fighting library internals I don't fully understand. I suspect my cache.set patch wouldn't survive that bump anyway, so I'm not attached to that specific mechanism.

The bit I'm less sure about: whether prune-only (without clearing whatever the library keeps in memory on newer versions) is enough to stop repeat quota pressure, or if we'd mostly be trading crashes for silent persist failures / occasional wrong scroll. I haven't tested against 1.170+ myself. That's speculation on my part.

Happy to go whichever way the maintainer prefers.

I can help with either direction or close #717 if #722 covers it. Just didn't want #716 to sit open while I figured out overlap.

Thanks again for offering to coordinate.

@swear01
Contributor

swear01 commented May 28, 2026

Thanks for the detailed breakdown — really appreciate the openness.

To answer your question about repeat quota pressure: in >=1.145.6, TanStack only calls setItem once per page lifetime — in the pagehide event listener (L237–L244), via persistScrollRestorationCache (L42–L52). Navigations only update the in-memory scrollRestorationCache object — no repeated write pressure during a session.

At pagehide, the in-memory cache may have grown large, so our guard intercepts that single write, prunes to 50 entries, and retries — sessionStorage ends up with the 50 most recent positions instead of silently losing everything (which is what upstream's own try-catch does).
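
The prune-and-retry step can be sketched as follows. This is an illustration of the idea, not the guard from #722: pruneToMostRecent and persistWithPrune are hypothetical names, and the sketch assumes the serialized cache is a JSON object whose key insertion order approximates recency (which holds for non-numeric string keys).

```typescript
type SetItem = (key: string, value: string) => void

const MAX_ENTRIES = 50 // the limit mentioned in the comment above

// Keep only the last `max` keys of the cache, in their original order.
function pruneToMostRecent<T>(cache: Record<string, T>, max: number): Record<string, T> {
    const keys = Object.keys(cache)
    const kept = keys.slice(Math.max(0, keys.length - max))
    const pruned: Record<string, T> = {}
    for (const key of kept) {
        pruned[key] = cache[key] as T
    }
    return pruned
}

// Try the full write first; on failure, prune and retry once.
// Returns true if something was persisted, false if both attempts failed.
function persistWithPrune(setItem: SetItem, key: string, serialized: string): boolean {
    try {
        setItem(key, serialized)
        return true
    } catch {
        try {
            const parsed = JSON.parse(serialized) as Record<string, unknown>
            setItem(key, JSON.stringify(pruneToMostRecent(parsed, MAX_ENTRIES)))
            return true
        } catch {
            return false
        }
    }
}
```

The retry is attempted only once, so a storage that rejects even the pruned payload results in a quiet failure rather than a loop.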

The scroll tradeoff is narrow: during a session onRendered reads directly from the in-memory cache (L320), not from sessionStorage, so scroll restoration within a session is unaffected. The pruned entries only matter after a full page reload, where a pruned route scrolls to 0 — same as pre-#707 behaviour.

I think #722 covers the failure mode you were targeting. Feel free to close #717 — and thanks again for the regression tests and repro scripts, they're useful reference regardless of which approach lands.

@heavygee
Contributor Author

Closing in favour of #722. @swear01's breakdown convinced me the router upgrade is the better long-term fix, and my cache.set patch would not survive that bump anyway.

Thanks again for the thoughtful review and for answering the repeat-quota question; really appreciated.

What I am doing locally: keeping fix/web-scroll-guard-unwrap-race as a driver-manifest branch layer until #722 lands on upstream/main, then dropping it and rebuilding soup from main. Daily driver hit the full white-screen QuotaExceededError again without that layer (expected on current upstream #707 code).

Happy to help if anything from the regression tests or repro scripts is useful for #722.

@heavygee heavygee closed this May 28, 2026


Development

Successfully merging this pull request may close these issues.

fix(web): #707 scroll guard still throws uncaught QuotaExceededError (setItem unwrap race)

2 participants