Session history: synchronous navigations that jump the queue race the paused apply-the-history-step run’s step bookkeeping #12576

@sideshowbarker

What is the issue with the HTML Standard?

“Apply the history step” runs within the session history traversal queue, and the mechanism that lets synchronous navigations jump the queue allows queued synchronous navigation steps to run while such a run is paused waiting for its queued tasks. The nested synchronous navigation mutates the same session-history state the paused run is operating on, and as far as I can tell, strictly following the spec requirements as currently written then produces three problems:

  1. Duplicate step numbers. The nested navigation’s finalize steps compute targetStep as “traversable’s current session history step + 1” — but the paused outer run has already appended its own target entry computed from the same current step (its step 20, which advances the current step, has not run yet). Both entries end up with the same step number.

  2. A navigation’s entry is silently deleted. The nested finalize’s “clear the forward session history” removes every entry with a step greater than the current step — which includes the paused outer run’s just-appended entry. Observably, per the spec as currently written, two rapid pushState() calls can yield a single new entry (history.length grows by one) — while engines in practice retain both entries.

  3. The current step moves backwards. When the paused outer run resumes and reaches step 20 (“Set traversable’s current session history step to targetStep”), it writes its older targetStep unconditionally — moving the current step backwards past the step the nested run already committed. The next push-type navigation then computes a step number that an existing entry already holds.
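
To make the three problems concrete, here is a deliberately simplified toy model. It is not the spec algorithm: `Traversable`, `begin_push`, and `commit` are hypothetical names, a single flat entry list stands in for the whole navigable tree, and the order of clearing forward history before appending is an assumption. Two nested pushes are used so that the backwards move in problem 3 is visible.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Entry:
    url: str
    step: int


@dataclass
class Traversable:
    """Toy stand-in for a traversable navigable (hypothetical, one flat list)."""
    current_step: int = 0
    entries: list = field(default_factory=lambda: [Entry("/start", 0)])

    def clear_forward(self):
        # Drop every entry whose step is greater than the current step.
        self.entries = [e for e in self.entries if e.step <= self.current_step]

    def begin_push(self, url):
        # Finalize steps (simplified): clear forward history, compute
        # targetStep from the current step, append the new entry.
        self.clear_forward()
        entry = Entry(url, self.current_step + 1)
        self.entries.append(entry)
        return entry

    def commit(self, entry):
        # "Step 20": write the run's targetStep unconditionally.
        self.current_step = entry.step


t = Traversable()
outer = t.begin_push("/a")      # outer run appends its entry, then pauses
nested1 = t.begin_push("/b")    # a pushState() jumps the queue
t.commit(nested1)
nested2 = t.begin_push("/c")    # a second pushState() in the same window
t.commit(nested2)
step_before_resume = t.current_step
t.commit(outer)                 # the paused outer run resumes

urls = [e.url for e in t.entries]
print(outer.step, nested1.step)              # 1 1: same step claimed twice
print(urls)                                  # ['/start', '/b', '/c']: "/a" is gone
print(step_before_resume, t.current_step)    # 2 1: current step moved backwards
```

In this model the outer run and the first nested run both claim step 1, the outer run's entry is pruned before it ever commits, and the resumed outer run rewinds the current step from 2 to 1.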

A concrete trigger: a push-type apply-the-history-step run is paused waiting for its queued tasks (for example the bookkeeping step for a newly created child navigable, or a cross-document load mid-apply), and a history.pushState() from the page jumps the queue in that window.

Implementations

In Ladybird, the problems described above manifested as recurring assertion failures in our CI, as documented in LadybirdBrowser/ladybird#10028. After running into those problems, and after a lot of trial and error, I ended up cobbling together some ad-hoc bookkeeping additions for this in LadybirdBrowser/ladybird#10029.

Those changes keep the bookkeeping coherent by claiming step numbers past outstanding uncommitted steps, committing the current step only from the newest run, and sparing claimed entries from clear-the-forward-session-history. That also feels adjacent to the bookkeeping concerns already acknowledged around entryToReplace containment in #10232.
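
As a rough illustration of that kind of bookkeeping (a sketch only, not the Ladybird code; `claimed`, `begin_push`, and `commit` are hypothetical names, and "commit only from the newest run" is approximated here by never letting the current step move backwards):

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so `in` matches the exact entry
class Entry:
    url: str
    step: int


@dataclass
class Traversable:
    current_step: int = 0
    entries: list = field(default_factory=lambda: [Entry("/start", 0)])
    claimed: list = field(default_factory=list)  # entries of uncommitted runs

    def begin_push(self, url):
        # Clear forward history, but spare entries claimed by paused runs.
        self.entries = [
            e for e in self.entries
            if e.step <= self.current_step or e in self.claimed
        ]
        # Claim a step past every outstanding uncommitted step.
        highest = max([self.current_step] + [c.step for c in self.claimed])
        entry = Entry(url, highest + 1)
        self.entries.append(entry)
        self.claimed.append(entry)
        return entry

    def commit(self, entry):
        self.claimed = [c for c in self.claimed if c is not entry]
        # An older run must not overwrite a step a newer run committed.
        self.current_step = max(self.current_step, entry.step)


t = Traversable()
outer = t.begin_push("/a")      # outer run claims step 1, then pauses
nested1 = t.begin_push("/b")    # claims step 2 instead of colliding on 1
t.commit(nested1)
nested2 = t.begin_push("/c")    # claims step 3
t.commit(nested2)
t.commit(outer)                 # resumes; the current step stays at 3

steps = [e.step for e in t.entries]
print(steps, t.current_step)    # [0, 1, 2, 3] 3
```

With the same interleaving that breaks the unmodified model, every entry keeps a unique step, the paused run's entry survives, and the current step never rewinds.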

After doing all that, I took some time to inspect the corresponding code in other engines. As far as I can see, no existing engine strictly implements the traversal-queue/jump-queue model from the spec as written, so the spec requirements here have only been followed strictly by Ladybird (prior to the LadybirdBrowser/ladybird#10029 changes).

  • WebKit avoids the window structurally — its authoritative back/forward list is mutated in atomic per-event operations on the UI-process main thread (forward-pruning, insertion, and the current-index update happen inside a single addItem), and traversals are synchronous — so a claim/commit split cannot exist there.

  • Chromium and Gecko both do have architectures with comparable multiphase asynchrony, and seem to have each grown defensive bookkeeping of the kind the algorithm would need: Chromium matches in-flight navigations by entry identity (nav_entry_id; see the code comment for crbug.com/900036 about a same-document commit arriving from the renderer while a different navigation is pending, with a TODO noting history.pushState() is still unhandled there), and Gecko tracks a list of in-flight loads by monotonically-increasing load id alongside epoch numbers that drop stale navigations, with its own comments noting residual windows (“UpdateIndex() here may update index too early”).
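
For readers unfamiliar with the general pattern, a generic sketch of "monotonic load ids plus an epoch that invalidates stale work" might look like the following. This is not Gecko's or Chromium's code or API; `LoadTracker` and all of its methods are hypothetical, and when a real engine bumps its epoch is an assumption here.

```python
class LoadTracker:
    """Generic stale-work filter: ids identify loads, epochs invalidate them."""

    def __init__(self):
        self.next_load_id = 0
        self.epoch = 0
        self.in_flight = {}  # load id -> epoch in which the load started

    def start_load(self):
        load_id = self.next_load_id
        self.next_load_id += 1  # ids only ever increase
        self.in_flight[load_id] = self.epoch
        return load_id

    def bump_epoch(self):
        # Assumed trigger: a newer navigation supersedes whatever is in flight.
        self.epoch += 1

    def finish_load(self, load_id):
        # True if the load is still current; False means it is stale
        # and the caller should drop its result.
        started_in = self.in_flight.pop(load_id)
        return started_in == self.epoch


tracker = LoadTracker()
old_load = tracker.start_load()
tracker.bump_epoch()                 # something newer took over
new_load = tracker.start_load()

old_ok = tracker.finish_load(old_load)
new_ok = tracker.finish_load(new_load)
print(old_ok, new_ok)                # False True
```

The point of the pattern is that a late-arriving completion can be recognized as stale by comparing identifiers, rather than by trusting whatever the shared index happens to be at that moment.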

In other words, existing engines either forbid the concurrency or else have also already ended up implementing almost exactly the same kind of bookkeeping I ended up independently implementing in LadybirdBrowser/ladybird#10029.

The current spec as written describes the concurrency — but without the bookkeeping.

Possible directions

  • compute targetStep taking the targets of in-flight (uncommitted) runs into account, so step numbers stay unique;
  • have “clear the forward session history” spare entries belonging to runs that have not yet committed;
  • make step 20 conditional on no newer run having committed (or otherwise define the ordering of commits between a paused run and navigations that jumped past it); or
  • if last-push-wins entry replacement is the intended semantic, specify it explicitly — though current engine behavior (both entries survive) suggests otherwise.
