fix: seqlock writer-enter missing Release fence — torn reads on weak memory (#40) by toloco · Pull Request #77 · toloco/warp_cache

toloco · 2026-06-17T13:46:31Z

Closes #40

What & why

ShmSeqLock::write_lock() publishes the odd ("writer active") sequence number with seq.store(prev+1, Release) and then the caller mutates data. A Release store orders only the operations that precede it — it places no constraint on the data writes that follow. So on weak-memory hardware (ARM64/PPC) the data writes can become globally visible before the odd-seq store propagates.

A reader can then:

1. read_begin → load seq, see even (old),
2. read mutated data (floated ahead of the odd publish),
3. read_validate → still see the same even seq (odd store not yet visible) → validates a torn read as consistent.

The result is wrong data returned through the safe cross-process backend, not merely an extra retry. (x86 is TSO and does not reorder the two stores in hardware, so the bug cannot manifest there, which is why this is rated medium, weak-memory-only.)
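The three reader steps above can be replayed by hand. The sketch below is not real weak-memory behavior: it uses a hypothetical `Visible` snapshot type (not part of warp_cache) to stand in for what the reader's core happens to observe at each step, and shows why the same schedule validates without the fence and retries with it.

```rust
/// Hypothetical snapshot of what the reader's core can observe at one instant.
#[derive(Clone, Copy)]
struct Visible {
    seq: u64,
    d0: u64,
    d1: u64,
}

/// One reader attempt. `begin`, `data` and `validate` are the snapshots
/// observed at read_begin, at the data reads, and at read_validate.
/// Returns the data pair only if the attempt validates.
fn read_attempt(begin: Visible, data: Visible, validate: Visible) -> Option<(u64, u64)> {
    let s1 = begin.seq;
    if s1 & 1 == 1 {
        return None; // writer active: retry
    }
    let pair = (data.d0, data.d1);
    let s2 = validate.seq;
    if s1 == s2 { Some(pair) } else { None }
}

/// Without the fence: the write to d1 is visible, the odd-seq store is not.
fn without_fence() -> Option<(u64, u64)> {
    let old = Visible { seq: 0, d0: 0, d1: 0 };
    let torn = Visible { seq: 0, d0: 0, d1: 1 };
    read_attempt(old, torn, torn)
}

/// With the fence: once d1 == 1 has been observed, read_validate must
/// also observe the odd seq, so the attempt is rejected.
fn with_fence() -> Option<(u64, u64)> {
    let old = Visible { seq: 0, d0: 0, d1: 0 };
    let data = Visible { seq: 0, d0: 0, d1: 1 };
    let validate = Visible { seq: 1, d0: 0, d1: 1 };
    read_attempt(old, data, validate)
}
```

Here `without_fence()` hands back the mixed pair `(0, 1)` as if it were consistent, while `with_fence()` returns `None` and the reader retries.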

The fix

Add atomic::fence(Release) immediately after the odd-seq store — the textbook seqlock writer-enter:

seq.store(prev + 1, Ordering::Release);
std::sync::atomic::fence(Ordering::Release);  // <-- #40

The release fence orders the odd publish before the following data writes. Paired with the reader's existing fence(Acquire) in read_validate, this forms a synchronizes-with edge: a reader that observes any mutated data is guaranteed to also observe the odd seq, so it retries instead of validating. The exit side (write_unlock's Release store) was already correct, but it only covers the even publish — it can't cover the entry side.
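As an illustration of the pairing described above, here is a minimal in-process sketch of the four operations. It is not the actual ShmSeqLock: the `SeqLockSketch` type, its two data words, and the `write`/`read`/`run_demo` helpers are assumptions made for the example, and it assumes a single writer (writer-vs-writer exclusion is taken to be handled elsewhere).

```rust
use std::sync::atomic::{fence, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical single-writer seqlock guarding two data words.
pub struct SeqLockSketch {
    seq: AtomicU64,
    d0: AtomicU64,
    d1: AtomicU64,
}

impl SeqLockSketch {
    pub fn new() -> Self {
        Self { seq: AtomicU64::new(0), d0: AtomicU64::new(0), d1: AtomicU64::new(0) }
    }

    /// Writer enter: publish the odd seq, then fence so that the data
    /// writes that follow cannot become visible before it.
    pub fn write_lock(&self) {
        let prev = self.seq.load(Ordering::Relaxed);
        self.seq.store(prev + 1, Ordering::Release);
        fence(Ordering::Release);
    }

    /// Writer exit: the Release store orders the data writes before the even seq.
    pub fn write_unlock(&self) {
        let prev = self.seq.load(Ordering::Relaxed);
        self.seq.store(prev + 1, Ordering::Release);
    }

    /// Returns the observed seq if no writer is active.
    pub fn read_begin(&self) -> Option<u64> {
        let s = self.seq.load(Ordering::Acquire);
        if s & 1 == 0 { Some(s) } else { None }
    }

    /// The Acquire fence pairs with the writer's Release fence.
    pub fn read_validate(&self, begin: u64) -> bool {
        fence(Ordering::Acquire);
        self.seq.load(Ordering::Relaxed) == begin
    }

    /// Writes the same value to both words inside one write section.
    pub fn write(&self, v: u64) {
        self.write_lock();
        self.d0.store(v, Ordering::Relaxed);
        self.d1.store(v, Ordering::Relaxed);
        self.write_unlock();
    }

    /// Retries until a read section validates.
    pub fn read(&self) -> (u64, u64) {
        loop {
            if let Some(begin) = self.read_begin() {
                let pair = (self.d0.load(Ordering::Relaxed), self.d1.load(Ordering::Relaxed));
                if self.read_validate(begin) {
                    return pair;
                }
            }
            std::hint::spin_loop();
        }
    }
}

/// Smoke run: one writer, one reader; every validated read must be untorn.
pub fn run_demo(iters: u64) -> (u64, u64) {
    let lock = Arc::new(SeqLockSketch::new());
    let w = {
        let lock = Arc::clone(&lock);
        thread::spawn(move || (1..=iters).for_each(|v| lock.write(v)))
    };
    let r = {
        let lock = Arc::clone(&lock);
        thread::spawn(move || {
            for _ in 0..iters {
                let (a, b) = lock.read();
                assert_eq!(a, b, "validated a torn read");
            }
        })
    };
    w.join().unwrap();
    r.join().unwrap();
    lock.read()
}
```

The data words are relaxed atomics here so the sketch stays free of data races in safe Rust. A run such as `run_demo(10_000)` is only a smoke test: passing on one machine proves nothing about the ordering.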

Test — model-checked with `loom`

A behavioral test can't prove this: it can't manifest on x86, and the ARM64 window is effectively impossible to hit deterministically. So I added a loom model of the seqlock's reader/writer ordering (src/shm/lock.rs, behind #[cfg(loom)]). Loom exhaustively explores all interleavings and weak-memory reorderings:

without the fence: loom finds an execution that validates a torn read — assertion left == right failed ... left: 0, right: 1 (one data word old, one new)
with the fence: no such execution exists — passes

RUSTFLAGS="--cfg loom" cargo test --lib seqlock_ordering

loom is a cfg(loom)-only dependency ([target.'cfg(loom)'.dependencies]) — absent from normal, release, and CI builds; the model module is #[cfg(loom)] so it's excluded from the regular cargo test. Verified locally by toggling the model's fence (fail → pass).
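For readers unfamiliar with the cfg(loom) pattern, the sketch below shows one common shape for such a model. It is not the model from src/shm/lock.rs: the `Model` struct, `check_once`, and the test body are assumptions, and it has not been run under loom. Under `--cfg loom` the primitives come from the loom crate; otherwise the same code falls back to std and runs as an ordinary smoke test.

```rust
// Under `--cfg loom` use loom's instrumented primitives; otherwise use std.
#[cfg(loom)]
use loom::{sync::{atomic::{fence, AtomicUsize, Ordering}, Arc}, thread};
#[cfg(not(loom))]
use std::{sync::{atomic::{fence, AtomicUsize, Ordering}, Arc}, thread};

/// Hypothetical model state: the sequence word plus two data words.
struct Model {
    seq: AtomicUsize,
    d0: AtomicUsize,
    d1: AtomicUsize,
}

/// One writer update racing one reader attempt.
/// Returns the pair the reader validated, or None if it would retry.
fn check_once() -> Option<(usize, usize)> {
    let m = Arc::new(Model {
        seq: AtomicUsize::new(0),
        d0: AtomicUsize::new(0),
        d1: AtomicUsize::new(0),
    });

    let writer = {
        let m = Arc::clone(&m);
        thread::spawn(move || {
            m.seq.store(1, Ordering::Release); // writer enter (odd)
            fence(Ordering::Release); // the fence under test
            m.d0.store(1, Ordering::Relaxed);
            m.d1.store(1, Ordering::Relaxed);
            m.seq.store(2, Ordering::Release); // writer exit (even)
        })
    };

    // Reader attempt on the current thread.
    let s1 = m.seq.load(Ordering::Acquire);
    let pair = (m.d0.load(Ordering::Relaxed), m.d1.load(Ordering::Relaxed));
    fence(Ordering::Acquire);
    let s2 = m.seq.load(Ordering::Relaxed);
    writer.join().unwrap();

    if s1 & 1 == 0 && s1 == s2 {
        assert_eq!(pair.0, pair.1, "validated a torn read");
        Some(pair)
    } else {
        None
    }
}

// Only compiled when built with RUSTFLAGS="--cfg loom".
#[cfg(loom)]
#[test]
fn seqlock_ordering() {
    loom::model(|| {
        check_once();
    });
}
```

Deleting the `fence(Ordering::Release)` line is the toggle the PR describes; this sketch is expected to fail under loom in the same way, though only the PR's own model has been verified to do so.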

Gates run (risky change — `src/shm/lock.rs`, concurrency/locking)

make fmt / make lint (ruff, ty, clippy -D warnings) ✓ — loom module excluded under normal cfg
make test — cargo test (11) + pytest (92) ✓
make test-matrix — Python 3.10–3.13 ✓ (3.14 skipped locally via the documented uv-resolves-stale-alpha guard; CI covers 3.14 final)
make bench — no regression: shared backend ~9.0M ops/s, hit-rate 72.9% (the fence is one dmb ish per insert, dominated by the existing lock + serialization)
loom: RUSTFLAGS="--cfg loom" cargo test --lib seqlock_ordering ✓ (and fails as expected without the fence)

🤖 Generated with Claude Code

…40)

write_lock() published the odd ("writer active") sequence number with a Release store and then mutated data. A Release store orders only the operations that *precede* it, so on weak-memory hardware (ARM64) the subsequent data writes can float ahead of the odd-seq store. A reader can then observe mutated data while seq still reads even at both read_begin and read_validate, and falsely validate a torn read: wrong results returned through the safe cross-process backend, not just an extra retry. (x86 is TSO, so it cannot manifest there.)

Add `atomic::fence(Release)` immediately after the odd-seq store, the textbook seqlock writer-enter. The release fence orders the odd publish before the following data writes; paired with the reader's existing `fence(Acquire)` in read_validate, a reader that observes any mutated data is guaranteed to also observe the odd seq and retry. The exit side (write_unlock's Release store) was already correct but only covers the even publish, not the entry side.

Add a loom model of the seqlock's reader/writer ordering (gated behind `cfg(loom)`, so loom is not a normal/release/CI dependency). Loom exhaustively explores interleavings and reorderings: without the fence it finds an execution that validates a torn read (d0=0, d1=1); with the fence, no such execution exists. Run with `RUSTFLAGS="--cfg loom" cargo test --lib seqlock_ordering`.

Document the invariant in ARCHITECTURE.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

toloco and others added 3 commits June 17, 2026 14:46

Merge branch 'master' into fix/40-seqlock-release-ordering

e75036c

Merge branch 'master' into fix/40-seqlock-release-ordering

507e78c

toloco merged commit 7cfd362 into master Jun 18, 2026
14 checks passed

toloco deleted the fix/40-seqlock-release-ordering branch June 18, 2026 09:06

toloco mentioned this pull request Jun 18, 2026

fix: validate ttl_nanos when reusing a shared region across processes (#42) #80

Merged
