Doom serializable transactions before SSI errors #42

Draft
NikolayS wants to merge 1 commit into master from ssi-savepoint-doom
NikolayS commented May 30, 2026

Summary

This draft PR preserves Andrey's deterministic serializable-savepoint reproducer and turns the local investigation into a durable branch.

The bug pattern is that an SSI serialization failure can be raised inside a subtransaction and then caught by ROLLBACK TO SAVEPOINT. If the current top-level SERIALIZABLEXACT was not marked SXACT_FLAG_DOOMED before ereport(ERROR), the top-level transaction can still commit earlier writes even though SSI has already decided that this transaction is the victim of a dangerous structure.

The new isolation test demonstrates that directly:

  • s1 reads row 2, writes row 1, and commits
  • s2 writes row 2 before a savepoint
  • s2 reads row 1 inside the savepoint and gets an SSI serialization failure
  • s2 rolls back to the savepoint, preserving its earlier write
  • s2 must still fail at top-level COMMIT

With only the test applied (no code fix), the new test fails: s2's COMMIT succeeds and the final state shows both writes committed (2|1).

Why SAVEPOINT recovery is not enough here

Normal savepoint semantics do discard commands executed after the savepoint, so the question is fair: if the failed read were completely erased from the serializable transaction's dependency history, the surviving writes alone could be serialized.
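To make that question concrete, here is a minimal sketch of the two-transaction history as a toy rw-dependency graph. It is not PostgreSQL's algorithm or data structures: every `toy_`/`Toy` name is hypothetical, and it assumes s2's snapshot predates s1's commit, so that s2's read of row 1 sees the version s1 overwrote.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NXACTS 2
enum { TOY_S1 = 0, TOY_S2 = 1 };

/* Hypothetical toy graph; not PostgreSQL's SSI bookkeeping. */
typedef struct ToyGraph {
    /* rw[a][b]: a read a version that b overwrote, so a must precede b. */
    bool rw[TOY_NXACTS][TOY_NXACTS];
} ToyGraph;

/* With only two transactions, a serial order exists unless
 * rw-dependencies point in both directions. */
bool toy_serializable(const ToyGraph *g)
{
    return !(g->rw[TOY_S1][TOY_S2] && g->rw[TOY_S2][TOY_S1]);
}

/* Build the history from the isolation test, optionally erasing
 * the read that s2 performed inside the savepoint. */
bool toy_history_serializable(bool keep_s2_read)
{
    ToyGraph g = { { { false, false }, { false, false } } };

    /* s1 read row 2 and s2 wrote row 2: s1 must precede s2. */
    g.rw[TOY_S1][TOY_S2] = true;

    /* s2 read row 1 and s1 wrote row 1: s2 must precede s1.
     * Assumption: s2's snapshot predates s1's commit. */
    if (keep_s2_read)
        g.rw[TOY_S2][TOY_S1] = true;

    return toy_serializable(&g);
}
```

In this toy model, `toy_history_serializable(false)` corresponds to the hypothetical case where the read is fully erased, and `toy_history_serializable(true)` to the case where the read still counts.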

However, PostgreSQL's SSI deliberately does not treat predicate reads as subtransaction-local. src/backend/storage/lmgr/README-SSI says:

Because reads in a subtransaction may cause that subtransaction
to roll back, thereby affecting what is written by the top level
transaction, predicate locks must survive a subtransaction rollback.
As a consequence, all xid usage in SSI, including predicate locking,
is based on the top level xid.

So once a statement-time SSI conflict check identifies the current top-level transaction as the victim, the error needs a durable top-level marker. Otherwise FlagRWConflict() raises before recording the conflict with SetRWConflict(), ROLLBACK TO SAVEPOINT clears the error state, and the top-level transaction can commit despite the SSI victim decision.

The MVCC docs also frame SQLSTATE 40001 handling as retrying the complete transaction, including the application logic that chose which SQL to issue.

Fix

Mark MySerializableXact as doomed before raising statement-time SSI serialization failures where the current transaction is the victim.

This includes Andrey's original OnConflict_CheckForSerializationFailure() paths:

  • current transaction is the writer: Canceled on identification as a pivot, during write.
  • prepared writer cannot be aborted, so current transaction is the reader: Canceled on conflict out to pivot ..., during read.

I also widened the patch to the analogous current-victim paths in CheckForSerializableConflictOut():

  • Canceled on conflict out to old pivot %u.
  • Canceled on identification as a pivot, with conflict out to old committed transaction %u.
  • Canceled on conflict out to old pivot. (for a prepared pivot with a summary conflict out)
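
The pattern in this section can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not the patch itself: `ToyXact` and the `toy_` functions are hypothetical stand-ins, error raising is modeled with a plain flag instead of ereport(ERROR), and the only behavior modeled is that ROLLBACK TO SAVEPOINT clears the error state while a top-level doomed marker survives until COMMIT.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a top-level serializable transaction;
 * this is not PostgreSQL's SERIALIZABLEXACT. */
typedef struct ToyXact {
    bool doomed;                 /* durable marker, analogous to SXACT_FLAG_DOOMED */
    bool in_error;               /* error state that ROLLBACK TO SAVEPOINT clears */
    int  writes_before_savepoint;
    int  writes_after_savepoint;
} ToyXact;

/* A statement-time conflict check picks the current transaction as the
 * victim. With doom_first set, the marker is recorded before the error. */
void toy_raise_ssi_failure(ToyXact *x, bool doom_first)
{
    if (doom_first)
        x->doomed = true;
    x->in_error = true;
}

/* Discards work done after the savepoint and clears the error state.
 * The doomed marker is deliberately left untouched. */
void toy_rollback_to_savepoint(ToyXact *x)
{
    x->writes_after_savepoint = 0;
    x->in_error = false;
}

/* Returns true if the top-level commit succeeds. */
bool toy_commit(const ToyXact *x)
{
    assert(!x->in_error);        /* caller must have recovered from the error */
    return !x->doomed;
}

/* Replays s2 from the isolation test: write, savepoint, failed read,
 * rollback to savepoint, commit. */
bool toy_run_scenario(bool doom_first)
{
    ToyXact s2 = { false, false, 0, 0 };

    s2.writes_before_savepoint = 1;            /* s2 writes row 2 */
    toy_raise_ssi_failure(&s2, doom_first);    /* read of row 1 fails */
    toy_rollback_to_savepoint(&s2);            /* earlier write survives */
    return toy_commit(&s2);
}
```

In this model, `toy_run_scenario(false)` commits with the earlier write intact, which mirrors the failure the new isolation test catches, while `toy_run_scenario(true)` is rejected at commit.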

Verification

Local branch was based on NikolayS/postgres master at:

0f24332aeb4f43409c2a7bec9fef1e3317689bc5

Commands run:

meson setup build ... -Dtap_tests=enabled
ninja -C build src/backend/postgres src/test/isolation/isolationtester src/bin/initdb/initdb src/bin/pg_ctl/pg_ctl src/bin/psql/psql
meson test -C build --suite setup --print-errorlogs
meson test -C build isolation/isolation --print-errorlogs

Observed results:

  • test-only patch fails as expected: serializable-savepoint allows commit after ROLLBACK TO SAVEPOINT
  • Andrey's original code patch makes the new test pass
  • this widened branch passes isolation/isolation: 130 isolation subtests

Relation to issue #41

This is probably separate from Kyle's original Jepsen INSERT ... ON CONFLICT append anomaly unless there is another hidden savepoint path. Kyle's published command does not request savepoints, and the Jepsen-managed reproduction recorded in issue #41 was run with --no-savepoints. The workload's :on-conflict append path also does not use the savepoint helper; that helper is for the :update-insert-update duplicate-key path when :savepoints is enabled.

So PR #42 should be treated as a deterministic SSI/SAVEPOINT finding and not as an explanation of #41. The next useful check is still to run the Jepsen-managed reproduction from #41 against this branch: if it still reproduces, #41 needs a different root cause.

A serialization failure raised inside a subtransaction can be
caught by ROLLBACK TO SAVEPOINT.  If the top-level serializable
transaction is not marked doomed before ereport(ERROR), it can later
commit with writes that should have been canceled.

Mark the current SerializableXact doomed before statement-time SSI
serialization failures where the current transaction is the victim,
and add an isolation test for the SAVEPOINT case.