p2 of recovery_pause_on_logical_slot_conflict (auto-resume) by NikolayS · Pull Request #30 · NikolayS/postgres

NikolayS · 2026-04-22T18:07:37Z

No description provided.

Add a new GUC, recovery_pause_on_logical_slot_conflict (PGC_SIGHUP, default off). When enabled, WAL replay on a standby pauses instead of invalidating an active logical replication slot whose catalog_xmin would be overtaken by a Heap2/PRUNE_ON_ACCESS record's snapshotConflictHorizon. An operator can then drain the slot via pg_logical_slot_get_changes and call pg_wal_replay_resume() to continue. On resume, the patch advances the drained slot's catalog_xmin past the conflict horizon so the subsequent InvalidateObsoleteReplicationSlots call becomes a no-op; replay continues to the next conflict and the cycle repeats.

This makes logical decoding from an archive-only standby (no streaming replication link to the primary) viable for continuous CDC. Without this GUC, slots on such standbys are invalidated the first time replay applies a catalog vacuum record whose horizon exceeds the slot's catalog_xmin, typically ~2 * autovacuum_naptime after slot creation.

Hooks into ResolveRecoveryConflictWithSnapshot(), the single choke point in the replay path for RS_INVAL_HORIZON conflicts, via a new MaybePauseOnLogicalSlotConflict() function. Reuses the existing SetRecoveryPause / recoveryNotPausedCV machinery, so there is no new shared-memory state. The hot path when the GUC is off is one boolean early-return.

Edge cases handled:

- Slots still inside DecodingContextFindStartpoint (effective_catalog_xmin not yet valid) are skipped. Pausing for them would deadlock: snapbuild needs WAL to advance, and the pause holds it back. Invalidating an in-progress slot is harmless because the caller retries.
- The pause-check uses TransactionIdPrecedesOrEquals to match the semantics of DetermineSlotInvalidationCause. Without that, a slot whose catalog_xmin was just advanced to horizon+1 by a previous pause cycle would fail to re-pause on a subsequent record whose horizon equals that advanced value (the old horizon+1), yet would still be invalidated by the fall-through.
- CheckForStandbyTrigger() is called in the wait loop so pg_promote() does not stall while paused. Mirrors the existing recoveryPausesHere escape loop.
- Synced slots (data.synced == true, i.e. managed by the slot-sync worker per sync_replication_slots) are skipped in both the pause-check and advance scans. Writing to their fields from the startup process would race with the slot-sync worker, and ALTER / DROP_REPLICATION_SLOT on a synced slot errors out, so the operator-facing "drain or drop" recipe does not apply.

ConfirmRecoveryPaused() and CheckForStandbyTrigger() are made extern for use by MaybePauseOnLogicalSlotConflict's wait loop. The pause is entered from inside ResolveRecoveryConflictWithSnapshot rather than the main replay loop, so we need to transition RECOVERY_PAUSE_REQUESTED -> RECOVERY_PAUSED ourselves and consume PROMOTE_SIGNAL_FILE ourselves.

Known limitation: the advance marks slots dirty but does not force an immediate SaveSlotToPath. If the standby crashes between resume and the next restartpoint, the advance is lost. On restart, replay re-encounters the same conflict record, re-pauses, and the operator re-drains (idempotent). A future iteration could tighten this.
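
The inclusive-comparison edge case described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the patch's C code: `Slot`, `slot_would_pause`, and the two comparison helpers are hypothetical names, and the helpers only imitate the modulo-2^32 semantics of PostgreSQL's TransactionIdPrecedes and TransactionIdPrecedesOrEquals.

```python
from dataclasses import dataclass

INVALID_XID = 0        # models InvalidTransactionId
FIRST_NORMAL_XID = 3   # models FirstNormalTransactionId


def xid_precedes(a: int, b: int) -> bool:
    """Model of TransactionIdPrecedes: circular comparison modulo 2**32."""
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        return a < b  # special XIDs compare as plain unsigned integers
    # Equivalent to "(int32) (a - b) < 0" in C.
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


def xid_precedes_or_equals(a: int, b: int) -> bool:
    """Model of TransactionIdPrecedesOrEquals."""
    return a == b or xid_precedes(a, b)


@dataclass
class Slot:
    """Hypothetical stand-in for the slot fields the pause check reads."""
    name: str
    effective_catalog_xmin: int = INVALID_XID
    synced: bool = False


def slot_would_pause(slot: Slot, horizon: int, inclusive: bool = True) -> bool:
    """Return True if this slot should make replay pause for `horizon`."""
    if slot.synced:
        return False  # owned by the slot-sync worker, never touched here
    if slot.effective_catalog_xmin == INVALID_XID:
        return False  # still inside DecodingContextFindStartpoint
    compare = xid_precedes_or_equals if inclusive else xid_precedes
    return compare(slot.effective_catalog_xmin, horizon)


# A previous pause cycle on horizon 100 left the slot at catalog_xmin 101.
slot = Slot("cdc", effective_catalog_xmin=101)
print(slot_would_pause(slot, horizon=101, inclusive=False))  # False: pause missed
print(slot_would_pause(slot, horizon=101, inclusive=True))   # True: pause taken
```

With the strict comparison the slot would not trigger a pause for a record with horizon 101, even though the invalidation path (which compares inclusively) would still invalidate it; the inclusive comparison closes that gap.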

10 assertions, ~30 wallclock seconds.

Two-phase flow: Phase 1 sets up an archive-only standby from a clean basebackup + pg_log_standby_snapshot and creates logical slots on TWO standbys while the archive contains no catalog-prune records. One standby has the GUC on, the other off. Phase 2 then runs a catalog-churning workload on the primary (transient tables + VACUUM on pg_class, pg_attribute, pg_type, pg_depend, pg_statistic) and waits for those segments to archive. When the standbys replay through those segments, the GUC-on one pauses; a Perl orchestrator drains the slot with pg_logical_slot_get_changes and calls pg_wal_replay_resume. The GUC-off baseline standby lets its slot invalidate, which is the upstream default behavior, unchanged.

A third standby is created after Phase 2 archives (so its replay will pause quickly on the first conflict record). The test then calls pg_promote(wait=>true, wait_seconds=>30) on the paused standby and asserts that promote returns true in under 10 seconds. This guards the CheckForStandbyTrigger() escape path: without it, pg_promote stalls for the full wait_seconds and returns false.

Assertions:

  ok 1 - GUC is registered
  ok 2 - slot created cleanly in Phase 1 (GUC on, state: reserved)
  ok 3 - baseline slot created cleanly in Phase 1 (GUC off, reserved)
  ok 4 - slot survived catalog prune with GUC on (reserved)
  ok 5 - at least one pause event was handled
  ok 6 - at least 2000 decoded events
  ok 7 - baseline (GUC off): slot invalidates as expected (lost)
  ok 8 - promote-test standby reached paused state before promotion
  ok 9 - pg_promote returned true while standby was paused by GUC
  ok 10 - pg_promote completed in under 10s
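
The drain-and-resume cycle that the test's Perl orchestrator performs can be modeled without a server. The sketch below is a Python stand-in, not the TAP test itself: `FakeStandby` and `drain_and_resume_model` are hypothetical, and the conflict positions and per-drain change counts are made-up inputs chosen only to exercise the loop.

```python
from typing import List, Tuple


class FakeStandby:
    """Toy standby: replay stops at each conflict position until resumed."""

    def __init__(self, conflict_positions: List[int], end_position: int,
                 changes_per_drain: int) -> None:
        self.conflicts = sorted(conflict_positions)
        self.end = end_position
        self.changes_per_drain = changes_per_drain
        self.position = 0
        self.paused = False

    def advance_replay(self) -> None:
        """Replay up to the next conflict record, or to the end of the archive."""
        if self.paused:
            return
        if self.conflicts:
            self.position = self.conflicts.pop(0)
            self.paused = True   # models the pause on a slot conflict
        else:
            self.position = self.end

    def drain(self) -> int:
        """Models pg_logical_slot_get_changes; returns the decoded row count."""
        return self.changes_per_drain

    def resume(self) -> None:
        """Models pg_wal_replay_resume()."""
        self.paused = False

    def done(self) -> bool:
        return not self.paused and self.position >= self.end


def drain_and_resume_model(standby: FakeStandby,
                           max_polls: int = 100) -> Tuple[int, int]:
    """Poll the standby; on every pause, drain the slot and resume replay."""
    pauses = 0
    events = 0
    for _ in range(max_polls):
        standby.advance_replay()  # a real standby replays on its own
        if standby.paused:
            events += standby.drain()
            standby.resume()
            pauses += 1
        elif standby.done():
            break
    return pauses, events


pauses, events = drain_and_resume_model(
    FakeStandby(conflict_positions=[10, 20, 30], end_position=40,
                changes_per_drain=700))
print(pauses, events)  # 3 2100
```

The returned pair corresponds loosely to what assertions 5 and 6 check (at least one handled pause, and a minimum number of decoded events); the numbers printed here come from the toy inputs, not from a real run.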

Extract five named subs so the top-level script reads as a sequence of phases rather than one long procedure. No behavior change: all 10 assertions are preserved verbatim, as are the load-bearing comments (two-phase rationale, double pg_switch_wal rationale, GUC-off baseline rationale, pg_promote escape-path rationale).

Helpers extracted:

* setup_primary_with_clean_archive
* create_archive_standby
* run_catalog_churn
* drain_and_resume_loop
* wait_for_replay_paused

The previous behavior under recovery_pause_on_logical_slot_conflict required the operator to both drain (or drop / advance) the slot AND call pg_wal_replay_resume() to continue: two steps, even though the first step is the one that matters semantically. That split also meant the feature couldn't underpin a continuous-CDC service without external orchestration to issue the resume.

Lift the scan predicate ("does any slot in `dboid` still block this conflict?") out of the initial check into a helper AnySlotStillBlocksConflict(). Call it again every 1s inside the existing wait loop. When it returns false, flip the pause state to NOT_PAUSED and let the loop exit; the existing post-wait advance then bumps catalog_xmin past the horizon on drained slots so the fall-through InvalidateObsoleteReplicationSlots() is a no-op.

"No longer blocking" covers every unblock path, not just drain:

* drained past the pause LSN (confirmed_flush >= captured conflict_lsn): the main case
* slot dropped (pg_drop_replication_slot): removed from the scan
* slot advanced (pg_replication_slot_advance): catalog_xmin moves past the horizon
* slot invalidated for another reason (e.g. RS_INVAL_WAL_REMOVED from max_slot_wal_keep_size, applied by the checkpointer, which runs even while the startup process is asleep in our wait loop): data.invalidated != RS_INVAL_NONE, so the scan skips it

Manual pg_wal_replay_resume() still works as the "give up on this slot and let it invalidate" escape hatch, and CheckForStandbyTrigger still breaks the loop for pg_promote().

Capture conflict_lsn once at pause time and reuse it for both the in-wait predicate and the post-wait advance, replacing the redundant second GetXLogReplayRecPtr() call.

GUC long_desc, postgresql.conf.sample comment, and the xlogrecovery.c variable-decl comment updated to describe auto-resume.
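
The auto-resume wait loop and its unblock paths can be sketched as a small Python model. This is an illustration under assumptions, not the patch's implementation: `Slot`, `any_slot_still_blocks`, and `wait_while_blocked` are hypothetical names, XIDs and LSNs are plain integers with wraparound ignored, and the order in which the promote and manual-resume checks run inside the loop is a guess.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

RECHECK_SECONDS = 1.0  # the predicate is re-evaluated every 1s while paused


@dataclass
class Slot:
    """Hypothetical stand-in for the slot fields the scan reads."""
    database: int
    catalog_xmin: int
    confirmed_flush: int        # LSN modeled as an integer
    synced: bool = False
    invalidated: bool = False   # models data.invalidated != RS_INVAL_NONE


def any_slot_still_blocks(slots: List[Slot], dboid: int,
                          horizon: int, conflict_lsn: int) -> bool:
    """True while at least one slot in `dboid` still blocks the conflict."""
    for slot in slots:  # a dropped slot is simply absent from the list
        if slot.database != dboid or slot.synced or slot.invalidated:
            continue
        if slot.catalog_xmin > horizon:
            continue  # advanced past the horizon (wraparound ignored here)
        if slot.confirmed_flush >= conflict_lsn:
            continue  # drained past the LSN captured at pause time
        return True
    return False


def wait_while_blocked(slots: List[Slot], dboid: int, horizon: int,
                       conflict_lsn: int,
                       promote_triggered: Callable[[], bool],
                       resume_requested: Callable[[], bool],
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Loop until promotion, a manual resume, or no slot blocks any more."""
    while True:
        if promote_triggered():
            return "promote"
        if resume_requested():
            return "manual_resume"
        if not any_slot_still_blocks(slots, dboid, horizon, conflict_lsn):
            return "auto_resume"
        sleep(RECHECK_SECONDS)


# Demo: the consumer drains the slot during the second sleep interval.
slots = [Slot(database=5, catalog_xmin=100, confirmed_flush=1000)]
ticks = []

def fake_sleep(_seconds: float) -> None:
    ticks.append(1)
    if len(ticks) == 2:
        slots[0].confirmed_flush = 5000  # consumer caught up past conflict_lsn

print(wait_while_blocked(slots, dboid=5, horizon=120, conflict_lsn=4000,
                         promote_triggered=lambda: False,
                         resume_requested=lambda: False,
                         sleep=fake_sleep))  # auto_resume
```

Injecting `sleep` keeps the model testable without real delays; in the patch itself the wait reuses the existing recovery-pause machinery rather than a plain sleep.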

Combines PR 27 (pause-on-conflict + TAP test) and PR 30 (refactor + auto-resume) into the story we would send to -hackers. Covers motivation, mechanism, edge cases, known limitations, tests, files touched, and open questions (GUC name, single-vs-mode flag, persistence, scope). Draft only — not sent.

NikolayS added 2 commits April 22, 2026 11:04

NikolayS changed the title ~~p2 of recovery_pause_on_logical_slot_conflict~~ p2 of recovery_pause_on_logical_slot_conflict (auto-resume) Apr 22, 2026

claude added 2 commits April 22, 2026 18:17

NikolayS force-pushed the claude/document-test-steps-Kh1ty branch from 69fb1b6 to 39adedd Compare April 22, 2026 18:17

NikolayS force-pushed the rfc-v1-recovery-pause-on-slot-conflict branch from ffd897c to 0ce8b52 Compare May 27, 2026 17:18

NikolayS wants to merge 5 commits into rfc-v1-recovery-pause-on-slot-conflict from claude/document-test-steps-Kh1ty

2 participants
