Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 85 additions & 0 deletions 0001-Fix-FK-fast-path-cache-loss-on-subtransaction-abort.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
From: Nikolay Samokhvalov <nik@postgres.ai>
Subject: [PATCH] Fix FK fast-path cache loss on internal subtransaction abort

ri_FastPathSubXactCallback() unconditionally NULLed the fast-path cache
on any SUBXACT_EVENT_ABORT_SUB, even when the cache's resources (open
relations, slots) were created in the parent transaction's ResourceOwner
and were not released by the subxact abort. This discarded pending FK
existence checks for rows that remain live, silently bypassing FK
constraint enforcement.

The bug is triggered when an AFTER ROW trigger on the FK table causes an
internal subtransaction abort (e.g., PL/pgSQL BEGIN...EXCEPTION...END
block) while the fast-path batch still has unflushed entries.

Fix: track the SubTransactionId at cache-creation time. In the
SubXactCallback, only tear down the cache when the aborting subxact
encompasses the creation point (create_subid >= mySubid), matching the
pattern used by portal and relcache subxact tracking.

---
src/backend/utils/adt/ri_triggers.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index dc89c68..ea17ea8 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -261,6 +261,7 @@ static dclist_head ri_constraint_cache_valid_list;

static HTAB *ri_fastpath_cache = NULL;
static bool ri_fastpath_callback_registered = false;
+static SubTransactionId ri_fastpath_create_subid = InvalidSubTransactionId;

/*
* Local function prototypes
@@ -4188,6 +4189,7 @@ ri_FastPathTeardown(void)
hash_destroy(ri_fastpath_cache);
ri_fastpath_cache = NULL;
ri_fastpath_callback_registered = false;
+ ri_fastpath_create_subid = InvalidSubTransactionId;
}

static bool ri_fastpath_xact_callback_registered = false;
@@ -4202,6 +4204,7 @@ ri_FastPathXactCallback(XactEvent event, void *arg)
*/
ri_fastpath_cache = NULL;
ri_fastpath_callback_registered = false;
+ ri_fastpath_create_subid = InvalidSubTransactionId;
}

static void
@@ -4211,12 +4214,20 @@ ri_FastPathSubXactCallback(SubXactEvent event, SubTransactionId mySubid,
if (event == SUBXACT_EVENT_ABORT_SUB)
{
/*
- * ResourceOwner already released relations. NULL the static pointers
- * so the still-registered batch callback becomes a no-op for the rest
- * of this transaction.
+ * Only tear down if the cache was created within the aborting
+ * subtransaction (or one of its children). SubTransactionIds are
+ * assigned monotonically, so create_subid >= mySubid means the
+ * cache's resources are owned by this subxact's ResourceOwner and
+ * were already released. If the cache was created in a parent
+ * context (create_subid < mySubid), its resources are still live
+ * and the pending FK checks must not be discarded.
*/
- ri_fastpath_cache = NULL;
- ri_fastpath_callback_registered = false;
+ if (ri_fastpath_create_subid >= mySubid)
+ {
+ ri_fastpath_cache = NULL;
+ ri_fastpath_callback_registered = false;
+ ri_fastpath_create_subid = InvalidSubTransactionId;
+ }
}
}

@@ -4255,6 +4266,7 @@ ri_FastPathGetEntry(const RI_ConstraintInfo *riinfo, Relation fk_rel)
16,
&ctl,
HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+ ri_fastpath_create_subid = GetCurrentSubTransactionId();
}

entry = hash_search(ri_fastpath_cache, &riinfo->constraint_id,
131 changes: 131 additions & 0 deletions fk_fastpath_bug_analysis.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
# FK Fast-Path Bug Analysis — PostgreSQL 19devel

## Confirmed

Built and tested against upstream/master (commit 84b9d6b). The bug reproduces
exactly as described. INSERT of `(999, 'bad')` succeeds silently despite no
matching PK row. The orphan persists and the constraint is still reported as
`convalidated = true`.

## Actual output

```
WARNING: resource was not closed: relation "pk"
WARNING: resource was not closed: relation "pk_pkey"
WARNING: resource was not closed: TupleDesc 0x7f4c27648380 (16392,-1)
WARNING: resource was not closed: TupleDesc 0x7f4c276493a0 (16386,-1)
INSERT 0 3

a | tag
-----+------
999 | bad
0 | boom
1 | ok

a | tag
-----+-----
999 | bad
```

## Root cause

In `ri_triggers.c`, `ri_FastPathSubXactCallback()` (line ~4208) unconditionally
NULLs `ri_fastpath_cache` and `ri_fastpath_callback_registered` on **any**
`SUBXACT_EVENT_ABORT_SUB`:

```c
static void
ri_FastPathSubXactCallback(SubXactEvent event, SubTransactionId mySubid,
SubTransactionId parentSubid, void *arg)
{
if (event == SUBXACT_EVENT_ABORT_SUB)
{
ri_fastpath_cache = NULL;
ri_fastpath_callback_registered = false;
}
}
```

The comment says "ResourceOwner already released relations" — but that's only
true for resources created **within** the aborting subxact. In the reproduction
case:

1. AFTER ROW triggers fire for row 1 `(999, 'bad')`:
- RI trigger fires first (alphabetical), calls `ri_FastPathBatchAdd()`
- This calls `ri_FastPathGetEntry()`, which opens `pk_rel` and `idx_rel`,
creates slots, and stores them in the cache
- Resources are owned by the **main transaction's ResourceOwner**
- The `zz_fk_after_row_boom` trigger fires — no-op for this row

2. AFTER ROW triggers fire for row 2 `(0, 'boom')`:
- RI trigger fires, adds to batch (cache already exists)
- `zz_fk_after_row_boom` trigger fires, hits `BEGIN...EXCEPTION...END`
- PL/pgSQL creates an internal subtransaction (savepoint)
- `RAISE EXCEPTION` aborts the subxact
- **`ri_FastPathSubXactCallback` fires → NULLs the entire cache**
- The exception is caught, subxact abort is handled, execution continues

3. AFTER ROW triggers fire for row 3 `(1, 'ok')`:
- RI trigger calls `ri_FastPathBatchAdd()` → `ri_FastPathGetEntry()`
- Cache is NULL, so it tries to create a new one
- But the old relations were **never closed** (owned by parent
ResourceOwner, not the aborted subxact) → resource leak warnings

4. At batch end, `ri_FastPathEndBatch()` flushes only the **new** cache
(containing only row 3). The buffered check for row 1 `(999, 'bad')` was
lost when the cache was NULLed.

## The comment is wrong

The design comment (lines 230-231) says:

> on abort, ResourceOwner releases the cached relations and the
> XactCallback/SubXactCallback NULL the static cache pointer

This is correct for **full transaction abort** (`XactCallback`), but wrong for
**subtransaction abort** (`SubXactCallback`). The fast-path resources are created
in the main trigger-firing context, not inside the subxact. Only the subxact's
own ResourceOwner is cleaned up on `SUBXACT_EVENT_ABORT_SUB`.

## Consequences

1. **Silent FK violation**: Orphan rows persist with no error
2. **Resource leaks**: Relations and slots owned by parent transaction are
abandoned (visible as warnings with debug builds)
3. **Constraint still reports valid**: `pg_constraint.convalidated = true` and
`ALTER TABLE ... VALIDATE CONSTRAINT` is a no-op for already-valid constraints
4. **No audit trail**: Nothing in the logs indicates the constraint was bypassed

## Fix direction

The SubXactCallback needs to be smarter about what it clears. Options:

1. **Track subxact nesting level**: Record which SubTransactionId (or nesting
depth) the cache was created under. Only NULL the cache if the aborting
subxact is at or above that level.

2. **Dedicated ResourceOwner**: Create a private ResourceOwner for fast-path
resources attached to the top transaction, so subxact cleanup can't touch
them. Then the SubXactCallback can simply ignore subxact aborts entirely.

3. **Re-flush on subxact abort**: Instead of discarding the cache, force a
flush of all buffered rows before clearing. This is tricky because the
flush itself can error, and error-during-error-recovery is dangerous.

Option 1 seems most aligned with how other Postgres subsystems handle this
pattern (e.g., Portal/Snapshot subxact tracking).

## Email review

The email is technically accurate and the reproduction case is correct. A few
suggestions:

- Add the actual output (especially the resource-leak warnings) — those are
strong corroborating evidence and help reviewers trust the report
- The "security issue" framing is debatable — FK constraints are a data
integrity feature, not an access-control mechanism. The silent bypass is
serious, but "data integrity bug" might land better with -hackers
- Consider noting the PostgreSQL version/commit hash for reproducibility
- The trigger naming trick (sorting after `RI_ConstraintTrigger*`) deserves
a brief explanation of why it matters — not all readers will immediately
see why the name is important
58 changes: 58 additions & 0 deletions fk_fastpath_bug_repro.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
-- Reproduction script for PostgreSQL 19devel FK fast-path bug
-- ri_FastPathSubXactCallback drops pending FK checks on internal subxact abort
--
-- Tested on: PostgreSQL 19devel (upstream/master commit 84b9d6b)
-- Result: INSERT succeeds, orphan row (a=999) persists despite FK constraint

drop table if exists fk;
drop table if exists pk;

create table pk (
id int primary key
);

create table fk (
a int references pk(id),
tag text
);

insert into pk values (0), (1);

create or replace function fk_after_row_boom() returns trigger
as $$
begin
if new.tag = 'boom' then
begin
raise exception 'internal subxact abort';
exception when others then
null;
end;
end if;

return new;
end;
$$
language plpgsql;

-- Trigger name sorts after RI_ConstraintTrigger*, so RI FK trigger buffers first
create trigger zz_fk_after_row_boom
after insert on fk
for each row
execute function fk_after_row_boom();

-- Expected: ERROR for FK violation on a=999
-- Actual (bug): INSERT 0 3 succeeds, plus resource-leak warnings
insert into fk(a, tag)
values
(999, 'bad'),
(0, 'boom'),
(1, 'ok');

-- Shows orphan row if bug is present
select * from fk order by tag;

-- Orphan detection query
select fk.*
from fk
left join pk on pk.id = fk.a
where pk.id is null;