Summary
When a block height is reorged, the orphaned block's transactions/events rows are left behind and the new canonical block's rows are inserted alongside them — producing duplicate rows at the same (block_height, tx_index) (different tx_ids). The Streams events query filters canonicality by blocks.canonical at that height (which both orphaned and canonical rows satisfy), so orphaned rows leak into candidate_events. That inflates the stream_event_index COUNT and makes two distinct events resolve to the same cursor.
Downstream this wedged the L2 decoders: a decode batch with two same-cursor rows fails the decoded_events upsert (ON CONFLICT DO UPDATE command cannot affect row a second time) and the decoder loops forever. (stx_transfer/stx_mint were ~15h behind before the hotfix.)
Root cause
- events has no canonical column (only block_height); transactions has neither canonical nor block_hash (only block_height).
- reorg.ts only marks blocks.canonical = false; it never touches events/transactions.
- Ingest (packages/indexer/src/index.ts:367–452) inserts txs/events with onConflict … doNothing and never deletes, so reorged heights accumulate orphaned rows.
- The Streams query (packages/indexer/src/streams-events.ts, stream_event_index COUNT joining events → transactions) double-counts the orphaned rows → colliding cursors. This affects the Streams events surface itself, not just the decoder.
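To make the collision concrete, here is a minimal in-memory sketch. It assumes the cursor has the form `<block_height>:<stream_event_index>`, with stream_event_index counting the events ordered before the current one by (tx_index, event_index); the real cursor format and the query in streams-events.ts may differ. All type and function names here are hypothetical.

```typescript
// Hypothetical row shapes; the real tables carry more columns.
interface TxRow { txId: string; blockHeight: number; txIndex: number }
interface EventRow { txId: string; blockHeight: number; eventIndex: number }

// Assumed cursor scheme: "<height>:<count of events ordered before this one>".
function cursorsAtHeight(height: number, txs: TxRow[], events: EventRow[]): string[] {
  // Join events -> transactions at the height. Neither table has a
  // canonical flag, so orphaned rows cannot be filtered out here.
  const joined: { txIndex: number; eventIndex: number }[] = [];
  for (const e of events) {
    if (e.blockHeight !== height) continue;
    for (const t of txs) {
      if (t.blockHeight === height && t.txId === e.txId) {
        joined.push({ txIndex: t.txIndex, eventIndex: e.eventIndex });
      }
    }
  }
  return joined.map((e) => {
    const before = joined.filter(
      (o) => o.txIndex < e.txIndex || (o.txIndex === e.txIndex && o.eventIndex < e.eventIndex)
    ).length;
    return `${height}:${before}`;
  });
}

// One orphaned and one canonical tx share (block_height, tx_index) = (100, 0).
const sampleTxs: TxRow[] = [
  { txId: "0xorphan", blockHeight: 100, txIndex: 0 },
  { txId: "0xcanon", blockHeight: 100, txIndex: 0 },
];
const sampleEvents: EventRow[] = [
  { txId: "0xorphan", blockHeight: 100, eventIndex: 0 },
  { txId: "0xcanon", blockHeight: 100, eventIndex: 0 },
];

// Two distinct events, one cursor: both entries are "100:0".
const collidingCursors = cursorsAtHeight(100, sampleTxs, sampleEvents);
```

Under this assumed scheme the two events are indistinguishable by position, which is the same shape of collision the decoder hit.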
Evidence
SELECT block_height, tx_index, count(*)
FROM transactions WHERE block_height BETWEEN 8088743 AND 8088760
GROUP BY block_height, tx_index HAVING count(*) > 1;
-- n=2 for blocks 8088744+
Decoder logs: repeated l2_decoder.error … ON CONFLICT DO UPDATE command cannot affect row a second time for l2.stx_transfer.v1 / l2.stx_mint.v1.
Hotfix (shipped)
writeDecodedEvents now de-dupes by cursor before the upsert (commit f195618a) — stops the decoder wedge. Defense-in-depth; does not fix the underlying leak.
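The idea behind the hotfix can be sketched as below. This is not the code from commit f195618a: DecodedEvent and dedupeByCursor are hypothetical names, and keeping the last row seen per cursor is an assumption about which duplicate should win.

```typescript
// Hypothetical shape of a decoded row; only the cursor matters here.
interface DecodedEvent { cursor: string; payload: string }

// Keep one row per cursor so that a single INSERT ... ON CONFLICT DO UPDATE
// statement never targets the same row twice. A later row replaces an
// earlier one with the same cursor, keeping the earlier row's position.
function dedupeByCursor(rows: DecodedEvent[]): DecodedEvent[] {
  const position: Record<string, number> = {};
  const out: DecodedEvent[] = [];
  for (const row of rows) {
    const at = position[row.cursor];
    if (at === undefined) {
      position[row.cursor] = out.length;
      out.push(row);
    } else {
      out[at] = row;
    }
  }
  return out;
}

const dedupedBatch = dedupeByCursor([
  { cursor: "100:0", payload: "from orphaned tx" },
  { cursor: "100:0", payload: "from canonical tx" },
  { cursor: "100:1", payload: "unrelated" },
]);
```

Note that a de-dupe like this picks a winner by batch order only; it cannot tell which of the two rows is canonical, which is why the root fix is still needed.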
Fix — Option A: replace-per-height ingest
blocks already replaces-by-height (onConflict(height).doUpdateSet); make transactions/events do the same.
- T1 (root fix): in the new_block txn, delete transactions+events at block_height before re-inserting (both *_block_height_idx exist → cheap; atomic in the existing txn). The node only emits canonical blocks, so the height ends up holding exactly the canonical set. Self-healing for future reorgs. During the reorg window there is no canonical block at the height, so the Streams b.canonical join already returns nothing: no leak.
- T2 (one-time cleanup, required): dedupe existing transactions/events: keep MAX(created_at) per (block_height, tx_index) (later insert = new chain = canonical), delete the rest + their events by tx_id. (Heuristic; precise alternative is re-ingesting affected blocks from the node.)
- T3 (hardening, optional): reorg.ts also deletes transactions/events at block_height >= fork_point.
- T4: Streams-events test over a reorged-height fixture asserting unique cursors.
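T1 and the property T4 would assert can be sketched with an in-memory stand-in for the two tables. HeightStore, replaceHeight and duplicateSlots are hypothetical names, not the handler in index.ts; the real change would issue the deletes and inserts inside the existing new_block database transaction.

```typescript
// Hypothetical row shapes for the sketch.
interface StoredTx { txId: string; blockHeight: number; txIndex: number }
interface StoredEvent { txId: string; blockHeight: number; eventIndex: number }
interface IncomingBlock { height: number; txs: StoredTx[]; events: StoredEvent[] }

// In-memory stand-in for the transactions/events tables.
class HeightStore {
  txs: StoredTx[] = [];
  events: StoredEvent[] = [];

  // T1: drop every row at the incoming height, then insert the new rows.
  // Assumes the incoming block is canonical (see the open question on
  // whether the emitter can POST orphaned blocks).
  replaceHeight(block: IncomingBlock): void {
    this.txs = this.txs.filter((t) => t.blockHeight !== block.height).concat(block.txs);
    this.events = this.events.filter((e) => e.blockHeight !== block.height).concat(block.events);
  }

  // T4-style check: list (block_height, tx_index) slots held by more than one tx.
  duplicateSlots(): string[] {
    const counts: Record<string, number> = {};
    for (const t of this.txs) {
      const key = `${t.blockHeight}:${t.txIndex}`;
      counts[key] = (counts[key] || 0) + 1;
    }
    return Object.keys(counts).filter((key) => counts[key] > 1);
  }
}

const heightStore = new HeightStore();
// First the block that will later be orphaned, then its replacement.
heightStore.replaceHeight({
  height: 100,
  txs: [{ txId: "0xorphan", blockHeight: 100, txIndex: 0 }],
  events: [{ txId: "0xorphan", blockHeight: 100, eventIndex: 0 }],
});
heightStore.replaceHeight({
  height: 100,
  txs: [{ txId: "0xcanon", blockHeight: 100, txIndex: 0 }],
  events: [{ txId: "0xcanon", blockHeight: 100, eventIndex: 0 }],
});
```

After the second call the store holds only the 0xcanon rows at height 100, and duplicateSlots() returns an empty list.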
Risks / open questions
- Perf of delete-per-block at high catch-up/backfill rates (indexed → cheap, but measure).
- T2's "latest created_at = canonical" heuristic.
- Confirm the Stacks event emitter never POSTs an orphaned block to /new_block (only canonical). If it can, T1's "incoming = canonical" assumption needs a guard (the handler already does parent-hash checks ~index.ts:340).
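The T2 heuristic listed above ("latest created_at = canonical") can be sketched as a pure function that picks the tx_ids to delete. DatedTx and orphanedTxIds are hypothetical names and createdAt is modelled as a number; the actual cleanup would more likely be a SQL statement against the transactions table followed by a delete of events by tx_id.

```typescript
// Hypothetical shape: a transactions row with its insert timestamp.
interface DatedTx { txId: string; blockHeight: number; txIndex: number; createdAt: number }

function slotKey(row: DatedTx): string {
  return `${row.blockHeight}:${row.txIndex}`;
}

// Returns the tx_ids to delete: every row that is not the newest
// created_at for its (block_height, tx_index) slot. On equal timestamps
// the first row seen is kept, which is exactly where the heuristic is weak.
function orphanedTxIds(rows: DatedTx[]): string[] {
  const newest: Record<string, DatedTx> = {};
  for (const row of rows) {
    const current = newest[slotKey(row)];
    if (current === undefined || row.createdAt > current.createdAt) {
      newest[slotKey(row)] = row;
    }
  }
  return rows
    .filter((row) => {
      const keep = newest[slotKey(row)];
      return keep !== undefined && keep.txId !== row.txId;
    })
    .map((row) => row.txId);
}

const toDelete = orphanedTxIds([
  { txId: "0xorphan", blockHeight: 100, txIndex: 0, createdAt: 1000 },
  { txId: "0xcanon", blockHeight: 100, txIndex: 0, createdAt: 2000 },
  { txId: "0xsolo", blockHeight: 101, txIndex: 0, createdAt: 1500 },
]);
```

Here only 0xorphan is selected for deletion. A height that was reorged more than once, or rows inserted out of order during backfill, could break the "later insert = canonical" assumption, so re-ingesting from the node remains the precise option.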
Related
- The 90-day L2 backfill (packages/indexer/src/l2/BACKFILL.md) should run after T1 + T2.
- Surfaced while diagnosing the stx_lock/decoder rollout.