perf(storage): conditional MATERIALIZED CTE fence for ListTransactions (backport v2.4)#1410
perf(storage): conditional MATERIALIZED CTE fence for ListTransactions (backport v2.4)#1410sylr wants to merge 4 commits into
Conversation
A selective wallet-history ListTransactions attached ORDER BY id DESC LIMIT n to the same select as a JSONB @> filter, so the planner chose an abort-early transactions_id_desc walk that scanned ~2.3M rows (16.1s) to return 16. Wrapping the filtered dataset in a MATERIALIZED CTE and moving ORDER BY + LIMIT to the outer select lets the planner pick the GIN BitmapOr over the filtered set (~7.7ms, verified on prod read-replica). The fence is applied conditionally, only when the filter contains a "needle" containment predicate (account/source/destination/metadata), via the new opt-in DatasetFencer interface implemented by the transactions handler. Unfiltered/range-only lists keep the historical non-materialized shape, where abort-early is the faster plan. The Paginator contract is split into ApplyCursorPredicate (keyset/offset predicate that stays inside the fence) and ApplyWindow (LIMIT/OFFSET that move to the outer select); the existing outer ORDER BY remains the single order source and is applied before the window. Non-fence SQL is unchanged. Constraint: JSONB @> selectivity + value<->id correlation can't be fixed by statistics or extended stats; pg_hint_plan not installed Constraint: GIN indexes on sources/destinations/metadata are schema migrations (43/51/52), present on all migrated clusters Rejected: Blanket-wrap every list in a MATERIALIZED CTE | regresses broad/unfiltered lists (materializes millions then sorts) Rejected: Runtime cost probe to detect selectivity | static filter-shape decision is simpler and sufficient per findings doc Confidence: high Scope-risk: moderate Directive: ApplyWindow must NOT emit ORDER BY — the caller applies the qualified outer ORDER BY first; do not fold ordering back in Directive: The fenced dataset CTE must never carry an inner LIMIT/OFFSET, or the abort-early walk re-triggers Directive: Do not extend the moved-LIMIT pattern to a fan-out (non-1:1) expand without re-checking; effectiveVolumes is 1:1 per tx Not-tested: negated-only metadata filter (NOT metadata @>) and non-selective needle values (account="world") fence without benefit but never change results
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## release/v2.4 #1410 +/- ##
================================================
- Coverage 80.20% 79.80% -0.40%
================================================
Files 206 206
Lines 11280 11324 +44
================================================
- Hits 9047 9037 -10
- Misses 1598 1603 +5
- Partials 635 684 +49 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughSplits pagination into cursor-keyset predicate vs LIMIT/OFFSET window, adds an opt-in DatasetFencer to emit MATERIALIZED dataset CTEs for fenced queries, refactors ResourceRepository to resolve build context and choose fenced vs unfenced pagination flows, implements transaction fencing heuristics, and adds tests and a gomock. ChangesDataset fencing and pagination refactoring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The new DatasetFencer interface in storage/common/resource.go is picked up by `go generate` (mocks_test.go is generated from resource.go), so the generated mock must be committed or the `Dirty` CI check fails. Directive: mocks_test.go is generated from internal/storage/common/resource.go — re-run `just generate` (mockgen) after changing interfaces there, do not hand-edit
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/storage/ledger/resource_transactions_test.go (1)
1-80:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winMissing
mockUseFiltertype definition causes compilation failure.The test instantiates
&mockUseFilter{filters: tc.filters}at line 75, but the file does not define themockUseFiltertype or its requiredUseFiltermethod. Since this is a new file shown in its entirety (lines 1–80 all annotated with~), the missing definition will cause a compilation error.🐛 Proposed fix: add mockUseFilter implementation
Insert the following definition before the test function:
import ( "testing" "github.com/stretchr/testify/assert" ) +type mockUseFilter struct { + filters map[string][]any +} + +func (m *mockUseFilter) UseFilter(key string, matchers ...func(any) bool) bool { + values, ok := m.filters[key] + if !ok { + return false + } + if len(matchers) == 0 { + return true + } + for _, value := range values { + allMatch := true + for _, matcher := range matchers { + if !matcher(value) { + allMatch = false + break + } + } + if allMatch { + return true + } + } + return false +} + func TestShouldFenceTransactionsDataset(t *testing.T) {This matches the
RepositoryHandlerBuildContext.UseFiltersemantics frominternal/storage/common/resource.go:83-106.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/storage/ledger/resource_transactions_test.go` around lines 1 - 80, The test fails to compile because mockUseFilter is not defined; add a mock implementation named mockUseFilter with a field filters map[string][]any and implement the UseFilter(ctx context.Context) (map[string][]any, error) method (matching the RepositoryHandlerBuildContext.UseFilter semantics) so shouldFenceTransactionsDataset(mock) can call it; place this type and method definition above TestShouldFenceTransactionsDataset and ensure it returns the stored filters and nil error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@internal/storage/ledger/resource_transactions_test.go`:
- Around line 1-80: The test fails to compile because mockUseFilter is not
defined; add a mock implementation named mockUseFilter with a field filters
map[string][]any and implement the UseFilter(ctx context.Context)
(map[string][]any, error) method (matching the
RepositoryHandlerBuildContext.UseFilter semantics) so
shouldFenceTransactionsDataset(mock) can call it; place this type and method
definition above TestShouldFenceTransactionsDataset and ensure it returns the
stored filters and nil error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 91fa43b1-15a0-4962-9900-87511bf94400
📒 Files selected for processing (7)
internal/storage/common/paginator.gointernal/storage/common/paginator_column.gointernal/storage/common/paginator_offset.gointernal/storage/common/resource.gointernal/storage/ledger/resource_transactions.gointernal/storage/ledger/resource_transactions_fence_test.gointernal/storage/ledger/resource_transactions_test.go
…case Addresses a CodeRabbit review note: the metadata filter value should be a string key, not a map. The fence decision only checks filter presence, so behavior is unchanged — this is for semantic accuracy of the test fixture.
|
@coderabbitai — re: the "outside diff range" finding that This is a false positive. type mockUseFilter struct {
filters map[string][]any
}
func (m *mockUseFilter) UseFilter(key string, matchers ...func(any) bool) bool { ... }Both Applying the suggested fix (adding a second No change needed; resolving. |
|
Tip For best results, initiate chat on the files or code changes.
✏️ Learnings added
|
Addresses @flemzord's review on #1409: in the fenced path the `dataset` CTE held the entire materialized filtered set, and the `effectiveVolumes` expand (`transactions_id in (select id from dataset)`) therefore aggregated over every matching transaction to return a single page — linear in account history, even for a selective (but busy) account, not just the documented account="world" case. Restructure the fenced shape so the MATERIALIZED fence and the page window are separate, nested CTEs: WITH dataset AS ( WITH filtered AS MATERIALIZED (<filter + PIT + keyset>) SELECT * FROM filtered ORDER BY id <dir> LIMIT pageSize+1 ) SELECT * FROM dataset LEFT JOIN <expands on "select id from dataset"> ... ORDER BY dataset.id <dir> The LIMIT stays outside the *materialized* CTE (so the planner still evaluates `filtered` once via the GIN BitmapOr and the page is a cheap top-N over it), but now lives inside the `dataset` CTE the expands reference — so expand work is bounded to the page again, restoring the implicit "dataset = rows returned" contract. The expand() helper drops its now-unused `materialized` parameter. Constraint: effectiveVolumes expand references "select id from dataset" — the page LIMIT must be inside that CTE or the expand scans the full filtered set Rejected: Keep LIMIT on the outermost select | expands then aggregate over the whole filtered set (the bug this fixes) Confidence: high Scope-risk: moderate Directive: do not move the page ORDER BY/LIMIT back to the outer select — it must stay inside the "dataset" CTE so expands see only the page
Backport of #1409 to
release/v2.4.Problem
A selective wallet-history
ListTransactionsattachedORDER BY id DESC LIMIT nto the same select as a JSONB@>filter, so the planner chose an abort-earlytransactions_id_descindex walk that scanned ~2.3M rows (16.1s) to return 16.Fix
Wrap the filtered dataset in a
MATERIALIZEDCTE and moveORDER BY id DESC LIMIT nto the outer select, so the planner uses the GINBitmapOr(~7.7ms, ~2,000×). The fence is conditional — applied only when the filter contains a "needle" containment predicate (account/source/destination/metadata) via the opt-inDatasetFencerinterface; unfiltered/range-only lists keep the historical shape. No schema change (GIN indexes already exist).Backport notes
main(PR perf(storage): conditional MATERIALIZED CTE fence for ListTransactions #1409, commit801a5619); the only auto-merge was context inpaginator.go.go build,go vet, and the fence integration tests (TestListTransactionsMaterializedFence,TestListTransactionsFencedPagination,TestListTransactionsFencedWithEffectiveVolumes,TestShouldFenceTransactionsDataset) all pass.See #1409 for full details and the tri-model review.