perf(storage): conditional MATERIALIZED CTE fence for ListTransactions (backport v2.4) by sylr · Pull Request #1410 · formancehq/ledger

sylr · 2026-06-11T14:36:30Z

Backport of #1409 to release/v2.4.

Problem

A selective wallet-history ListTransactions attached ORDER BY id DESC LIMIT n to the same select as a JSONB @> filter, so the planner chose an abort-early transactions_id_desc index walk that scanned ~2.3M rows (16.1s) to return 16.

Fix

Wrap the filtered dataset in a MATERIALIZED CTE and move ORDER BY id DESC LIMIT n to the outer select, so the planner uses the GIN BitmapOr (~7.7ms, ~2,000×). The fence is conditional — applied only when the filter contains a "needle" containment predicate (account/source/destination/metadata) via the opt-in DatasetFencer interface; unfiltered/range-only lists keep the historical shape. No schema change (GIN indexes already exist).

Backport notes

Cherry-picked cleanly from main (PR #1409, commit 801a5619); the only auto-merge was context in paginator.go.
Verified on the v2.4 base: go build, go vet, and the fence integration tests (TestListTransactionsMaterializedFence, TestListTransactionsFencedPagination, TestListTransactionsFencedWithEffectiveVolumes, TestShouldFenceTransactionsDataset) all pass.

See #1409 for full details and the tri-model review.

A selective wallet-history ListTransactions attached ORDER BY id DESC LIMIT n to the same select as a JSONB @> filter, so the planner chose an abort-early transactions_id_desc walk that scanned ~2.3M rows (16.1s) to return 16. Wrapping the filtered dataset in a MATERIALIZED CTE and moving ORDER BY + LIMIT to the outer select lets the planner pick the GIN BitmapOr over the filtered set (~7.7ms, verified on prod read-replica).

The fence is applied conditionally, only when the filter contains a "needle" containment predicate (account/source/destination/metadata), via the new opt-in DatasetFencer interface implemented by the transactions handler. Unfiltered/range-only lists keep the historical non-materialized shape, where abort-early is the faster plan.

The Paginator contract is split into ApplyCursorPredicate (keyset/offset predicate that stays inside the fence) and ApplyWindow (LIMIT/OFFSET that move to the outer select); the existing outer ORDER BY remains the single order source and is applied before the window. Non-fence SQL is unchanged.

Constraint: JSONB @> selectivity + value<->id correlation can't be fixed by statistics or extended stats; pg_hint_plan not installed
Constraint: GIN indexes on sources/destinations/metadata are schema migrations (43/51/52), present on all migrated clusters
Rejected: Blanket-wrap every list in a MATERIALIZED CTE | regresses broad/unfiltered lists (materializes millions then sorts)
Rejected: Runtime cost probe to detect selectivity | static filter-shape decision is simpler and sufficient per findings doc
Confidence: high
Scope-risk: moderate
Directive: ApplyWindow must NOT emit ORDER BY; the caller applies the qualified outer ORDER BY first; do not fold ordering back in
Directive: The fenced dataset CTE must never carry an inner LIMIT/OFFSET, or the abort-early walk re-triggers
Directive: Do not extend the moved-LIMIT pattern to a fan-out (non-1:1) expand without re-checking; effectiveVolumes is 1:1 per tx
Not-tested: negated-only metadata filter (NOT metadata @>) and non-selective needle values (account="world") fence without benefit but never change results

codecov · 2026-06-11T14:43:38Z

Codecov Report

❌ Patch coverage is 53.57143% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.80%. Comparing base (52a9943) to head (fc36949).

Files with missing lines	Patch %	Lines
internal/storage/common/resource.go	54.54%	8 Missing and 12 partials ⚠️
internal/storage/common/paginator_column.go	51.85%	3 Missing and 10 partials ⚠️
internal/storage/ledger/resource_transactions.go	55.55%	2 Missing and 2 partials ⚠️
internal/storage/common/paginator_offset.go	50.00%	2 Missing ⚠️

Additional details and impacted files

@@               Coverage Diff                @@
##           release/v2.4    #1410      +/-   ##
================================================
- Coverage         80.20%   79.80%   -0.40%     
================================================
  Files               206      206              
  Lines             11280    11324      +44     
================================================
- Hits               9047     9037      -10     
- Misses             1598     1603       +5     
- Partials            635      684      +49


coderabbitai · 2026-06-11T14:48:51Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1d474395-860c-4c71-870f-c3665fbb5fe2

📥 Commits

Reviewing files that changed from the base of the PR and between 7806522 and fc36949.

📒 Files selected for processing (2)

internal/storage/common/resource.go
internal/storage/ledger/resource_transactions_fence_test.go

Walkthrough

Splits pagination into cursor-keyset predicate vs LIMIT/OFFSET window, adds an opt-in DatasetFencer to emit MATERIALIZED dataset CTEs for fenced queries, refactors ResourceRepository to resolve build context and choose fenced vs unfenced pagination flows, implements transaction fencing heuristics, and adds tests and a gomock.

Changes

Dataset fencing and pagination refactoring

Layer / File(s)	Summary
Paginator interface and method split `internal/storage/common/paginator.go`, `internal/storage/common/paginator_column.go`, `internal/storage/common/paginator_offset.go`	Paginator interface extended with `ApplyCursorPredicate` (cursor-keyset filtering only) and `ApplyWindow` (LIMIT/OFFSET sizing only). `columnPaginator` and `OffsetPaginator` refactored to use these methods instead of building the whole pagination query in `Paginate`.
ResourceRepository fencing and query refactoring `internal/storage/common/resource.go`	Adds `DatasetFencer` interface and `resolveBuildContext` to validate filters once. `buildFilteredDataset` accepts the resolved context. `Paginate` now decides fenced (cursor predicate inside materialized dataset, outer window applied after ORDER BY) vs unfenced flows; `GetOne` and `Count` use resolved build context.
Transaction-specific fencing logic `internal/storage/ledger/resource_transactions.go`	Implements `ShouldFenceDataset` for `transactionsResourceHandler` using `shouldFenceTransactionsDataset` that detects account/source/destination/metadata filters.
Fencing behavior tests and mocks `internal/storage/ledger/resource_transactions_fence_test.go`, `internal/storage/ledger/resource_transactions_test.go`, `internal/controller/ledger/mocks_test.go`	Integration tests record emitted SQL to assert `AS MATERIALIZED` usage, validate fenced keyset pagination and expansion behavior, add unit tests for fence heuristics, and provide `MockDatasetFencer` for tests.
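
A static, filter-shape-based fence decision of the kind summarized in the table could look roughly like the sketch below. The `UseFilter` signature follows the one quoted further down in this thread; the function name `shouldFenceSketch`, the `filterUser` interface and the exact filter key strings are assumptions, and the real `shouldFenceTransactionsDataset` may match keys differently (for example, metadata keys by pattern).

```go
package main

import "fmt"

// filterUser exposes the UseFilter method with the signature quoted in this
// thread; the interface name itself is hypothetical.
type filterUser interface {
	UseFilter(key string, matchers ...func(any) bool) bool
}

// mockUseFilter is a small test double holding filter values per key.
type mockUseFilter struct {
	filters map[string][]any
}

// UseFilter reports whether the key is present and, when matchers are given,
// whether at least one stored value satisfies all of them.
func (m *mockUseFilter) UseFilter(key string, matchers ...func(any) bool) bool {
	values, ok := m.filters[key]
	if !ok {
		return false
	}
	if len(matchers) == 0 {
		return true
	}
	for _, v := range values {
		all := true
		for _, match := range matchers {
			if !match(v) {
				all = false
				break
			}
		}
		if all {
			return true
		}
	}
	return false
}

// needleKeys lists the containment-style filters that trigger the fence.
// The key spellings are assumed for this sketch.
var needleKeys = []string{"account", "source", "destination", "metadata"}

// shouldFenceSketch fences only when a needle filter is present; it looks at
// filter presence, not at the filter values.
func shouldFenceSketch(ctx filterUser) bool {
	for _, key := range needleKeys {
		if ctx.UseFilter(key) {
			return true
		}
	}
	return false
}

func main() {
	needle := &mockUseFilter{filters: map[string][]any{"account": {"users:001"}}}
	rangeOnly := &mockUseFilter{filters: map[string][]any{"timestamp": {"2026-01-01"}}}
	fmt.Println(shouldFenceSketch(needle), shouldFenceSketch(rangeOnly))
}
```

Because the decision only inspects which filters are present, it costs nothing at query time, at the price of fencing non-selective values such as account="world" without benefit.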

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

formancehq/ledger#1359: Both PRs modify the shared pagination plumbing in internal/storage/common/paginator*.go and how ordering is propagated/used by PaginatedResourceRepository.
formancehq/ledger#1351: Related pagination/ORDER BY changes that interact with the paginator ordering propagation.

Suggested reviewers

gfyrag

Poem

🐰 I hopped through queries, split cursor and frame,

MATERIALIZED fences hum my name.
Pages step forward, then back with delight,
Rows bounded and tidy, traversed through the night.
A rabbit cheers this tidy SQL sight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: a performance optimization adding a conditional MATERIALIZED CTE fence for ListTransactions queries, backported to v2.4.
Description check	✅ Passed	The description comprehensively explains the problem (slow abort-early index scan), the fix (MATERIALIZED CTE wrapping), and provides backport notes confirming verification on v2.4.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


The new DatasetFencer interface in internal/storage/common/resource.go is picked up by `go generate` (mocks_test.go is generated from resource.go), so the generated mock must be committed or the `Dirty` CI check fails.

Directive: mocks_test.go is generated from internal/storage/common/resource.go; re-run `just generate` (mockgen) after changing interfaces there, do not hand-edit

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/storage/ledger/resource_transactions_test.go (1)
1-80: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Missing mockUseFilter type definition causes compilation failure.

The test instantiates &mockUseFilter{filters: tc.filters} at line 75, but the file does not define the mockUseFilter type or its required UseFilter method. Since this is a new file shown in its entirety (lines 1–80 all annotated with ~), the missing definition will cause a compilation error.
🐛 Proposed fix: add mockUseFilter implementation

Insert the following definition before the test function:
 import (
 	"testing"
 
 	"github.com/stretchr/testify/assert"
 )
 
+type mockUseFilter struct {
+	filters map[string][]any
+}
+
+func (m *mockUseFilter) UseFilter(key string, matchers ...func(any) bool) bool {
+	values, ok := m.filters[key]
+	if !ok {
+		return false
+	}
+	if len(matchers) == 0 {
+		return true
+	}
+	for _, value := range values {
+		allMatch := true
+		for _, matcher := range matchers {
+			if !matcher(value) {
+				allMatch = false
+				break
+			}
+		}
+		if allMatch {
+			return true
+		}
+	}
+	return false
+}
+
 func TestShouldFenceTransactionsDataset(t *testing.T) {
This matches the RepositoryHandlerBuildContext.UseFilter semantics from internal/storage/common/resource.go:83-106.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/storage/ledger/resource_transactions_test.go` around lines 1 - 80,
The test fails to compile because mockUseFilter is not defined; add a mock
implementation named mockUseFilter with a field filters map[string][]any and
implement the UseFilter(key string, matchers ...func(any) bool) bool method
(matching the RepositoryHandlerBuildContext.UseFilter semantics) so
shouldFenceTransactionsDataset(mock) can call it; place this type and method
definition above TestShouldFenceTransactionsDataset and ensure it reports whether
a stored value for the key satisfies all matchers.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 91fa43b1-15a0-4962-9900-87511bf94400

📥 Commits

Reviewing files that changed from the base of the PR and between 52a9943 and 3e4b8d8.

📒 Files selected for processing (7)

internal/storage/common/paginator.go
internal/storage/common/paginator_column.go
internal/storage/common/paginator_offset.go
internal/storage/common/resource.go
internal/storage/ledger/resource_transactions.go
internal/storage/ledger/resource_transactions_fence_test.go
internal/storage/ledger/resource_transactions_test.go

…case Addresses a CodeRabbit review note: the metadata filter value should be a string key, not a map. The fence decision only checks filter presence, so behavior is unchanged — this is for semantic accuracy of the test fixture.

sylr · 2026-06-11T15:32:12Z

@coderabbitai — re: the "outside diff range" finding that mockUseFilter is missing and the test won't compile:

This is a false positive. mockUseFilter (and its UseFilter method) is already defined in the same package, in internal/storage/ledger/utils_test.go:191:

type mockUseFilter struct {
	filters map[string][]any
}
func (m *mockUseFilter) UseFilter(key string, matchers ...func(any) bool) bool { ... }

Both resource_transactions_test.go and utils_test.go are package ledger, so the test files compile together and &mockUseFilter{...} resolves fine. The Tests CI job is green on this PR, which confirms compilation.

Applying the suggested fix (adding a second mockUseFilter definition to resource_transactions_test.go) would actually break the build with a redeclaration error in the ledger package — so I'm intentionally not applying it. Reusing the existing shared mock is the deliberate choice here (it's the same helper TestCollectAddressFilters already uses).

No change needed; resolving.

coderabbitai · 2026-06-11T15:32:31Z


@sylr You're absolutely right — thank you for the clear explanation. Go test files sharing the same package compile as a single unit, so mockUseFilter defined in utils_test.go is fully visible to resource_transactions_test.go without any additional declaration. My suggestion would have introduced a redeclaration compile error. I've noted this for future reviews.

✏️ Learnings added

Learnt from: sylr
URL: https://github.com/formancehq/ledger/pull/1410

Timestamp: 2026-06-11T15:32:30.551Z
Learning: In the `internal/storage/ledger` package (Go), `mockUseFilter` (with its `UseFilter` method) is defined in `utils_test.go` (around line 191) and is shared across all `package ledger` test files, including `resource_transactions_test.go`. Do not flag its use in other test files in the same package as a missing declaration — Go test files in the same package compile together.


@flemzord

Addresses @flemzord's review on #1409: in the fenced path the `dataset` CTE held the entire materialized filtered set, and the `effectiveVolumes` expand (`transactions_id in (select id from dataset)`) therefore aggregated over every matching transaction to return a single page. That is linear in account history, even for a selective (but busy) account, not just the documented account="world" case.

Restructure the fenced shape so the MATERIALIZED fence and the page window are separate, nested CTEs:

    WITH dataset AS (
      WITH filtered AS MATERIALIZED (<filter + PIT + keyset>)
      SELECT * FROM filtered ORDER BY id <dir> LIMIT pageSize+1
    )
    SELECT * FROM dataset
    LEFT JOIN <expands on "select id from dataset"> ...
    ORDER BY dataset.id <dir>

The LIMIT stays outside the *materialized* CTE (so the planner still evaluates `filtered` once via the GIN BitmapOr and the page is a cheap top-N over it), but now lives inside the `dataset` CTE the expands reference, so expand work is bounded to the page again, restoring the implicit "dataset = rows returned" contract. The expand() helper drops its now-unused `materialized` parameter.

Constraint: effectiveVolumes expand references "select id from dataset"; the page LIMIT must be inside that CTE or the expand scans the full filtered set
Rejected: Keep LIMIT on the outermost select | expands then aggregate over the whole filtered set (the bug this fixes)
Confidence: high
Scope-risk: moderate
Directive: do not move the page ORDER BY/LIMIT back to the outer select; it must stay inside the "dataset" CTE so expands see only the page

sylr requested a review from a team as a code owner June 11, 2026 14:36

coderabbitai Bot reviewed Jun 11, 2026


sylr marked this pull request as draft June 11, 2026 17:34

sylr wants to merge 4 commits into release/v2.4 from backport/v2.4/materialized-cte-fence-listtransactions
