Optimize brain-region filter queries with a materialized CTE #631
Open
DriesVerachtert wants to merge 1 commit into
When filtering by within_brain_region, the PostgreSQL planner previously chose to scan all public entities via the ix_entity_public_creation_date_id partial index (driven by the ORDER BY creation_date DESC + LIMIT), then join into the brain-region recursive CTE, resulting in O(all_public_entities × CTE_rows) comparisons.

The fix introduces a MATERIALIZED CTE (candidate_artifacts) that pre-computes the set of matching entity IDs and their creation_date values before the outer query runs. Because a materialized CTE acts as an optimization fence, the planner cannot inline it and is forced to evaluate it first.

When the primary sort is creation_date DESC (the default for all affected endpoints), the ORDER BY is redirected to reference candidate_artifacts.creation_date instead of entity.creation_date. This makes the planner start from the small pre-filtered CTE rather than from the full entity index.

Changes:
- filter_by_region now builds and returns the MATERIALIZED CTE alongside the updated query (was a bare recursive-CTE join before).
- InBrainRegionQuery stores the returned CTE via PrivateAttr and exposes it as candidate_cte so callers can reference it after the filter is applied.
- router_read_many always applies filter_model.sort() first, then replaces the ORDER BY with CTE-based columns only when the primary sort is creation_date DESC.

openbraininstitute/prod-platform-architecture#216
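To make the query shape concrete, here is a minimal sketch of a recursive region CTE feeding a MATERIALIZED candidate_artifacts CTE, with the ORDER BY pointed at the CTE's creation_date. It is not the PR's code: the schema, the region_tree name, and the helper functions are hypothetical, and it runs against an in-memory SQLite database purely so it is self-contained. SQLite accepts the same AS MATERIALIZED syntax from version 3.35, but its planner is not PostgreSQL's, so this illustrates only the SQL structure, not the performance effect.

```python
import sqlite3

# Hypothetical, heavily simplified schema; the real tables and columns differ.
SCHEMA = """
CREATE TABLE brain_region (id INTEGER PRIMARY KEY, parent_id INTEGER);
CREATE TABLE entity (
    id INTEGER PRIMARY KEY,
    brain_region_id INTEGER,
    creation_date TEXT,
    name TEXT
);
"""


def build_query(materialized=True):
    """Return the SQL text; the MATERIALIZED keyword is optional for old SQLite."""
    fence = "MATERIALIZED " if materialized else ""
    return f"""
    WITH RECURSIVE region_tree(id) AS (
        -- Start at the requested region and walk down to all descendants.
        SELECT id FROM brain_region WHERE id = :root
        UNION ALL
        SELECT br.id FROM brain_region AS br
        JOIN region_tree AS rt ON br.parent_id = rt.id
    ),
    candidate_artifacts AS {fence}(
        -- Pre-filtered entity IDs plus the sort column.
        SELECT e.id, e.creation_date
        FROM entity AS e
        JOIN region_tree AS rt ON e.brain_region_id = rt.id
    )
    SELECT entity.id, entity.name
    FROM candidate_artifacts
    JOIN entity ON entity.id = candidate_artifacts.id
    -- Sort on the CTE column rather than entity.creation_date.
    ORDER BY candidate_artifacts.creation_date DESC, candidate_artifacts.id DESC
    LIMIT :limit
    """


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Region 1 -> 2 -> 3 form one subtree; region 4 is unrelated.
    conn.executemany(
        "INSERT INTO brain_region VALUES (?, ?)",
        [(1, None), (2, 1), (3, 2), (4, None)],
    )
    conn.executemany(
        "INSERT INTO entity VALUES (?, ?, ?, ?)",
        [
            (1, 2, "2024-01-01", "a"),
            (2, 3, "2024-03-01", "b"),
            (3, 4, "2024-05-01", "c"),  # outside the subtree, must be excluded
            (4, 1, "2024-02-01", "d"),
        ],
    )
    # AS MATERIALIZED needs SQLite >= 3.35; fall back to a plain CTE otherwise.
    supports_fence = sqlite3.sqlite_version_info >= (3, 35, 0)
    rows = conn.execute(build_query(supports_fence), {"root": 1, "limit": 2}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_demo())  # [(2, 'b'), (4, 'd')]
```

On PostgreSQL, comparing EXPLAIN (ANALYZE) output for the query with and without the MATERIALIZED keyword is the usual way to confirm that the plan now starts from the CTE scan.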
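The ORDER BY redirection can be sketched as a small decision function. This is an illustration under assumptions, not the router's actual logic: the "-column" convention for descending sort keys, the redirect_order_by name, and the choice to leave secondary sort keys on entity are all hypothetical.

```python
CTE_NAME = "candidate_artifacts"  # CTE name taken from the description


def redirect_order_by(sort_keys, cte_name=CTE_NAME):
    """Build an ORDER BY clause from keys such as ["-creation_date", "name"].

    Assumed convention: a leading "-" means DESC, otherwise ASC.
    Only a primary creation_date DESC key is redirected to the CTE;
    every other key keeps referencing the entity table.
    """
    if not sort_keys:
        return ""
    clauses = []
    for position, key in enumerate(sort_keys):
        descending = key.startswith("-")
        column = key.lstrip("-+")
        direction = "DESC" if descending else "ASC"
        use_cte = position == 0 and column == "creation_date" and descending
        table = cte_name if use_cte else "entity"
        clauses.append(f"{table}.{column} {direction}")
    return "ORDER BY " + ", ".join(clauses)


if __name__ == "__main__":
    # Primary sort is creation_date DESC: redirected to the CTE column.
    print(redirect_order_by(["-creation_date", "name"]))
    # Any other primary sort: the ORDER BY stays on entity.
    print(redirect_order_by(["name", "-creation_date"]))
```

Keeping the redirect conditional means that queries sorted by any other column fall back to the unchanged ORDER BY, so only the default creation_date DESC path is affected.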
https://github.com/openbraininstitute/prod-platform-architecture/issues/216