features: materialize oversized result sets into a table #563
Open
cportele wants to merge 1 commit into
When a result set exceeded the materialization cap, it could not be inlined as a literal IN list, so the query fell back to re-deriving the producer inline in every consuming sub-query. The same multi-table producer (joins, spatial and temporal filters) was re-run once per consumer, dozens of times for a shared set.

Materialize such a set once into an indexed, session-independent table and reference it from the consumers as IN (SELECT <value column> FROM <table>), so the producer runs a single time and each consumer only scans the table. The table-backed reference flows into both consumer and producer filters, so oversized sets in a chain reference one another's tables. This is gated on a dialect capability; dialects without it keep the inline re-evaluation fallback.

Table names are unique per request (a per-instance token plus a per-call sequence), so concurrent requests and multiple provider instances on the same database never collide. The names created for a request are collected, and the tables are dropped when its feature stream completes, on success or error.
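The flow described above can be sketched as follows. This is a minimal illustration in Python against an in-memory SQLite database, not the code in this PR: the helper names (`next_table_name`, `request_tables`, `materialize`), the `rs_` prefix, the `value` column and the sample schema are all assumptions made for the example, and an ordinary (non-temporary) table stands in for the session-independent table.

```python
import itertools
import sqlite3
import uuid
from contextlib import contextmanager

# Hypothetical per-instance token and per-call sequence; combined, they keep
# table names unique across concurrent requests and provider instances.
INSTANCE_TOKEN = uuid.uuid4().hex[:8]
_SEQUENCE = itertools.count(1)


def next_table_name() -> str:
    return f"rs_{INSTANCE_TOKEN}_{next(_SEQUENCE)}"


@contextmanager
def request_tables(conn):
    """Collect the tables created for one request and drop them at the end."""
    created = []
    try:
        yield created
    finally:
        # Runs when the block exits, on success and on error alike.
        for name in created:
            conn.execute(f"DROP TABLE IF EXISTS {name}")


def materialize(conn, created, producer_sql: str) -> str:
    """Run the producer once, store its values in an indexed table, and
    return the SQL fragment consumers use to reference that table."""
    name = next_table_name()
    conn.execute(
        f"CREATE TABLE {name} AS SELECT DISTINCT value FROM ({producer_sql})"
    )
    created.append(name)  # register before indexing so a failure still cleans up
    conn.execute(f"CREATE INDEX {name}_idx ON {name} (value)")
    return f"IN (SELECT value FROM {name})"


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE district (id INTEGER PRIMARY KEY, population INTEGER);
    CREATE TABLE building (id INTEGER PRIMARY KEY, district_id INTEGER);
    CREATE TABLE road (id INTEGER PRIMARY KEY, district_id INTEGER);
    INSERT INTO district VALUES (1, 500), (2, 50), (3, 900);
    INSERT INTO building VALUES (10, 1), (11, 2), (12, 3);
    INSERT INTO road VALUES (20, 2), (21, 3);
""")

# Stand-in for the expensive multi-table producer.
producer = "SELECT id AS value FROM district WHERE population > 100"

with request_tables(conn) as created:
    ref = materialize(conn, created, producer)  # producer runs a single time
    # Both consumers reuse the same table-backed reference.
    buildings = [row[0] for row in conn.execute(
        f"SELECT id FROM building WHERE district_id {ref} ORDER BY id")]
    roads = [row[0] for row in conn.execute(
        f"SELECT id FROM road WHERE district_id {ref} ORDER BY id")]
    table_names = list(created)

# After the request, none of the materialized tables remain.
remaining = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE name LIKE 'rs_%'"
).fetchone()[0]
```

In this sketch the context manager plays the role of the feature stream's completion hook: the table names are registered as they are created, and the `finally` branch drops them whether the consumers succeed or raise.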
Depends on #562