feat(sql): lower IN subqueries into joins #2489
Open
Treasure520520 wants to merge 1 commit into
a1382b0 to d560869
Author
Adding the short demo asset requested by the Algora bounty guidelines. It summarizes the scoped [...] The remaining external blocker I can see is the CLA status.
Author
CLA is now signed. The demo GIF and validation notes are in the PR body, so this is ready for maintainer review from my side. Thanks!
/claim #1659
Proposed Changes
This PR adds a scoped first implementation for the remaining `IN (SELECT ...)` streaming SQL path:
- Handles `Expr::InSubquery` predicates in the SQL pipeline builder.
- Lowers `lhs IN (SELECT rhs FROM ...)` into the existing join pipeline by creating an inner join on `lhs = subquery_alias.rhs`.
- Reuses `JoinProcessorFactory` and current join ports instead of introducing a new processor.
- [...] `customer_id IN (...)` can resolve against the left side of the generated join.
- Rejects multi-column `IN` subqueries and `NOT IN (SELECT ...)` explicitly instead of silently building an unsupported selection expression.
Demo
Short demo asset: https://github.com/Treasure520520/dozer/releases/download/dozer-1659-demo-2489/dozer-1659-in-subquery-demo.gif
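For readers who prefer code to a GIF, the lowering described under Proposed Changes can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Subquery`, `Expr`, `InnerJoin`, `LowerError`, and `lower_in_subquery` are hypothetical stand-ins, and the real builder works on the SQL parser's AST and Dozer's join pipeline types.

```rust
// Hypothetical, simplified stand-ins for the SQL AST. These are NOT the
// types used by Dozer or by its SQL parser; they only illustrate the idea.
#[derive(Debug, Clone, PartialEq)]
pub struct Subquery {
    /// Names of the projected columns, e.g. ["customer_id"].
    pub projection: Vec<String>,
    /// Source table of the subquery, e.g. "allowed_customers".
    pub from: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    InSubquery {
        expr: Box<Expr>,
        subquery: Subquery,
        negated: bool,
    },
}

/// Hypothetical description of the inner join that replaces the predicate.
#[derive(Debug, Clone, PartialEq)]
pub struct InnerJoin {
    pub right_table: String,
    pub right_alias: String,
    pub left_key: String,
    /// Qualified as `<alias>.<column>`.
    pub right_key: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum LowerError {
    Unsupported(String),
}

/// Lower `lhs IN (SELECT rhs FROM t)` into an inner join on
/// `lhs = alias.rhs`, rejecting shapes the sketch does not handle.
pub fn lower_in_subquery(pred: &Expr, alias: &str) -> Result<InnerJoin, LowerError> {
    let Expr::InSubquery { expr, subquery, negated } = pred else {
        return Err(LowerError::Unsupported("not an IN subquery".into()));
    };
    if *negated {
        // NOT IN needs anti-join semantics, which an inner join cannot express.
        return Err(LowerError::Unsupported("NOT IN (SELECT ...)".into()));
    }
    // Exactly one projected column is required to form the join key.
    let [rhs] = subquery.projection.as_slice() else {
        return Err(LowerError::Unsupported("multi-column IN subquery".into()));
    };
    // Only a plain column on the left-hand side is handled in this sketch.
    let Expr::Column(lhs) = &**expr else {
        return Err(LowerError::Unsupported("non-column left-hand side".into()));
    };
    Ok(InnerJoin {
        right_table: subquery.from.clone(),
        right_alias: alias.to_string(),
        left_key: lhs.clone(),
        right_key: format!("{alias}.{rhs}"),
    })
}
```

Under these assumptions, a predicate such as `customer_id IN (SELECT customer_id FROM allowed_customers)` with alias `allowed` lowers to a join on `customer_id = allowed.customer_id`, while the negated and multi-column forms return `LowerError::Unsupported` instead of producing a selection expression.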
Proof
Added builder regression coverage:
- `test_in_subquery_is_lowered_into_pipeline_inputs` checks that a query with `orders.customer_id IN (SELECT allowed_customers.customer_id FROM allowed_customers)` builds a pipeline that pulls both `orders` and `allowed_customers` as inputs and registers the requested output table.
- `test_in_subquery_rejects_multi_column_projection` checks that multi-column subquery projections fail early with `UnsupportedSqlError`.
Validation
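Independent of the toolchain checks, the rewrite exercised by the regression tests can be sanity-checked on in-memory rows. The sketch below is only an illustration with hypothetical names (`Order`, `filter_in`, `inner_join_left`); it is not part of the PR's test suite. It compares the `IN` filter with the lowered inner join for the `orders` / `allowed_customers` query, and it assumes the subquery yields distinct values, since a plain inner join emits one row per matching right-side row.

```rust
use std::collections::HashSet;

/// Hypothetical row type standing in for the `orders` table.
#[derive(Debug, Clone, PartialEq)]
pub struct Order {
    pub id: u32,
    pub customer_id: u32,
}

/// Reference semantics of `orders.customer_id IN (SELECT customer_id ...)`:
/// keep each order at most once if its customer is in the allowed set.
pub fn filter_in(orders: &[Order], allowed: &[u32]) -> Vec<Order> {
    let set: HashSet<u32> = allowed.iter().copied().collect();
    orders
        .iter()
        .filter(|o| set.contains(&o.customer_id))
        .cloned()
        .collect()
}

/// Lowered form: nested-loop inner join on
/// `orders.customer_id = allowed.customer_id`, projecting the left side.
/// Assumption: `allowed` holds distinct values; duplicates on the right
/// would produce one output row per match.
pub fn inner_join_left(orders: &[Order], allowed: &[u32]) -> Vec<Order> {
    let mut out = Vec::new();
    for order in orders {
        for customer_id in allowed {
            if order.customer_id == *customer_id {
                out.push(order.clone());
            }
        }
    }
    out
}
```

With distinct allowed customers, both functions return the same orders in the same order. The two diverge only when the right side contains duplicates, which is the case a lowering of this kind has to account for.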
Passed locally:
PATH=/Users/treasure/Documents/Codex/2026-05-12/goal-100-paypal-puresenses2021-gmail-com/tools/rustup/toolchains/1.91.1-aarch64-apple-darwin/bin:$PATH rustfmt --check dozer-sql/src/builder/mod.rs dozer-sql/src/builder/tests.rs
git diff --check
Attempted but blocked by local macOS toolchain setup before crate compilation:
The failure occurs while linking dependency build scripts (`libc`, `serde`, `proc-macro2`, etc.) with `cc`, with exit status 69 and the message:
`You have not agreed to the Xcode license agreements.`
No project crate type-checking was reached on this machine.
Checklist