
feat(sql): lower IN subqueries into joins #2489

Open
Treasure520520 wants to merge 1 commit into getdozer:main from Treasure520520:bounty/1659-in-subquery

Treasure520520 commented May 12, 2026

/claim #1659

Proposed Changes

This PR adds a scoped first implementation for the remaining IN (SELECT ...) streaming SQL path:

  • Detects single-column Expr::InSubquery predicates in the SQL pipeline builder.
  • Lowers lhs IN (SELECT rhs FROM ...) into the existing join pipeline by creating an inner join on lhs = subquery_alias.rhs.
  • Reuses the existing JoinProcessorFactory and current join ports instead of introducing a new processor.
  • Qualifies an unqualified outer identifier with the outer table/alias when available, so customer_id IN (...) can resolve against the left side of the generated join.
  • Rejects multi-column IN subqueries and NOT IN (SELECT ...) explicitly instead of silently building an unsupported selection expression.
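The lowering described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `InSubquery`, `InnerJoin`, `LowerError`, `qualify`, and `lower_in_subquery` are hypothetical stand-ins for the real sqlparser and dozer-sql types, and the subquery alias is passed in by the caller as an assumption.

```rust
// Hypothetical, simplified types for illustration only; these are not
// the dozer-sql or sqlparser definitions.
#[derive(Debug, Clone, PartialEq)]
pub struct InSubquery {
    pub lhs: String,             // outer expression, e.g. "customer_id"
    pub projection: Vec<String>, // columns selected by the subquery
    pub from: String,            // table the subquery reads from
    pub negated: bool,           // true for NOT IN (SELECT ...)
}

#[derive(Debug, Clone, PartialEq)]
pub struct InnerJoin {
    pub left_table: String,
    pub right_table: String,
    pub right_alias: String,
    pub left_key: String,  // qualified outer column
    pub right_key: String, // subquery column, qualified with the alias
}

#[derive(Debug, Clone, PartialEq)]
pub enum LowerError {
    Unsupported(String),
}

/// Prefix `ident` with `table` unless it is already qualified.
fn qualify(ident: &str, table: &str) -> String {
    if ident.contains('.') {
        ident.to_string()
    } else {
        format!("{table}.{ident}")
    }
}

/// Lower `lhs IN (SELECT rhs FROM t)` into an inner join on
/// `lhs = alias.rhs`, rejecting the shapes the PR lists as unsupported.
pub fn lower_in_subquery(
    outer_table: &str,
    pred: &InSubquery,
    alias: &str,
) -> Result<InnerJoin, LowerError> {
    if pred.negated {
        return Err(LowerError::Unsupported("NOT IN (SELECT ...)".to_string()));
    }
    // Exactly one projected column is required.
    let [rhs] = pred.projection.as_slice() else {
        return Err(LowerError::Unsupported(
            "IN subquery must project exactly one column".to_string(),
        ));
    };
    // Drop any table qualifier so the column can be re-qualified with the alias.
    let rhs_col = rhs.rsplit('.').next().unwrap_or(rhs.as_str());
    Ok(InnerJoin {
        left_table: outer_table.to_string(),
        right_table: pred.from.clone(),
        right_alias: alias.to_string(),
        left_key: qualify(&pred.lhs, outer_table),
        right_key: format!("{alias}.{rhs_col}"),
    })
}
```

With these assumed types, lowering `customer_id IN (SELECT allowed_customers.customer_id FROM allowed_customers)` against the outer table `orders` with alias `sq` would yield a join on `orders.customer_id = sq.customer_id`. Note that an inner join matches IN semantics only when the subquery column contains no duplicate values; the sketch does not attempt deduplication.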

Demo

Short demo asset: https://github.com/Treasure520520/dozer/releases/download/dozer-1659-demo-2489/dozer-1659-in-subquery-demo.gif

Proof

Added builder regression coverage:

  • test_in_subquery_is_lowered_into_pipeline_inputs checks that a query with orders.customer_id IN (SELECT allowed_customers.customer_id FROM allowed_customers) builds a pipeline that pulls both orders and allowed_customers as inputs and registers the requested output table.
  • test_in_subquery_rejects_multi_column_projection checks that multi-column subquery projections fail early with UnsupportedSqlError.
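The property the first test checks, that both tables end up as pipeline inputs, can be pictured with a small sketch. `Plan` and `input_tables` here are hypothetical and only model the idea; the actual tests in this PR go through the dozer-sql builder instead.

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for a lowered query plan; not a dozer-sql type.
pub enum Plan {
    Scan { table: String },
    InnerJoin { left: Box<Plan>, right: Box<Plan> },
}

/// Collect the names of every source table the plan reads from.
pub fn input_tables(plan: &Plan) -> BTreeSet<String> {
    let mut tables = BTreeSet::new();
    collect(plan, &mut tables);
    tables
}

// Walk the plan recursively, recording each scanned table once.
fn collect(plan: &Plan, out: &mut BTreeSet<String>) {
    match plan {
        Plan::Scan { table } => {
            out.insert(table.clone());
        }
        Plan::InnerJoin { left, right } => {
            collect(left, out);
            collect(right, out);
        }
    }
}
```

Under this model, a join of a scan over `orders` with a scan over `allowed_customers` reports exactly those two tables as inputs, which is the shape of assertion the regression test makes against the built pipeline.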

Validation

Passed locally:

PATH=/Users/treasure/Documents/Codex/2026-05-12/goal-100-paypal-puresenses2021-gmail-com/tools/rustup/toolchains/1.91.1-aarch64-apple-darwin/bin:$PATH rustfmt --check dozer-sql/src/builder/mod.rs dozer-sql/src/builder/tests.rs
git diff --check

Attempted but blocked by local macOS toolchain setup before crate compilation:

PATH=/Users/treasure/Documents/Codex/2026-05-12/goal-100-paypal-puresenses2021-gmail-com/tools/rustup/toolchains/1.91.1-aarch64-apple-darwin/bin:$PATH cargo test -p dozer-sql builder::tests::test_in_subquery --no-default-features

The build fails while linking dependency build scripts (libc, serde, proc-macro2, etc.) with cc, which exits with status 69 and the message "You have not agreed to the Xcode license agreements." As a result, no project crate was compiled or type-checked on this machine, so the new tests have not been run locally.

Checklist

  • PR created against the default branch
  • Tests added for the new builder behavior
  • Formatting for touched files checked
  • Validation blockers documented


CLAassistant commented May 12, 2026

CLA assistant check
All committers have signed the CLA.

Treasure520520 force-pushed the bounty/1659-in-subquery branch from a1382b0 to d560869 on May 12, 2026 21:12
Treasure520520 (Author) commented

Adding the short demo asset requested by the Algora bounty guidelines:

Demo GIF: https://github.com/Treasure520520/dozer/releases/download/dozer-1659-demo-2489/dozer-1659-in-subquery-demo.gif

It summarizes the scoped IN (SELECT ...) builder lowering, the positive and negative regression coverage, and the current local validation status: rustfmt --check and git diff --check passed, while the focused cargo test run remains blocked on this machine by the unaccepted Xcode license agreement before any crate compiles.

The remaining external blocker I can see is the CLA status.

Treasure520520 (Author) commented

CLA is now signed. The demo GIF and validation notes are in the PR body, so this is ready for maintainer review from my side. Thanks!
