refactor: move from 3-variant index search result to single generic variant#6902
refactor: move from 3-variant index search result to single generic variant#6902westonpace wants to merge 10 commits into
Conversation
|
@claude review once |
There was a problem hiding this comment.
An organization admin can view or raise the cap at claude.ai/admin-settings/claude-code. The cap resets at the start of the next billing period.
Once the cap resets or is raised, comment @claude review on this pull request to trigger a review.
f06ede2 to
3b4aa35
Compare
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Xuanwo
left a comment
There was a problem hiding this comment.
Mostly LGTM. We need to mark this PR as breaking changes.
| /// Result of an index search before NULL rows are dropped. Carries | ||
| /// three-valued-logic information via [`NullableRowAddrMask`]. | ||
| #[derive(Debug)] | ||
| pub enum NullableIndexExprResult { |
There was a problem hiding this comment.
Will this affect our running services while doing upgrading?
There was a problem hiding this comment.
Not anymore. I've updated it so that the wire format is still the same as the old wire format. It can't represent all masks though. We will need to choose the newer wire format at some point.
wjones127
left a comment
There was a problem hiding this comment.
Agreed it should be marked as breaking if serialization changes.
…r} interval
Replaces the `enum IndexExprResult { Exact(M), AtMost(M), AtLeast(M) }`
representation with a single struct carrying two row-address masks:
pub struct IndexExprResult {
pub lower: RowAddrMask, // rows definitely in the answer
pub upper: RowAddrMask, // rows that may be in the answer
} // (rows outside upper are definitely out)
Same change applied to `NullableIndexExprResult`. The three pre-existing
shapes map to degenerate intervals:
Exact(m) ≡ {lower: m, upper: m}
AtMost(m) ≡ {lower: allow_nothing(), upper: m}
AtLeast(m)≡ {lower: m, upper: all_rows()}
…and are now constructed via `IndexExprResult::{exact, at_most, at_least}`
associated fns / inspected via `is_exact()` / `is_at_most()` /
`is_at_least()` predicates. Intervals that are none of those — a
non-empty lower strictly inside a non-universe upper, the "Refined" case
— are now representable; previously the algebra had to either pick a
side (e.g. `AtMost(m) & AtLeast(_) → AtMost(m)`, dropping the lower
bound) or fail to produce one.
The boolean algebra collapses to elementwise lattice ops:
!{l, u} = {!u, !l}
{l1, u1} & {l2, u2} = {l1 & l2, u1 & u2}
{l1, u1} | {l2, u2} = {l1 | l2, u1 | u2}
This works for both `IndexExprResult` and `NullableIndexExprResult` —
the underlying `RowAddrMask` / `NullableRowAddrMask` types already
implement two-valued and SQL three-valued logic correctly inside each
mask, and lifting elementwise to the interval preserves both. NULL info
lives inside each `NullableRowAddrMask` (via its `nulls` field), and
`NullableRowAddrMask::Not` flips Allow↔Block without touching `nulls`,
so `!{l, u} = {!u, !l}` preserves NULL on both endpoints.
Wire format change: the (mask, discriminant) layout serialized by
`ScalarIndexExec` is now (lower, upper). Two binary columns instead of
one binary + one u32. The schema is internal to the in-process hand-off
from `ScalarIndexExec` to `FilteredReadExec`, so this isn't a
persisted-format break.
Other touchpoints:
* `apply_index_to_fragment` in filtered_read switches from a 3-arm
`match` to a predicate chain. The three current shapes preserve
their existing read-plan semantics; the Refined case is handled
conservatively (read `upper`, recheck via `full_filter`). When
per-range filter pushdown lands, the Refined arm will split
`to_read` into guaranteed-match ranges (skip recheck) and recheck
ranges (apply full filter).
* `MaterializeIndexExec` in scalar_index.rs uses `result.upper` for
the candidate-row set. `AtLeast` still bails (upper is universe);
everything else works.
* Per-fn `NullableRowAddrMask::allow_nothing()` and `all_rows()`
constructors added so the `at_most` / `at_least` builders have
short, obvious empty/universe sentinels.
* `NullableRowAddrMask` gains `PartialEq` so `is_exact()` can check
`lower == upper`.
* `IndexExprResult::{from_parts, serialize_to_arrow}` (inherent
methods on the old enum) → `index_expr_result_from_parts(lower,
upper)` / `serialize_index_expr_result(result, frags)` free
functions on `lance-index::scalar::expression`. The schema
`INDEX_EXPR_RESULT_SCHEMA` keeps the same name but its columns are
now (`lower`, `upper`, `fragments_covered`).
* Tests in expression.rs rewritten to use the new constructors and
`is_at_most()` / `is_at_least()` predicates. All assertions still
hold under the new algebra (verified by the bench: every
cross-variant combination tested lands in the same degenerate
corner of the lattice as before).
Bench deltas at N=10M rows. Both runs at criterion
`--measurement-time 3 --warm-up-time 2 --sample-size 30`, same workload
(single contiguous run, AllowList with no nulls), same machine, idle
CPU. Median time per op:
Op Variant pair Baseline 2-mask Delta
not Exact 8 ns 13 ns +60%
not AtMost / AtLeast 8 ns 14 ns +75%
and Exact_Exact 7.24 µs 13.84 µs +91%
and AtMost_AtMost 7.49 µs 7.32 µs -2%
and AtLeast_AtLeast 7.22 µs 7.35 µs +2%
and Exact_AtMost 7.15 µs 10.16 µs +42%
and Exact_AtLeast 1.63 µs 7.19 µs +341% (was a drop)
and AtMost_AtLeast 1.66 µs 1.76 µs +6%
or Exact_Exact 3.82 µs 7.54 µs +97%
or AtMost_AtMost 3.90 µs 3.87 µs -1%
or AtLeast_AtLeast 3.82 µs 4.00 µs +5%
or Exact_AtMost 3.67 µs 3.75 µs +2%
or Exact_AtLeast 3.74 µs 12.49 µs +234% (was a drop)
or AtMost_AtLeast 3.04 µs 11.70 µs +285% (was a drop)
Reading the deltas:
* NOT: a few extra ns (one mask flip → two endpoint flips + swap).
Irrelevant in absolute terms.
* Same-variant `AtMost_AtMost` / `AtLeast_AtLeast`: ~unchanged. The
second mask op composes empty-with-empty or universe-with-universe
and short-circuits.
* `Exact_Exact`: ~1.9-2× slower. Both endpoints carry the full mask;
we genuinely do two mask ops instead of one. Expected.
* `Exact_AtMost` AND: ~1.4× slower. New result still lands in AtMost
(lower stays empty after `a & empty`), but the upper-side `a & b`
is unchanged from before — the +42% is the second op + the empty
intersection cost.
* Cases that the old algebra resolved by dropping a bound
(`Exact_AtLeast` for both, `AtMost_AtLeast` for OR) are 3-4×
slower in absolute terms. The old code returned without doing a
mask op; the new code computes both endpoints and preserves both
bounds. The follow-up optimization here is to add
`BlockList(empty)`/`AllowList(empty)` fast paths to
`NullableRowAddrMask::{BitAnd, BitOr}`, which would catch the
universe/empty operands directly.
Trade-off: a few extra microseconds per binary op on 10M-row masks (no
impact on query latency, which is millisecond-scale and dominated by
I/O) in exchange for richer interval results. The Refined case unlocks
the actual `IS NOT NULL` acceleration we couldn't reach before — a
zone-map IsNotNull search can now produce
`{lower: definitely_non_null_zones, upper: not_all_null_zones}` and
the read planner can both skip all-null zones (the I/O win) and skip
the recheck on the guaranteed-non-null zones.
Verified:
* lance-select --lib: 97 passed
* lance-index --lib: 302 passed
* lance-table --lib: 104 passed
* lance --lib io::exec::{filtered_read,scalar_index}: 25 passed
* `cargo clippy --workspace --tests -- -D warnings`: clean
* `cargo fmt --all --check`: clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onal-algebra v1
Adds a backwards-compat path for the `ScalarIndexExec` → `FilteredReadExec`
hand-off so the prior commit's `(lower, upper)` schema change doesn't
break older read planners. Introduces `LANCE_RELATIONAL_ALGEBRA_VERSION`
(currently `1`) as the knob that picks between legacy and new wire
formats; while it stays at 1, `ScalarIndexExec` emits and advertises the
pre-refactor `{result, discriminant, fragments_covered}` shape.
Pieces:
* `lance-index::scalar::expression`
- `LEGACY_INDEX_EXPR_RESULT_SCHEMA`: the legacy 3-shape schema kept
alongside the new `{lower, upper, fragments_covered}` one.
- `legacy_serialize_index_expr_result`: writes the legacy layout
from an `IndexExprResult`. Refined intervals (lower strictly
inside a non-universe upper) can't be encoded in the 3-shape
format, so they `tracing::warn!` and degrade to `AtMost(upper)`
— `upper` is already a valid superset and `AtMost` signals the
consumer to recheck.
- `deserialize_index_expr_result`: centralized deserialization
that handles both legacy and new schemas based on the first
column name (`result` vs `lower`).
- New-format `serialize_index_expr_result` now omits `upper` when
`result.is_exact()` by writing two nulls. `RowAddrMask::into_arrow`
always produces exactly one non-null row, so a fully-null `upper`
can't collide with any real mask; the deserializer reuses `lower`
in that case. Saves a full mask payload on every exact result.
* `lance::io::exec` / `scalar_index.rs`
- `LANCE_RELATIONAL_ALGEBRA_VERSION` constant.
- Local `serialize_index_expr_result` wrapper that dispatches to
legacy vs new based on the version.
- Local `INDEX_EXPR_RESULT_SCHEMA` `LazyLock` that resolves to the
legacy schema under v1 and the new schema after. Used by
`ScalarIndexExec` for `PlanProperties`, `schema()`,
`partition_statistics`, and the stream adapter so the plan's
advertised schema and the emitted batch agree.
- Fixed a now-dead-code bug in the prior `legacy_serialize_index_expr_result`
draft that built the `RecordBatch` with the new schema while
supplying legacy `(Binary, UInt32, Binary)` columns — would have
failed `RecordBatch::try_new` on every call.
* `lance::dataset::scanner` / `lance::io::exec::filtered_read`
- `scanner::u64s_as_take_input` switched to the version-aware
schema wrapper so the take-source stream's advertised schema
matches what the wrapper actually emits.
- `EvaluatedIndex::try_from_arrow` now calls the centralized
`deserialize_index_expr_result` instead of re-doing the
column-by-column decoding inline.
Tests added (4):
* `scalar::expression::test_serialize_index_expr_result_round_trip`
— round-trips `exact`/`at_most`/`at_least` through serialize +
deserialize and asserts both endpoints and the fragments bitmap
survive.
* `scalar::expression::test_serialize_omits_upper_when_exact` —
pins the wire-format invariant: exact ⇒ upper fully null, at_most
⇒ upper carries payload, at_least (upper = all_rows) still
encodes a non-null row 0 and round-trips with `is_at_least()`.
* `scalar::expression::test_legacy_serialize_refined_degrades_to_at_most`
— constructs a refined `IndexExprResult` and asserts the legacy
serializer doesn't error: the round-tripped result is `AtMost`
carrying the original `upper`.
* `io::exec::scalar_index::test_scalar_index_exec_returns_legacy_format`
+ `test_scalar_index_exec_advertises_legacy_schema` — runs
`ScalarIndexExec` end-to-end against a BTree-indexed dataset and
asserts the emitted batch, `plan.schema()`,
`plan.partition_statistics(None)`, and the stream's advertised
schema all agree on the legacy schema.
Verified:
* lance-index --lib scalar::expression::tests: 11 passed
* lance --lib io::exec::scalar_index: 4 passed
* lance --lib io::exec::filtered_read: 24 passed
* `cargo clippy -p lance -p lance-index --tests -- -D warnings`: clean
* `cargo fmt --all --check`: clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c052536 to
eb3be84
Compare
Replace the O(n) `self.lower == self.upper` mask comparison in `IndexExprResult::is_exact` and `NullableIndexExprResult::is_exact` with a private `exact: bool` field set at construction time. The flag is propagated elementwise through Not/BitAnd/BitOr and across `drop_nulls`, so it stays correct without ever doing a full mask comparison. `IndexExprResult::deserialize` sets it by examining the wire format rather than comparing masks. Add `IndexExprResult::new(lower, upper)` for constructing refined intervals (lower strictly inside upper) where `exact = false` by definition; update the two test sites that used struct literal syntax directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o handle both wire formats SelectionVectorToPrefilter::load() was accessing batch["upper"] directly, which only exists in the TwoMask wire format. ScalarIndexExec uses the ThreeVariant format when LANCE_RELATIONAL_ALGEBRA_VERSION <= 1. The mismatch caused batch["upper"].unwrap() to panic inside a tokio::spawn task, leaving the AsyncCell unset and causing wait_for_ready() to hang forever. Fix by delegating to IndexExprResult::deserialize(), which already handles both TwoMask and ThreeVariant formats and returns the result's upper mask. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
What
Replaces the three-variant enum:
with a single struct carrying both bounds:
Same change applied to
NullableIndexExprResult(the during-evaluation formthat carries SQL 3VL info via
NullableRowAddrMask).The three pre-existing shapes map onto degenerate intervals:
Exact(m){lower: m, upper: m}AtMost(m){lower: allow_nothing(), upper: m}AtLeast(m){lower: m, upper: all_rows()}Constructed via
IndexExprResult::{exact, at_most, at_least}associatedfunctions; inspected via
is_exact()/is_at_most()/is_at_least()predicates.
Why
Some indices can return three-way info per row range — "definitely
matches", "maybe matches", "definitely doesn't match" — and the old
representation couldn't express it. The new form makes that fourth shape
representable as a non-degenerate interval (nonempty
lowerstrictlyinside non-universe
upper), and the boolean algebra collapses toelementwise lattice operations on the endpoints:
The motivating case: a zone-map
IS NOT NULLsearch can now produce{lower: zones_with_zero_nulls, upper: not_all_null_zones}. The readplanner can both skip the all-null zones entirely (I/O saving on the
50%-clustered-nulls case) and skip the recheck on the
guaranteed-non-null zones (CPU saving). The old algebra had to either
drop the lower bound (resulting in an
AtMostthat loses theall-null-skip) or drop the upper bound (resulting in an
AtLeastthatloses the recheck-skip).
3VL semantics are preserved: NULL info lives inside each
NullableRowAddrMask(in itsnullsfield), andNullableRowAddrMask::Notflips
Allow↔Blockwithout touchingnulls, so!{l, u} = {!u, !l}preserves NULL on both endpoints.
Wire format
The
INDEX_EXPR_RESULT_SCHEMAcolumns change from(result: Binary, discriminant: UInt32, fragments_covered: Binary)to(lower: Binary, upper: Binary, fragments_covered: Binary). The schemais internal to the in-process hand-off from
ScalarIndexExectoFilteredReadExec, so this isn't a persisted-format break.Other touchpoints
apply_index_to_fragmentinfiltered_read.rs: switches from a3-arm
matchto a predicate chain. Existing shapes keep theirexisting read-plan semantics. A non-degenerate Refined case is
handled conservatively for now (read
upper, recheck viafull_filter); a follow-up will splitto_readintoguaranteed-match ranges (skip recheck) and recheck ranges (apply
full filter) so the new representation actually pays off.
MaterializeIndexExecinscalar_index.rsusesresult.upperfor the candidate-row set.
AtLeaststill bails (upper isuniverse); everything else works.
NullableRowAddrMask::allow_nothing()/all_rows()constructors for the
at_most/at_leastbuilders.NullableRowAddrMaskgainsPartialEqsois_exact()can checklower == upper.IndexExprResult::{from_parts, serialize_to_arrow}(inherent methodson the old enum) →
index_expr_result_from_parts(lower, upper)/serialize_index_expr_result(result, frags)free functions onlance-index::scalar::expression.expression.rsrewritten touse the new constructors and
is_at_most()/is_at_least()predicates. All assertions still hold under the new algebra.
Performance
Bench from
lance-select/benches/index_expr_result.rsat N=10M rows.Both runs at criterion
--measurement-time 3 --warm-up-time 2 --sample-size 30, same workload (single contiguous run, AllowList withno nulls), same machine, idle CPU. Median time per op:
!!&&&&&&||||||Reading the deltas:
swap. Irrelevant in absolute terms.
AtMost_AtMost/AtLeast_AtLeast: roughlyunchanged. The second mask op composes empty-with-empty or
universe-with-universe and short-circuits in
RowAddrMask::BitAnd.Exact_Exact: ~1.9-2× slower. Both endpoints carry the full mask(the
exact()constructor clones), so we genuinely do two mask opsinstead of one. Expected.
(
Exact_AtLeastfor both ops,AtMost_AtLeastfor OR): 3-4× slowerin absolute terms. The old code returned without doing a mask op;
the new code computes both endpoints and preserves both bounds.
Follow-up optimization here is to add
BlockList(empty)/AllowList(empty)fast paths toNullableRowAddrMask::{BitAnd, BitOr}, which would catch the universe/empty operands directly.Trade-off: a few extra microseconds per binary op on 10M-row masks (no
impact on query latency — millisecond-scale and dominated by I/O) in
exchange for richer interval results. The Refined case unlocks
acceleration paths that weren't reachable before.
Verification
cargo test -p lance-select --lib: 99 passedcargo test -p lance-index --lib: 311 passedcargo test -p lance --lib io::exec::filtered_read: 23 passedcargo test -p lance --lib io::exec::scalar_index: 2 passedcargo clippy --workspace --tests -- -D warnings: cleancargo fmt --all --check: cleanFollow-ups
filtered_read::apply_index_to_fragmentso the Refined case can split
to_readinto definitely-match (skiprecheck) and recheck ranges (apply full filter) within one fragment.
That's what makes the new representation actually pay off for IS NOT
NULL.
NullableRowAddrMask::{BitAnd, BitOr}. DetectAllowList(empty_set)andBlockList(empty_set)operands and bypass the per-row work. Erases the regressions on
cross-variant cases.
IsNotNullquery that produces a Refined resultdirectly, instead of going through
Not(IsNull).🤖 Generated with Claude Code