feat(python): add shared RaBitQ rotation for distributed IVF_RQ builds #7014

Open
gstamatakis95 wants to merge 1 commit into lance-format:main from gstamatakis95:feat/shared-rabitq-rotation-ivf-rq

@gstamatakis95 gstamatakis95 commented May 31, 2026

Closes: #7012

What

Distributed IVF_RQ builds work in the Rust engine (#6359) but could not be driven from Python because the RaBitQ rotation could not be pinned across workers. Each per-fragment build generated its own random rotation, so segments rotated vectors differently, their binary codes were not comparable, and merging corrupted the index.

This adds a way to mint one rotation, broadcast it, and reuse it in every per-fragment build, mirroring how pq_codebook is injected.
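To illustrate why per-worker rotations break merging, here is a toy sketch in plain Python. It is not the Lance implementation: the "rotation" is a random sign flip followed by an unnormalized Walsh-Hadamard transform, and the 1-bit code is just the sign of each rotated coordinate. All function names are hypothetical and exist only for this example.

```python
import random


def make_signs(dim, seed):
    """Draw a random +/-1 sign vector; stands in for one worker's rotation."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(values)
    half = 1
    while half < len(out):
        for start in range(0, len(out), 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2
    return out


def binary_code(vector, signs):
    """Toy 1-bit code: flip signs, mix with the transform, keep only the sign bit."""
    rotated = fwht([v * s for v, s in zip(vector, signs)])
    return [1 if x >= 0 else 0 for x in rotated]


if __name__ == "__main__":
    vector = [0.3, -1.2, 0.8, 2.1, -0.5, 0.9, 1.7, -0.4]

    # One rotation minted once and handed to both workers: codes are identical.
    shared = make_signs(len(vector), seed=7)
    print("shared rotation, codes match:",
          binary_code(vector, shared) == binary_code(vector, shared))

    # Each worker mints its own rotation: the same vector generally gets
    # different bits, so codes from the two segments cannot be compared.
    own_a = binary_code(vector, make_signs(len(vector), seed=1))
    own_b = binary_code(vector, make_signs(len(vector), seed=2))
    print("independent rotations, codes match:", own_a == own_b)
```

The point of the sketch is only that the code for a vector depends on the rotation, so every segment that will be merged has to be built with the same one.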

Changes

  • Add build_rq_rotation(dimension, num_bits=1, rotation_type="fast", dtype="float32") that returns one rotation as a JSON string.
  • Add an rq_rotation parameter to create_index_uncommitted, parsed into a new transient RQBuildParams.rotation field and consumed by RabitQuantizer::build.
  • After validating num_bits, code_dim, and the signs length, build() reuses the supplied rotation instead of generating a random one.
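
The validation step listed above can be pictured with a small Python sketch. This is an illustration only: the real check lives in the Rust RabitQuantizer::build, and the JSON field names used here (`rotation_type`, `num_bits`, `code_dim`, `signs`) as well as the rule that the signs length equals `code_dim` are assumptions made for the example, not the actual payload layout.

```python
import json


def validate_shared_rotation(payload, num_bits, code_dim):
    """Parse a shared rotation and check it against the local build parameters.

    The field names are hypothetical; they only mirror the three checks
    named in the PR description (num_bits, code_dim, signs length).
    """
    rotation = json.loads(payload)
    if rotation.get("rotation_type") != "fast":
        raise ValueError("only the fast rotation can be shared")
    if rotation.get("num_bits") != num_bits:
        raise ValueError(
            f"num_bits mismatch: rotation has {rotation.get('num_bits')}, "
            f"build expects {num_bits}"
        )
    if rotation.get("code_dim") != code_dim:
        raise ValueError(
            f"code_dim mismatch: rotation has {rotation.get('code_dim')}, "
            f"build expects {code_dim}"
        )
    signs = rotation.get("signs")
    # Assumption for this sketch: one sign per code dimension.
    if not isinstance(signs, list) or len(signs) != code_dim:
        raise ValueError("signs length does not match code_dim")
    return rotation


if __name__ == "__main__":
    payload = json.dumps(
        {"rotation_type": "fast", "num_bits": 1, "code_dim": 4,
         "signs": [1, -1, -1, 1]}
    )
    print(validate_shared_rotation(payload, num_bits=1, code_dim=4)["signs"])
    try:
        validate_shared_rotation(payload, num_bits=1, code_dim=8)
    except ValueError as err:
        print("rejected:", err)
```

Rejecting a mismatched rotation up front is what keeps a worker from silently producing codes that look valid but cannot be merged with the other segments.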

Notes

  • Only the fast rotation is supported because its sign vector is JSON serializable.
  • The matrix rotation keeps a dense matrix in a binary buffer that the JSON wire format drops, so it is rejected with a clear error.
  • The params proto, the segment builder, and the merge and commit paths are unchanged.
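
The difference between the two rotation types on the wire can be sketched as follows. This is a plain-Python analogy, not the actual serializer: `to_wire` and `from_wire` are hypothetical helpers, and the dictionary layout is invented for the example. It mimics a JSON format that skips binary buffers, which is why a sign vector round-trips while a dense matrix does not.

```python
import json
import struct


def to_wire(rotation):
    """Serialize to JSON, skipping binary buffers as the wire format would."""
    return json.dumps(
        {k: v for k, v in rotation.items()
         if not isinstance(v, (bytes, bytearray))}
    )


def from_wire(payload):
    """Parse a rotation, refusing types whose data cannot survive JSON."""
    rotation = json.loads(payload)
    if rotation.get("rotation_type") == "matrix":
        raise ValueError(
            "matrix rotation cannot be shared through JSON: "
            "the dense matrix buffer is not part of the payload"
        )
    return rotation


if __name__ == "__main__":
    fast = {"rotation_type": "fast", "signs": [1, -1, -1, 1]}
    # A 2x2 identity matrix stored as little-endian float32 bytes.
    matrix = {"rotation_type": "matrix", "dim": 2,
              "buffer": struct.pack("<4f", 1.0, 0.0, 0.0, 1.0)}

    print("fast round-trips:", from_wire(to_wire(fast)) == fast)
    print("matrix payload keys:", sorted(json.loads(to_wire(matrix))))
    try:
        from_wire(to_wire(matrix))
    except ValueError as err:
        print("rejected:", err)
```

Failing loudly on the matrix case is preferable to accepting a payload whose matrix has been silently lost.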

Tests

  • Rust unit tests for shared-rotation reuse, identical codes across builds, mismatch and bad-input rejection, and the matrix-via-JSON rejection.
  • A Python integration test that builds two IVF_RQ segments on separate fragments with one shared rotation, merges, commits, and queries.

@github-actions bot added the enhancement and python labels May 31, 2026

@github-actions (Contributor)

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@gstamatakis95 changed the title from "feat(python): Add support for shared RaBitQ rotation for IVF_RQ" to "feat(python): add shared RaBitQ rotation for distributed IVF_RQ builds" May 31, 2026
@gstamatakis95 force-pushed the feat/shared-rabitq-rotation-ivf-rq branch from a8438c2 to a5bd8a8 May 31, 2026 11:00
@gstamatakis95 marked this pull request as ready for review May 31, 2026 11:04

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Development

Successfully merging this pull request may close these issues.

Distributed IVF_RQ builds have no way of sharing the RaBitQ rotation across workers in Python

1 participant