Support FlatIndex with Vector Quantization

Are there any plans to support FlatIndex (or even HNSW) with quantized vectors (int8, for example)?

The distance algorithms are largely the same. FlatIndex could use generics to detect the vector type, or take the quantization option explicitly.
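To illustrate the idea, here is a rough sketch of a flat index that is generic over the vector representation and takes the scoring function explicitly. `FlatIndexSketch`, `add`, `search` and `dot` are hypothetical names made up for this example, not the library's actual API.

```python
from typing import Callable, Generic, List, Sequence, Tuple, TypeVar

# Vector representation: list of floats, list of int8 codes, packed bits, ...
V = TypeVar("V")


class FlatIndexSketch(Generic[V]):
    """Hypothetical exhaustive index; NOT the library's real FlatIndex."""

    def __init__(self, score: Callable[[V, V], float]) -> None:
        self._score = score  # higher score means more similar
        self._items: List[Tuple[int, V]] = []

    def add(self, item_id: int, vector: V) -> None:
        self._items.append((item_id, vector))

    def search(self, query: V, k: int) -> List[Tuple[int, float]]:
        # Exhaustive scan: score every stored vector, keep the top k.
        scored = [(item_id, self._score(query, vec)) for item_id, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    # Works unchanged for float32 values and for int8 codes.
    return sum(x * y for x, y in zip(a, b))


# Same index code, here fed int8-range vectors instead of floats.
index: FlatIndexSketch[List[int]] = FlatIndexSketch(dot)
index.add(1, [127, 0])
index.add(2, [0, 127])
index.add(3, [-127, 0])
top = index.search([100, 10], k=2)
```

The scan itself never looks inside the vectors, so only the scoring function would need a per-type (and SIMD-specialized) implementation.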

Vector quantization is great for speed and memory savings, but sometimes you just need the best recall possible.

Automatic quantization on add would also be possible, if allowed. But mainly, given how fast a SIMD-accelerated int8 dot product gets, just being able to use FlatIndex over int8 vectors would be great for a LOT of use cases. Int8 vectors retain almost all of the semantic coherence of float32 while providing 4x memory savings, and I bet the speed gains on FlatIndex and HNSW would be substantial (especially if AVX512 is available for SIMD).
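For reference, a minimal sketch of what quantizing on add could look like, assuming symmetric per-vector scaling (one of several possible int8 schemes, and not necessarily the one the library would pick). `quantize_int8`, `dot_int8` and `approx_dot` are hypothetical helper names. Each component goes from 4 bytes (float32) to 1 byte (int8), which is where the 4x memory saving comes from.

```python
from typing import List, Sequence, Tuple


def quantize_int8(vector: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integer codes in [-127, 127] plus one scale per vector."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dot_int8(a: Sequence[int], b: Sequence[int]) -> int:
    # Pure integer arithmetic; this is the loop a SIMD kernel would replace.
    return sum(x * y for x, y in zip(a, b))


def approx_dot(a_codes: Sequence[int], a_scale: float,
               b_codes: Sequence[int], b_scale: float) -> float:
    # Rescale the integer result back to the original float range.
    return dot_int8(a_codes, b_codes) * a_scale * b_scale


a = [0.5, -1.0, 0.25]
b = [1.0, 0.5, -0.5]
a_codes, a_scale = quantize_int8(a)
b_codes, b_scale = quantize_int8(b)
exact = sum(x * y for x, y in zip(a, b))
approx = approx_dot(a_codes, a_scale, b_codes, b_scale)
```

For ranking within a single query the rescaling by the query's scale can be skipped, since it multiplies every score by the same positive constant.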

I have taken it way further in tests: binary quantization on high-dimensional vectors (2560 to 4096 dimensions, for example, using Qwen3 embed) has a 0.1 recall difference vs float32, with 32x space savings and 60-120x speed increases vs DotProduct/CosineSimilarity (HammingDistance over uint64). That alone makes FlatIndex a contender vs other algorithms while keeping perfect recall.
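A minimal sketch of the binary path, assuming sign-based binarization (bit set when the component is positive), which is a common choice but only an assumption here. `binarize` and `hamming` are hypothetical names. One bit per 32-bit float gives the 32x space saving, and a 2560-dimension vector packs into 40 words of 64 bits.

```python
from typing import List, Sequence

WORD_BITS = 64  # pack bits into uint64-sized words


def binarize(vector: Sequence[float]) -> List[int]:
    """Pack sign bits into 64-bit words: bit is 1 where the component is > 0."""
    words = []
    for start in range(0, len(vector), WORD_BITS):
        word = 0
        for offset, x in enumerate(vector[start:start + WORD_BITS]):
            if x > 0:
                word |= 1 << offset
        words.append(word)
    return words


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    # XOR marks the differing bits; counting them gives the distance.
    # A native implementation would use a hardware popcount per word.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


v1 = binarize([1.0] * 64 + [-1.0] * 64)
v2 = binarize([1.0] * 128)
distance = hamming(v1, v2)  # the second 64 components differ in sign
```

Lower Hamming distance means more similar, so a flat scan would sort ascending here instead of descending as with dot product.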


