What is the use case?
zvec is an embedded vector database from Alibaba, built on their production Proxima engine. No server, no daemon; just pip install zvec and point it at a local path. It supports dense and sparse vectors, HNSW indexing with int8 quantization, filtered hybrid search, and built-in multi-vector retrieval plus reranking.
CocoIndex already has LanceDB as an embedded vector DB target; zvec fills a similar niche but adds a few capabilities on top: native sparse vector support, multi-vector retrieval with built-in reranking, and fine-grained resource governance (memory limits, CPU thread caps, mmap mode for datasets exceeding RAM).
Describe the solution you'd like
Since both zvec and LanceDB are embedded, path-based vector DBs, the LanceDB target connector is a natural implementation template. The mapping is straightforward: path replaces db_uri, collection_name replaces table_name, zvec.open(path) serves as the shared handle, and collection.optimize() mirrors table.optimize() for periodic index compaction after incremental writes.
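The optimize cadence borrowed from the LanceDB connector could look like the sketch below. The collection handle is stubbed out: zvec.open(path) and collection.optimize() are names from this proposal, not a verified API, so treat this as a shape sketch rather than working connector code.

```python
class OptimizeScheduler:
    """Call collection.optimize() after every N committed write batches.

    Mirrors the LanceDB connector's num_transactions_before_optimize
    behavior; `collection` is any object exposing an optimize() method.
    """

    def __init__(self, collection, num_transactions_before_optimize: int):
        self.collection = collection
        self.threshold = num_transactions_before_optimize
        self.pending = 0  # commits since the last optimize()

    def on_commit(self) -> None:
        self.pending += 1
        if self.pending >= self.threshold:
            self.collection.optimize()  # compact index segments
            self.pending = 0
```

The counter-based approach keeps the connector stateless across restarts at worst one optimize() behind, which is acceptable since compaction is purely a performance optimization.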
A Zvec target spec would need path, collection_name, and optionally num_transactions_before_optimize (same pattern as LanceDB) plus enable_mmap for large-dataset scenarios. For mutations, zvec supports insert, delete by ID, and delete_by_filter; there's no native upsert(), so it'd need a delete-then-insert pattern (similar to how some other connectors handle this). CocoIndex's Vector[Float32, N] maps directly to zvec's VECTOR_FP32 type; scalar metadata goes into Doc.scalars.
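Since there is no native upsert, the mutation path could emulate it with delete-then-insert. A minimal sketch, assuming (as the proposal describes but does not verify) that the collection exposes insert and delete-by-ID, and that deleting a missing ID is a no-op:

```python
def upsert(collection, doc_id, doc) -> None:
    """Emulate upsert on a store that only has insert and delete-by-ID.

    Assumes delete() on a nonexistent ID is a no-op, matching how other
    CocoIndex connectors implement the same delete-then-insert pattern.
    """
    collection.delete(doc_id)  # drop any stale version of this document
    collection.insert(doc)     # write the fresh version
```

The operation is idempotent: replaying the same (doc_id, doc) pair leaves exactly one copy, which is what CocoIndex's incremental update model needs from a target.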
This would be target/sink only. zvec has no CDC, no change streaming, and no way to enumerate all documents incrementally; so it doesn't make sense as a source connector.
CC @iaojnh, @Cuiyus, @feihongxu0824: hope you don't mind the ping; keeping you in the loop on this integration.
❤️ Contributors, please refer to 📙Contributing Guide.
Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend leaving a comment on the issue like "I'm working on it" or "Can I work on this issue?" to avoid duplicating work. Our Discord server is always open and friendly.