Support Extended vector index config for creation and query #2508

Description

@amorton

Vector indexes support additional features at creation and query time, as outlined here:

https://github.com/datastax/cassandra/blob/main/src/java/org/apache/cassandra/index/sai/VECTOR.md?plain=1#L18

We would like to add these to the API for Collections and Tables, and we need to handle their availability across the different platforms: Astra, HCD, DSE, and Cassandra.

This is a parent tracking ticket; add sub-tickets for the different pieces of work needed.

Settings

Copied from https://github.com/datastax/cassandra/blob/main/src/java/org/apache/cassandra/index/sai/VECTOR.md?plain=1#L18

Index Creation

| Option | Default | Valid Range | Description |
| --- | --- | --- | --- |
| maximum_node_connections | 16 | 1-512 | Controls the maximum number of connections per node in the graph. The actual graph degree will be 2x this value. Higher values increase graph quality but also increase storage and query costs. |
| construction_beam_width | 100 | 1-3200 | Controls how many candidates to evaluate during graph construction. Higher values increase graph quality but also increase build time. |
| neighborhood_overflow | 1.0 in memtable, 1.2 in compaction | > 0 | Controls graph pruning during construction. Higher values result in denser graphs. |
| alpha | 1.2 if dimension > 3; otherwise 2.0 in memtable, 1.4 in compaction | > 0 | Controls how aggressively to explore the graph during search. Higher values increase recall at the cost of latency. |
| enable_hierarchy | false | true/false | When true, enables hierarchical graph construction. |
| source_model | OTHER | enum (see below) | Preset configurations optimized for specific vector embedding models. |
| similarity_function | (from source_model) | COSINE, DOT_PRODUCT, EUCLIDEAN | Defines how vector similarity is computed. |
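
The valid ranges in the table above could be enforced by the API before the options reach the database. The following is a minimal sketch of that check, not the API's actual validation code: the function and constant names are hypothetical, the ranges are copied from the table, and accepting similarity_function case-insensitively is an assumption. source_model is passed through unchecked because its enum values are not listed in this issue.

```python
# Ranges copied from the Index Creation table; all names here are hypothetical.
INT_RANGES = {
    "maximum_node_connections": (1, 512),
    "construction_beam_width": (1, 3200),
}
POSITIVE_NUMBERS = ("neighborhood_overflow", "alpha")
SIMILARITY_FUNCTIONS = ("COSINE", "DOT_PRODUCT", "EUCLIDEAN")


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_creation_options(options):
    """Return a list of problems; an empty list means the options look valid."""
    errors = []
    for name, value in options.items():
        if name in INT_RANGES:
            low, high = INT_RANGES[name]
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not (is_int and low <= value <= high):
                errors.append(f"{name} must be an integer in {low}-{high}")
        elif name in POSITIVE_NUMBERS:
            if not (_is_number(value) and value > 0):
                errors.append(f"{name} must be a number > 0")
        elif name == "enable_hierarchy":
            if not isinstance(value, bool):
                errors.append("enable_hierarchy must be true or false")
        elif name == "similarity_function":
            # Assumption: accept any letter case, since the JSON design uses "cosine".
            if not (isinstance(value, str) and value.upper() in SIMILARITY_FUNCTIONS):
                errors.append("similarity_function must be one of "
                              + ", ".join(SIMILARITY_FUNCTIONS))
        elif name != "source_model":
            errors.append(f"unknown option: {name}")
    return errors


if __name__ == "__main__":
    print(validate_creation_options({"maximum_node_connections": 600}))
```

Returning a list rather than raising on the first problem lets the API report every invalid option in a single error response.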

Index Query

| Option | Default | Valid Range | Description |
| --- | --- | --- | --- |
| rerank_k | value computed based on LIMIT, configured source_model, and number of vector graphs being searched | ≤ 0 or > LIMIT (up to guardrail) | The number of candidates to collect before reranking. Values ≤ 0 disable reranking. Higher values increase recall at the cost of latency. Subject to guardrail sai_ann_rerank_k_max_value. |
| use_pruning | true | true/false | When enabled, allows the search to skip parts of the graph that are unlikely to contain good matches. Can improve latency by possibly reducing recall. |
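
The rerank_k rule has three cases (disabled, too small, above the guardrail), which may be easier to see in code. This is a sketch under assumptions: the function name is hypothetical, and treating a guardrail of None as "no limit configured" is a guess about how sai_ann_rerank_k_max_value would be surfaced to the API.

```python
from typing import Optional


def check_rerank_k(rerank_k: int, limit: int,
                   guardrail_max: Optional[int] = None) -> Optional[str]:
    """Return an error message, or None if rerank_k is acceptable.

    guardrail_max stands in for sai_ann_rerank_k_max_value; None means
    no guardrail is configured (an assumption made for this sketch).
    """
    if rerank_k <= 0:
        # Values <= 0 disable reranking, so they are always allowed.
        return None
    if rerank_k <= limit:
        return f"rerank_k must be <= 0 or > LIMIT ({limit})"
    if guardrail_max is not None and rerank_k > guardrail_max:
        return f"rerank_k exceeds the guardrail maximum ({guardrail_max})"
    return None


if __name__ == "__main__":
    print(check_rerank_k(101, limit=100))  # acceptable, prints None
    print(check_rerank_k(50, limit=100))   # too small, prints an error message
```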

Approach

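The general approach is to add a vector_indexing member to the DDL creation commands (spelled vectorIndexing in the JSON design below). The member is overloaded: it is either a string naming a pre-set configuration, or an object that provides the settings directly. The pre-set values are just macros that the API expands into the index properties.

At query time the same idea applies: a vectorIndexing member in the command options carries the query settings, such as rerank_k.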
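
The string-or-object overload can be sketched as a small expansion step. The preset name "small-high-recall" is taken from the JSON design in this issue, but the values it expands to here are placeholders invented for illustration, and the function name is hypothetical.

```python
from typing import Optional, Union

# Hypothetical preset table. The name appears in this issue's JSON design;
# the option values are placeholders, not agreed settings.
PRESETS = {
    "small-high-recall": {
        "maximum_node_connections": 32,
        "construction_beam_width": 200,
    },
}


def expand_vector_indexing(value: Union[str, dict, None]) -> dict:
    """Turn the overloaded member into a plain dict of index properties."""
    if value is None:
        # Member omitted: no overrides, the database defaults apply.
        return {}
    if isinstance(value, str):
        if value not in PRESETS:
            raise ValueError(f"unknown vectorIndexing preset: {value!r}")
        # Copy so callers cannot mutate the shared preset table.
        return dict(PRESETS[value])
    if isinstance(value, dict):
        return dict(value)
    raise TypeError("vectorIndexing must be a preset name or an object")


if __name__ == "__main__":
    print(expand_vector_indexing("small-high-recall"))
    print(expand_vector_indexing({"maximum_node_connections": 32}))
```

Because both forms collapse to the same dict, everything downstream (validation, index creation) only needs to handle the object form.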

Initial JSON design

[
	{
		"createCollection": {
			"name": "config_test",
			"options": {
				"indexing": {
					"deny": [
						"content"
					]
				},
				"vector": {
					"metric": "cosine",
					"source_model": "ada",
					"dimension": 1024,
					"service": {
						"provider": "openai",
						"modelName": "text-embedding-3-small"
					},
					"vectorIndexing": "small-high-recall"
				},
				"rerank": {
					"enabled": false
				}
			}
		}
	},
	{
		"createCollection": {
			"name": "config_test",
			"options": {
				"indexing": {
					"deny": [
						"content"
					]
				},
				"vector": {
					"metric": "cosine",
					"source_model": "ada",
					"dimension": 1024,
					"service": {
						"provider": "openai",
						"modelName": "text-embedding-3-small"
					},
					"vectorIndexing": {
						"maximum_node_connections": 32
					}
				},
				"rerank": {
					"enabled": false
				}
			}
		}
	},
	{
		"createVectorIndex": {
			"name": "multi_vector_openai_my_vector_small",
			"definition": {
				"column": "my_vector_small",
				"options": {
					"metric": "cosine",
					"sourceModel": "openai-v3-small",
					"vectorIndexing": "small-high-recall"
				}
			}
		}
	},
	{
		"findMany": {
			"sort" : {
				"$vectorize" : "i like cheese"
			},
			"options": {
				"vectorIndexing" : {
					"rerank_k" : 101
				}
			}
		}
	}
]
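
For the createVectorIndex case, the expanded properties would eventually be passed to the database as index options. The sketch below shows one way that rendering might look, assuming the standard CQL `CREATE CUSTOM INDEX ... USING 'StorageAttachedIndex' WITH OPTIONS = {...}` form with string-valued options. The function name, the keyspace and table names, and the identifier check are hypothetical, and this is not the API's actual statement builder.

```python
import re

# Conservative identifier check for this sketch only (hypothetical rule).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _option_text(value) -> str:
    # Index options are rendered as strings; booleans become true/false.
    if isinstance(value, bool):
        return "true" if value else "false"
    # Double any single quote so the value stays inside its CQL string literal.
    return str(value).replace("'", "''")


def render_create_index(keyspace, table, column, index_name, options):
    """Render a CREATE CUSTOM INDEX statement for a vector column."""
    for ident in (keyspace, table, column, index_name):
        if not _IDENTIFIER.match(ident):
            raise ValueError(f"unsupported identifier: {ident!r}")
    cql = (f"CREATE CUSTOM INDEX IF NOT EXISTS {index_name} "
           f"ON {keyspace}.{table} ({column}) USING 'StorageAttachedIndex'")
    if options:
        # Sort keys so the generated statement is deterministic.
        rendered = ", ".join(f"'{name}': '{_option_text(options[name])}'"
                             for name in sorted(options))
        cql += f" WITH OPTIONS = {{{rendered}}}"
    return cql


if __name__ == "__main__":
    print(render_create_index(
        "ks", "tbl", "my_vector_small",
        "multi_vector_openai_my_vector_small",
        {"maximum_node_connections": 32, "similarity_function": "COSINE"},
    ))
```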
