Support Extended vector index config for creation and query #2508

Vector indexes support additional options at creation and query time, as outlined here:

https://github.com/datastax/cassandra/blob/main/src/java/org/apache/cassandra/index/sai/VECTOR.md?plain=1#L18

We would like to add these to the API for Collections and Tables, and we need to handle their availability across the different platforms: Astra, HCD, DSE, and Cassandra.

This is a parent tracking ticket; add sub-tickets for the different pieces of work needed.

## Settings

Copied from https://github.com/datastax/cassandra/blob/main/src/java/org/apache/cassandra/index/sai/VECTOR.md?plain=1#L18

### Index Creation 

| Option                   | Default                                                              | Valid Range                               | Description                                                                                                                                                                                      |
|--------------------------|----------------------------------------------------------------------|-------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| maximum_node_connections | 16                                                                   | 1-512                                     | Controls the maximum number of connections per node in the graph. The actual graph degree will be 2x this value. Higher values increase graph quality but also increase storage and query costs. |
| construction_beam_width  | 100                                                                  | 1-3200                                    | Controls how many candidates to evaluate during graph construction. Higher values increase graph quality but also increase build time.                                                           |
| neighborhood_overflow    | 1.0 in memtable, 1.2 in compaction                                   | > 0                                       | Controls graph pruning during construction. Higher values result in denser graphs.                                                                                                               |
| alpha                    | dimension > 3 gets 1.2. Otherwise, 2.0 in memtable, 1.4 in compaction | > 0                                       | Controls how aggressively to explore the graph during search. Higher values increase recall at the cost of latency.                                                                              |
| enable_hierarchy         | false                                                                | true/false                                | When true, enables hierarchical graph construction.                                                                                                                                              |
| source_model             | `OTHER`                                                              | enum (see [below](#vector-source-models)) | Preset configurations optimized for specific vector embedding models.                                                                                                                            |
| similarity_function      | (from `source_model`)                                                | `COSINE`, `DOT_PRODUCT`, `EUCLIDEAN`      | Defines how vector similarity is computed.                                                                                                                                                       |

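As a rough illustration of how the API might check creation options against the ranges in the table above before passing them to the index, here is a minimal Python sketch. The function name `validate_creation_options` and the error strings are hypothetical and not part of any existing API; only the bounds come from the table. The `source_model` enum values are not listed in this ticket, so the sketch only checks that it is a string.

```python
# Hypothetical validator: bounds are copied from the Index Creation table,
# everything else (names, messages) is an assumption for illustration.
SIMILARITY_FUNCTIONS = {"COSINE", "DOT_PRODUCT", "EUCLIDEAN"}


def _int_in(lo, hi):
    """Build a check for integers in the inclusive range [lo, hi]."""
    def check(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool) and lo <= value <= hi
    return check


def _positive_number(value):
    """Check for a strictly positive int or float."""
    return isinstance(value, (int, float)) and not isinstance(value, bool) and value > 0


CREATION_CHECKS = {
    "maximum_node_connections": _int_in(1, 512),
    "construction_beam_width": _int_in(1, 3200),
    "neighborhood_overflow": _positive_number,
    "alpha": _positive_number,
    "enable_hierarchy": lambda v: isinstance(v, bool),
    # Enum members are not listed in this ticket, so only the type is checked.
    "source_model": lambda v: isinstance(v, str),
    "similarity_function": lambda v: isinstance(v, str) and v in SIMILARITY_FUNCTIONS,
}


def validate_creation_options(options: dict) -> list:
    """Return a list of problems; an empty list means every option passed."""
    errors = []
    for name, value in options.items():
        check = CREATION_CHECKS.get(name)
        if check is None:
            errors.append(f"unknown option: {name}")
        elif not check(value):
            errors.append(f"invalid value for {name}: {value!r}")
    return errors
```

Returning a list of problems rather than raising on the first one would let the API report every invalid option in a single response, but that is a design choice, not something this ticket prescribes.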

### Index Query

| Option      | Default                                                                                                                       | Valid Range                         | Description                                                                                                                                                                                                                                                                         |
|-------------|-------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| rerank_k    | value computed based on `LIMIT`, configured [source_model](#vector-source-models), and number of vector graphs being searched | ≤ 0 or > LIMIT (up to guardrail) | The number of candidates to collect before reranking. Values ≤ 0 disable reranking. Higher values increase recall at the cost of latency. Subject to guardrail `sai_ann_rerank_k_max_value`. |
| use_pruning | true                                                                                                                          | true/false                          | When enabled, allows the search to skip parts of the graph that are unlikely to contain good matches. Can improve latency by possibly reducing recall.                                                                                                                              |

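The `rerank_k` range depends on the query's `LIMIT` and on a guardrail, so it cannot be checked with a static bound. The following Python sketch reads the table literally (`rerank_k` must be ≤ 0 or > `LIMIT`, and no larger than the guardrail). The function name and the `rerank_k_max` parameter are hypothetical; `rerank_k_max` stands in for `sai_ann_rerank_k_max_value`, whose actual value is not stated in this ticket.

```python
def validate_query_options(options: dict, limit: int, rerank_k_max: int) -> list:
    """Return a list of problems with the query-time options (empty if none).

    limit        -- the LIMIT of the query being validated
    rerank_k_max -- assumed stand-in for the sai_ann_rerank_k_max_value guardrail
    """
    errors = []
    for name, value in options.items():
        if name == "rerank_k":
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append("rerank_k must be an integer")
            elif 0 < value <= limit:
                # Positive values must be strictly greater than LIMIT, per the table.
                errors.append(f"rerank_k must be <= 0 or > LIMIT ({limit}), got {value}")
            elif value > rerank_k_max:
                errors.append(f"rerank_k {value} exceeds guardrail {rerank_k_max}")
        elif name == "use_pruning":
            if not isinstance(value, bool):
                errors.append("use_pruning must be true or false")
        else:
            errors.append(f"unknown option: {name}")
    return errors
```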

## Approach

The general approach is to add a `vector_indexing` member (for DDL creation; written `vectorIndexing` in the JSON below) that is overloaded: it can be a string naming a preset configuration, or an object that provides the settings directly. The preset names are just macros that the API expands into the underlying index properties.

The same idea applies at query time, where the member carries query options such as `rerank_k`.

Initial JSON design:

```json
[
	{
		"createCollection": {
			"name": "config_test",
			"options": {
				"indexing": {
					"deny": [
						"content"
					]
				},
				"vector": {
					"metric": "cosine",
					"source_model": "ada",
					"dimension": 1024,
					"service": {
						"provider": "openai",
						"modelName": "text-embedding-3-small"
					},
					"vectorIndexing": "small-high-recall"
				},
				"rerank": {
					"enabled": false
				}
			}
		}
	},
	{
		"createCollection": {
			"name": "config_test",
			"options": {
				"indexing": {
					"deny": [
						"content"
					]
				},
				"vector": {
					"metric": "cosine",
					"source_model": "ada",
					"dimension": 1024,
					"service": {
						"provider": "openai",
						"modelName": "text-embedding-3-small"
					},
					"vectorIndexing": {
						"maximum_node_connections": 32
					}
				},
				"rerank": {
					"enabled": false
				}
			}
		}
	},
	{
		"createVectorIndex": {
			"name": "multi_vector_openai_my_vector_small",
			"definition": {
				"column": "my_vector_small",
				"options": {
					"metric": "cosine",
					"sourceModel": "openai-v3-small",
					"vectorIndexing": "small-high-recall"
				}
			}
		}
	},
	{
		"findMany": {
			"sort" : {
				"$vectorize" : "i like cheese"
			},
			"options": {
				"vectorIndexing" : {
					"rerank_k" : 101
				}
			}
		}
	}
]
```
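
The string-or-object overloading of `vectorIndexing` in the design above could be resolved by a small expansion step. The Python sketch below shows one possible shape for it. The function name `expand_vector_indexing` and the `PRESETS` table are hypothetical: the preset name `small-high-recall` appears in the JSON design, but the values assigned to it here are invented purely for illustration, since this ticket does not define what any preset expands to.

```python
# Hypothetical preset table. The name comes from the JSON design above;
# the values are placeholders invented for this sketch, not agreed settings.
PRESETS = {
    "small-high-recall": {
        "maximum_node_connections": 32,
        "construction_beam_width": 200,
    },
}


def expand_vector_indexing(value):
    """Turn a vectorIndexing member into a plain dict of index options.

    A string is treated as a preset name and expanded like a macro;
    an object (dict) is taken as the options themselves.
    """
    if value is None:
        # Member omitted: no overrides, the index defaults apply.
        return {}
    if isinstance(value, str):
        if value not in PRESETS:
            raise ValueError(f"unknown vectorIndexing preset: {value!r}")
        # Copy so callers cannot mutate the shared preset table.
        return dict(PRESETS[value])
    if isinstance(value, dict):
        return dict(value)
    raise TypeError("vectorIndexing must be a preset name or an object")
```

With this shape, `expand_vector_indexing("small-high-recall")` and `expand_vector_indexing({"maximum_node_connections": 32})` both yield a plain options dict, so the rest of the request handling would not need to know which form the caller used.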


