Create Vector Index
Create a new vector similarity search index over the target table. Allowed tables are ‘vectors’, ‘entity’, and ‘document_collections’. The vectors table holds the chunks of text that are indexed for similarity search, whereas entity and document_collections are created during knowledge graph construction.
This endpoint creates a database index optimized for efficient similarity search over vector embeddings. It supports two main indexing methods:
- HNSW (Hierarchical Navigable Small World):
  - Best for: High-dimensional vectors requiring fast approximate nearest neighbor search
  - Pros: Very fast search, good recall, memory-resident for speed
  - Cons: Slower index construction, more memory usage
  - Key parameters:
    - m: Number of connections per layer (higher = better recall but more memory)
    - ef_construction: Build-time search width (higher = better recall but slower build)
    - ef: Query-time search width (higher = better recall but slower search)
- IVF-Flat (Inverted File with Flat Storage):
  - Best for: Balance between build speed, search speed, and recall
  - Pros: Faster index construction, less memory usage
  - Cons: Slightly slower search than HNSW
  - Key parameters:
    - lists: Number of clusters (usually sqrt(n) where n is number of vectors)
    - probe: Number of nearest clusters to search
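The sqrt(n) rule of thumb for lists can be sketched as a small helper. This is only an illustration of the heuristic stated above: the function name suggest_ivf_lists is hypothetical, and the rounding and the lower bound of 1 are assumptions rather than documented behavior of this endpoint.

```python
import math


def suggest_ivf_lists(num_vectors: int) -> int:
    """Suggest an IVF-Flat `lists` value using the sqrt(n) rule of thumb.

    Hypothetical helper: rounds sqrt(n) to the nearest integer and never
    returns less than 1, so an empty or tiny table still gets one cluster.
    """
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    return max(1, round(math.sqrt(num_vectors)))


if __name__ == "__main__":
    for n in (0, 10_000, 1_000_000):
        print(n, suggest_ivf_lists(n))
```

Treat the result as a starting point; the right value depends on the data and on how many clusters probe searches at query time.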
Supported similarity measures:
- cosine_distance: Best for comparing semantic similarity
- l2_distance: Best for comparing absolute distances
- ip_distance: Best for comparing raw dot products
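To make the three measures concrete, the following sketch computes each one for two small vectors in plain Python. The function names are illustrative and this is not the database's implementation. Some backends (pgvector, for example) report the negated inner product so that smaller values mean closer matches; whether this endpoint does the same is not stated here, so the sketch returns the raw dot product.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Raw dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: sensitive to vector magnitude."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: depends only on direction, not magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    a, b = [1.0, 0.0], [2.0, 0.0]
    # Same direction, different length: cosine distance ignores the
    # difference in magnitude, L2 distance and the dot product do not.
    print(cosine_distance(a, b), l2_distance(a, b), inner_product(a, b))
```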
Notes:
- Index creation can be resource-intensive for large datasets
- Use run_with_orchestration=True for large indices to prevent timeouts
- The ‘concurrently’ option allows other operations on the table to proceed while the index is being built
- Index names must be unique per table
Headers
Authorization: Bearer authentication of the form Bearer <token>, where token is your auth token.
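A minimal sketch of building this header, assuming the token is sent in the standard HTTP Authorization header. The helper name auth_headers is hypothetical.

```python
def auth_headers(token: str) -> dict:
    """Build request headers carrying a bearer token.

    Assumes the standard `Authorization: Bearer <token>` form.
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # "YOUR_TOKEN" is a placeholder, not a real credential.
    print(auth_headers("YOUR_TOKEN"))
```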
Request
run_with_orchestration: Whether to run index creation as an orchestrated task (recommended for large indices)
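The sketch below assembles a possible request body from the options described on this page. Only run_with_orchestration, concurrently, the table names, the measure names, and the parameter names m and ef_construction come from the text above. The surrounding field names (config, table_name, index_method, index_measure, index_arguments), the method string "hnsw", and the numeric values are assumptions made for illustration, so check them against the actual request schema before use.

```python
import json

# Values taken from this page.
ALLOWED_TABLES = {"vectors", "entity", "document_collections"}
ALLOWED_MEASURES = {"cosine_distance", "l2_distance", "ip_distance"}


def build_index_request(
    table_name: str,
    index_method: str,
    index_measure: str,
    index_arguments: dict,
    concurrently: bool = True,
    run_with_orchestration: bool = True,
) -> dict:
    """Assemble a hypothetical request body for index creation.

    The nesting and field names are assumed, not confirmed by this page.
    """
    if table_name not in ALLOWED_TABLES:
        raise ValueError(f"unsupported table: {table_name!r}")
    if index_measure not in ALLOWED_MEASURES:
        raise ValueError(f"unsupported measure: {index_measure!r}")
    return {
        "config": {
            "table_name": table_name,
            "index_method": index_method,
            "index_measure": index_measure,
            "index_arguments": dict(index_arguments),
            "concurrently": concurrently,
        },
        "run_with_orchestration": run_with_orchestration,
    }


if __name__ == "__main__":
    body = build_index_request(
        table_name="vectors",
        index_method="hnsw",  # assumed spelling of the method name
        index_measure="cosine_distance",
        index_arguments={"m": 16, "ef_construction": 64},  # illustrative values
    )
    print(json.dumps(body, indent=2))
```

Validating the table and measure names on the client side catches typos before a potentially long-running index build is submitted.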
Response
Successful Response