Vector Search

Vector search settings can be configured both server-side and at runtime. Runtime settings are passed as a dictionary to the search and RAG endpoints. You may refer to the search API documentation here for additional materials.

Example using the Python SDK:

1vector_search_settings = {
2 "use_vector_search": True,
3 "search_filters": {"document_type": {"$eq": "article"}},
4 "search_limit": 20,
5 "use_hybrid_search": True,
6 "selected_collection_ids": ["c3291abf-8a4e-5d9d-80fd-232ef6fd8526"]
7}
8
9response = client.search("query", vector_search_settings=vector_search_settings)

Configurable Parameters

VectorSearchSettings

  1. use_vector_search (bool): Whether to use vector search
  2. use_hybrid_search (bool): Whether to perform a hybrid search (combining vector and keyword search)
  3. filters (dict): Alias for filters
  4. search_filters (dict): Filters to apply to the vector search
  5. search_limit (int): Maximum number of results to return (1-1000)
  6. selected_collection_ids (list[UUID]): Collection Ids to search for
  7. index_measure (IndexMeasure): The distance measure to use for indexing (cosine_distance, l2_distance, or max_inner_product)
  8. include_values (bool): Whether to include search score values in the search results
  9. include_metadatas (bool): Whether to include element metadata in the search results
  10. probes (Optional[int]): Number of ivfflat index lists to query (default: 10)
  11. ef_search (Optional[int]): Size of the dynamic candidate list for HNSW index search (default: 40)
  12. hybrid_search_settings (Optional[HybridSearchSettings]): Settings for hybrid search

HybridSearchSettings

  1. full_text_weight (float): Weight to apply to full text search (default: 1.0)
  2. semantic_weight (float): Weight to apply to semantic search (default: 5.0)
  3. full_text_limit (int): Maximum number of results to return from full text search (default: 200)
  4. rrf_k (int): K-value for RRF (Rank Reciprocal Fusion) (default: 50)

Advanced Filtering

R2R supports complex filtering using PostgreSQL-based queries. Allowed operators include:

  • eq, neq: Equality and inequality
  • gt, gte, lt, lte: Greater than, greater than or equal, less than, less than or equal
  • like, ilike: Pattern matching (case-sensitive and case-insensitive)
  • in, nin: Inclusion and exclusion in a list of values

Example of advanced filtering:

1filters = {
2 "$and": [
3 {"publication_date": {"$gte": "2023-01-01"}},
4 {"author": {"$in": ["John Doe", "Jane Smith"]}},
5 {"category": {"$ilike": "%technology%"}}
6 ]
7}
8vector_search_settings["filters"] = filters