Hybrid Search
Introduction
R2R’s hybrid search blends keyword-based full-text search with semantic vector search, delivering results that are both contextually relevant and precise. By unifying these approaches, hybrid search excels at handling complex queries where both exact terms and overall meaning matter.
How R2R Hybrid Search Works
Full-Text Search
Uses Postgres’s websearch_to_tsquery to parse your query and ts_rank_cd to rank the documents that contain your keywords.
Semantic Search
Uses vector embeddings to locate documents contextually related to your query, even if they don’t share exact keywords.
Key Features
Full-Text Search
- Uses Postgres indexing and querying for quick, exact term matches.
- Great for retrieving documents where specific terminology is critical.
Understanding Search Modes
R2R supports three search modes that either preconfigure search for you or let you customize it:
- basic: Primarily semantic search. Suitable for straightforward scenarios where semantic understanding is key and you don’t need the additional context of keyword matching.
- advanced: Combines semantic and full-text search by default, effectively enabling hybrid search with well-tuned default parameters. Ideal if you want the benefits of hybrid search without manual configuration.
- custom: Gives you full control over the search settings, including toggling semantic and full-text search independently. Choose this if you want to fine-tune weights, limits, and other search behaviors.
When using advanced mode, R2R automatically configures hybrid search for you. For custom mode, you can directly set use_hybrid_search=True or enable both use_semantic_search and use_fulltext_search to achieve a hybrid search setup.
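The options above can be sketched as plain request payloads. This is a minimal illustration, not R2R's client API: the names search_mode, search_settings, use_hybrid_search, use_semantic_search, and use_fulltext_search come from this page, while the "query" key, the example query text, and the is_hybrid helper are assumptions made for the example.

```python
# Illustrative payloads only; the "query" key and this dict layout are
# assumptions, not a confirmed R2R request schema.
QUERY = "quarterly revenue growth"  # hypothetical query text

# advanced mode: hybrid search is configured automatically.
advanced_request = {"query": QUERY, "search_mode": "advanced"}

# custom mode, option 1: turn hybrid search on directly.
custom_direct = {
    "query": QUERY,
    "search_mode": "custom",
    "search_settings": {"use_hybrid_search": True},
}

# custom mode, option 2: enable both underlying searches.
custom_both = {
    "query": QUERY,
    "search_mode": "custom",
    "search_settings": {
        "use_semantic_search": True,
        "use_fulltext_search": True,
    },
}


def is_hybrid(request):
    """Hypothetical helper: does this payload describe a hybrid setup?"""
    if request.get("search_mode") == "advanced":
        return True
    settings = request.get("search_settings", {})
    both_enabled = bool(
        settings.get("use_semantic_search")
        and settings.get("use_fulltext_search")
    )
    return bool(settings.get("use_hybrid_search")) or both_enabled


for name, req in [
    ("advanced", advanced_request),
    ("custom (direct)", custom_direct),
    ("custom (both flags)", custom_both),
]:
    print(name, "->", is_hybrid(req))
```

How these values are actually passed to R2R depends on the client or endpoint you use; the Search API documentation is the authority on the exact call signature.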
Configuration
Choosing a Search Mode:
- basic: Semantic-only.
- advanced: Hybrid by default.
- custom: Manually configure hybrid search.
For more details on runtime configuration and combining search_mode with custom search_settings, refer to the Search API documentation.
Best Practices
- Optimize Database and Embeddings: Ensure Postgres indexing and vector store configurations are optimal for performance.
- Adjust Weights and Limits: Tweak full_text_weight, semantic_weight, and rrf_k values when using custom mode. If you’re using advanced mode, the defaults are already tuned for general use cases.
- Regular Updates: Keep embeddings and indexes up-to-date to maintain search quality.
- Choose Appropriate Embeddings: Select an embedding model that fits your content domain for the best semantic results.
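To build intuition for what full_text_weight, semantic_weight, and rrf_k influence, here is a minimal sketch of weighted reciprocal rank fusion (RRF), the technique the rrf_k name suggests. It assumes each search returns a ranked list of document IDs; the function name, the default values, and the exact scoring formula are illustrative assumptions, not R2R's confirmed implementation.

```python
from collections import defaultdict


def rrf_fuse(full_text_ids, semantic_ids,
             full_text_weight=1.0, semantic_weight=5.0, rrf_k=50):
    """Merge two ranked ID lists with weighted reciprocal rank fusion.

    Each list contributes weight / (rrf_k + rank) per document, where
    rank starts at 1. Defaults here are illustrative assumptions.
    """
    scores = defaultdict(float)
    for rank, doc_id in enumerate(full_text_ids, start=1):
        scores[doc_id] += full_text_weight / (rrf_k + rank)
    for rank, doc_id in enumerate(semantic_ids, start=1):
        scores[doc_id] += semantic_weight / (rrf_k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ranked results from each search.
full_text_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse(full_text_hits, semantic_hits)
print(fused)

# With equal weights, the full-text ranking has more influence.
balanced = rrf_fuse(full_text_hits, semantic_hits, semantic_weight=1.0)
print(balanced)
```

In this sketch, a larger rrf_k flattens the difference between top and lower ranks, while the two weights shift how much each result list contributes to the final order.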
Conclusion
R2R’s hybrid search delivers robust, context-aware retrieval by merging semantic and keyword-driven approaches. Whether you pick basic mode for simplicity, advanced mode for out-of-the-box hybrid search, or custom mode for granular control, R2R ensures you can tailor the search experience to your unique needs.