Hybrid Search
Learn how to implement and use hybrid search with R2R
Introduction
R2R’s hybrid search blends keyword-based full-text search with semantic vector search, delivering results that are both contextually relevant and precise. By unifying these approaches, hybrid search excels at handling complex queries where both exact terms and overall meaning matter.
Understanding Search Modes
R2R supports multiple search modes that can simplify or customize the configuration for you:
- basic: Primarily semantic search. Suitable for straightforward scenarios where semantic understanding is key, but you don’t need the additional context of keyword matching.
- advanced: Combines semantic and full-text search by default, effectively enabling hybrid search with well-tuned default parameters. Ideal if you want the benefits of hybrid search without manual configuration.
- custom: Allows you full control over the search settings, including toggling semantic and full-text search independently. Choose this if you want to fine-tune weights, limits, and other search behaviors.
When using advanced mode, R2R automatically configures hybrid search for you. For custom mode, you can directly set use_hybrid_search=True or enable both use_semantic_search and use_fulltext_search to achieve a hybrid search setup.
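As a rough illustration of the three modes, the request payloads might look like the sketch below. The names search_mode, search_settings, use_hybrid_search, use_semantic_search, and use_fulltext_search come from the text above; the helper build_search_request and the exact payload shape are assumptions made for illustration, so check the Search API documentation before relying on them.

```python
from typing import Any, Optional


def build_search_request(
    query: str,
    search_mode: str,
    search_settings: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a search payload (hypothetical helper, not part of R2R)."""
    if search_mode not in {"basic", "advanced", "custom"}:
        raise ValueError(f"unknown search mode: {search_mode!r}")
    request: dict[str, Any] = {"query": query, "search_mode": search_mode}
    if search_settings:
        # Copy so later changes to the caller's dict do not leak into the request.
        request["search_settings"] = dict(search_settings)
    return request


# advanced mode: hybrid search with the built-in defaults, no settings needed.
advanced_request = build_search_request("What is RRF?", search_mode="advanced")

# custom mode, option 1: a single flag turns hybrid search on.
custom_flag = build_search_request(
    "What is RRF?",
    search_mode="custom",
    search_settings={"use_hybrid_search": True},
)

# custom mode, option 2: enable both underlying searches explicitly.
custom_both = build_search_request(
    "What is RRF?",
    search_mode="custom",
    search_settings={"use_semantic_search": True, "use_fulltext_search": True},
)
```

Per the description above, the two custom payloads are alternative ways of reaching the same hybrid setup.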
How R2R Hybrid Search Works
- Full-Text Search: Leverages Postgres’s ts_rank_cd and websearch_to_tsquery to find documents containing your keywords.
- Semantic Search: Uses vector embeddings to locate documents contextually related to your query, even if they don’t share exact keywords.
- Reciprocal Rank Fusion (RRF): Merges results from both full-text and semantic searches using a rank-based scoring formula. This ensures that documents relevant both semantically and by keyword ranking float to the top.
- Result Ranking: Orders the final set of results based on the combined RRF score, providing balanced, meaningful search outcomes.
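The fusion and ranking steps can be sketched as follows. This is a minimal sketch of weighted reciprocal rank fusion as it is commonly defined: each result list contributes weight / (rrf_k + rank) for a document, and a document missing from a list contributes nothing. The parameter names mirror the full_text_weight, semantic_weight, and rrf_k settings mentioned in this guide; the function name, the default values, and the document IDs are illustrative assumptions, not R2R's actual implementation.

```python
def rrf_fuse(
    full_text_ids: list[str],
    semantic_ids: list[str],
    full_text_weight: float = 1.0,
    semantic_weight: float = 1.0,
    rrf_k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse two ranked ID lists (best first) into one list sorted by RRF score."""
    scores: dict[str, float] = {}
    weighted_lists = (
        (full_text_weight, full_text_ids),
        (semantic_weight, semantic_ids),
    )
    for weight, ranked_ids in weighted_lists:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rrf_k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists, best match first.
full_text_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse(full_text_hits, semantic_hits)
fused_order = [doc_id for doc_id, _ in fused]
```

In this toy input, doc_b and doc_a appear in both lists, so they outrank doc_c and doc_d, which each appear in only one.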
Key Features
Full-Text Search
- Uses Postgres indexing and querying for quick, exact term matches.
- Great for retrieving documents where specific terminology is critical.
Semantic Search
- Embeds queries and documents into vector representations.
- Finds documents related to the query’s meaning, not just its wording.
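To make ranking by meaning concrete, here is a toy sketch of vector search using cosine similarity. The three-dimensional vectors are hand-made stand-ins for real embeddings, and cosine similarity is assumed as the similarity measure; neither is taken from R2R itself.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings"; a real model would produce hundreds of dimensions.
documents = {
    "car maintenance guide": [0.9, 0.1, 0.0],
    "automobile repair tips": [0.8, 0.2, 0.1],
    "sourdough bread recipe": [0.0, 0.1, 0.9],
}
# Stand-in for the embedding of a query such as "fixing my vehicle".
query_vector = [0.85, 0.15, 0.05]

# Most similar document first.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)
```

Both vehicle-related documents rank above the bread recipe because their vectors point in nearly the same direction as the query vector, regardless of which words they contain.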
Hybrid Integration
- By enabling both use_fulltext_search and use_semantic_search, or choosing the advanced mode, you get the best of both worlds.
- RRF blends these results, ensuring that documents align with the query’s intent and exact terms where needed.
Configuration
Choosing a Search Mode:
- basic: Semantic-only.
- advanced: Hybrid by default.
- custom: Manually configure hybrid search.
For more details on runtime configuration and combining search_mode with custom search_settings, refer to the Search API documentation.
Best Practices
- Optimize Database and Embeddings: Ensure Postgres indexing and vector store configurations are optimal for performance.
- Adjust Weights and Limits: Tweak full_text_weight, semantic_weight, and rrf_k values when using custom mode. If you’re using advanced mode, the defaults are already tuned for general use cases.
- Regular Updates: Keep embeddings and indexes up to date to maintain search quality.
- Choose Appropriate Embeddings: Select an embedding model that fits your content domain for the best semantic results.
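When tuning rrf_k, it can help to see how it changes the gap between ranks. The sketch below assumes the usual reciprocal-rank term weight / (rrf_k + rank); the helper names and the sample values are illustrative and are not R2R defaults.

```python
def rank_contribution(rank: int, rrf_k: int, weight: float = 1.0) -> float:
    """Score that one result list contributes for a document at a 1-based rank."""
    return weight / (rrf_k + rank)


def top_vs_tenth(rrf_k: int) -> float:
    """How many times more a rank-1 hit contributes than a rank-10 hit."""
    return rank_contribution(1, rrf_k) / rank_contribution(10, rrf_k)


small_k_ratio = top_vs_tenth(rrf_k=1)   # (1 + 10) / (1 + 1) = 5.5
large_k_ratio = top_vs_tenth(rrf_k=60)  # (60 + 10) / (60 + 1), roughly 1.15
```

Under this assumption, a small rrf_k lets top-ranked hits dominate, while a large one flattens the curve so that appearing in both result lists matters more than the exact position in either. The weights scale a whole list's contribution up or down.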
Conclusion
R2R’s hybrid search delivers robust, context-aware retrieval by merging semantic and keyword-driven approaches. Whether you pick basic mode for simplicity, advanced mode for out-of-the-box hybrid search, or custom mode for granular control, R2R ensures you can tailor the search experience to your unique needs.