Search and RAG
Search and retrieve information using vectors, text, and RAG
R2R provides search and retrieval through vector search, full-text search, hybrid search, and Retrieval-Augmented Generation (RAG). It supports multiple search modes and extensive runtime configuration, so you can both find information and put it in context.
Refer to the retrieval API and SDK reference for detailed retrieval examples.
AI-Powered Search
R2R's search is configurable at runtime and includes vector search, hybrid search, and knowledge graph-enhanced search. Combining these modes allows for more accurate and contextually relevant information retrieval.
Vector Search
Vector search parameters in R2R can be tuned at runtime for better results. The key configurable parameters for vector search are documented in the retrieval API reference.
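To make the idea of vector search concrete, here is a minimal, library-free sketch of ranking chunks by cosine similarity to a query embedding. It illustrates the general technique only and is not R2R's implementation; the function names, the `limit` parameter, and the toy three-dimensional embeddings are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(query_vec, chunks, limit=3):
    """Return the `limit` chunks most similar to the query, best first."""
    scored = [
        {"text": c["text"], "score": cosine_similarity(query_vec, c["embedding"])}
        for c in chunks
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:limit]


# Hypothetical toy data; real embeddings have hundreds of dimensions.
CHUNKS = [
    {"text": "Aristotle wrote about logic.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Bread needs yeast to rise.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Plato founded the Academy.", "embedding": [0.8, 0.3, 0.1]},
]

results = vector_search([1.0, 0.0, 0.0], CHUNKS, limit=2)
for r in results:
    print(f"{r['score']:.3f}  {r['text']}")
```

In a real deployment the embeddings come from an embedding model and the ranking is done by the vector store, so the only part you would normally control is the query and the runtime settings.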
Hybrid Search
R2R supports hybrid search, which combines traditional keyword-based (full-text) search with vector search, so that results can match on both exact terms and semantic similarity.
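One common way to merge a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below shows that technique under stated assumptions: this page does not say which fusion method R2R uses, and the function name, the `weights` argument, and the constant `k=60` are illustrative choices rather than R2R settings.

```python
def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id earns weight / (k + rank) from every list it appears in,
    where rank starts at 1. Ties are broken alphabetically by id.
    """
    weights = weights or [1.0] * len(rankings)
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical chunk ids returned by the two search modes.
semantic = ["c2", "c1", "c4"]
full_text = ["c3", "c2", "c1"]

fused = reciprocal_rank_fusion([semantic, full_text])
print(fused)
```

Chunks that appear in both lists rise to the top even if neither list ranks them first, which is the main appeal of rank-based fusion: it needs no score normalization across the two search modes.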
Advanced Filtering
R2R lets you apply filters to search queries to narrow the results. Filters can target specific documents, date ranges, or any metadata field.
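The sketch below shows how a nested filter expression can be evaluated against chunk metadata. It is an illustration of the filtering idea, not R2R's filter engine: the operator names (`$eq`, `$gte`, `$and`, and so on) follow a MongoDB-like convention that is assumed here, and the metadata fields are hypothetical. Check the retrieval API reference for the operators R2R actually accepts.

```python
# Assumed operator set for this sketch only.
OPERATORS = {
    "$eq": lambda value, target: value == target,
    "$ne": lambda value, target: value != target,
    "$gte": lambda value, target: value is not None and value >= target,
    "$lte": lambda value, target: value is not None and value <= target,
    "$in": lambda value, target: value in target,
}


def matches(record, filters):
    """Return True if `record` satisfies every condition in `filters`."""
    for field, condition in filters.items():
        if field == "$and":
            if not all(matches(record, sub) for sub in condition):
                return False
        elif field == "$or":
            if not any(matches(record, sub) for sub in condition):
                return False
        else:
            value = record.get(field)
            for op, target in condition.items():
                if not OPERATORS[op](value, target):
                    return False
    return True


# Hypothetical chunk metadata.
chunks = [
    {"document_type": "pdf", "created_at": "2024-03-01", "author": "kim"},
    {"document_type": "txt", "created_at": "2023-11-15", "author": "lee"},
    {"document_type": "pdf", "created_at": "2022-06-30", "author": "kim"},
]

filters = {
    "$and": [
        {"document_type": {"$eq": "pdf"}},
        # ISO 8601 dates compare correctly as plain strings.
        {"created_at": {"$gte": "2024-01-01"}},
    ]
}

selected = [c for c in chunks if matches(c, filters)]
print(selected)
```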
AI Retrieval (RAG)
R2R is built around a comprehensive Retrieval-Augmented Generation (RAG) engine, allowing you to generate contextually relevant responses based on your ingested documents. The RAG process combines all the search functionality shown above with Large Language Models to produce more accurate and informative answers.
Basic RAG
A basic RAG request takes a query, searches your ingested documents, and passes the search results to the LLM as context for generating the response.
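The retrieve-then-generate flow can be sketched in a few lines. This is a conceptual illustration with a toy word-overlap retriever and a stub in place of a real model; none of the function names below come from R2R, and the prompt wording is an assumption.

```python
import re


def tokenize(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, corpus, limit=2):
    """Toy retriever: rank chunks by the number of words shared with the query."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(text)), text) for text in corpus]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in scored[:limit]]


def build_prompt(query, chunks):
    """Number the retrieved chunks so the model can cite them."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def rag(query, corpus, llm):
    """Retrieve context, then ask the language model to answer from it."""
    chunks = retrieve(query, corpus)
    return {"answer": llm(build_prompt(query, chunks)), "search_results": chunks}


seen_prompts = []


def stub_llm(prompt):
    """Stand-in for a real model call; records the prompt it was given."""
    seen_prompts.append(prompt)
    return "stub answer"


CORPUS = [
    "Aristotle was a student of Plato.",
    "Yeast makes bread rise.",
    "Plato founded the Academy in Athens.",
]

response = rag("Who was a student of Plato?", CORPUS, stub_llm)
print(seen_prompts[0])
```

Returning the search results alongside the answer keeps the response verifiable, since a caller can check which chunks the answer was grounded in.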
RAG with Web Search Integration
R2R supports web search integration, which lets you enhance RAG responses with up-to-date information from the web. To include web search in a RAG query, add the include_web_search flag. When include_web_search is set to true, the system performs a web search and includes relevant results in the context provided to the LLM, so the response can draw on the most current information available online.
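Conceptually, the flag decides whether web results are added to the context next to document results. The following sketch shows that behavior with stub search functions; only the name include_web_search comes from this page, while `gather_context`, the stub functions, and the result shape are assumptions for illustration.

```python
def gather_context(query, search_documents, search_web, include_web_search=False):
    """Collect context for the LLM, tagging each item with where it came from."""
    results = [{"source_type": "chunk", "text": t} for t in search_documents(query)]
    if include_web_search:
        results += [{"source_type": "web", "text": t} for t in search_web(query)]
    return results


def stub_document_search(query):
    """Stand-in for a search over ingested documents."""
    return ["Internal design note about retrieval."]


def stub_web_search(query):
    """Stand-in for a call to a web search provider."""
    return ["Recent blog post about retrieval."]


local_only = gather_context("retrieval", stub_document_search, stub_web_search)
with_web = gather_context(
    "retrieval", stub_document_search, stub_web_search, include_web_search=True
)
print([r["source_type"] for r in with_web])
```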
RAG with Hybrid Search
R2R also supports hybrid search in RAG, combining vector search with keyword-based search. To use it, add the use_hybrid_search flag to your search settings input. Hybrid search can improve the RAG process by combining semantic understanding with keyword matching, potentially providing more accurate and comprehensive results.
Streaming RAG
R2R also supports streaming RAG responses, which can be useful for real-time applications.
When using streaming RAG, you’ll receive different types of events:
- SearchResultsEvent: Contains the initial search results from your documents
- MessageEvent: Streams partial tokens of the response as they are generated
- CitationEvent: Indicates when a citation is added to the response, with relevant metadata including:
  - id: Unique identifier for the citation
  - object: Always “citation”
  - source_type: The type of source (chunk, graph, web, etc.)
  - source_title: Title of the source document when available
- FinalAnswerEvent: Contains the complete generated answer and structured citations
- ThinkingEvent: For reasoning agents, contains the model’s step-by-step reasoning process
The citations in the final response are structured objects that link specific passages in the response to their source documents, enabling proper attribution and verification.
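A consumer of such a stream typically dispatches on the event type. The sketch below defines local stand-in classes that reuse the event names listed above and feeds them through a small handler. These classes are not the SDK's own types, and every field except the citation fields id, object, source_type, and source_title is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResultsEvent:
    results: list


@dataclass
class MessageEvent:
    text: str  # a partial token or fragment of the answer


@dataclass
class CitationEvent:
    id: str
    source_type: str
    source_title: str = ""
    object: str = "citation"


@dataclass
class ThinkingEvent:
    text: str


@dataclass
class FinalAnswerEvent:
    answer: str
    citations: list = field(default_factory=list)


def consume(stream):
    """Accumulate streamed tokens and citations; prefer the final answer if sent."""
    tokens, citation_ids, search_results, final = [], [], [], None
    for event in stream:
        if isinstance(event, SearchResultsEvent):
            search_results = event.results
        elif isinstance(event, MessageEvent):
            tokens.append(event.text)
        elif isinstance(event, CitationEvent):
            citation_ids.append(event.id)
        elif isinstance(event, FinalAnswerEvent):
            final = event.answer
        # ThinkingEvent is ignored here; a UI might show it separately.
    streamed_text = "".join(tokens)
    return {
        "answer": final if final is not None else streamed_text,
        "streamed_text": streamed_text,
        "citation_ids": citation_ids,
        "search_results": search_results,
    }


def fake_stream():
    """Hypothetical event sequence standing in for a real streaming response."""
    yield SearchResultsEvent(results=["chunk-1"])
    yield MessageEvent("Plato ")
    yield MessageEvent("taught Aristotle.")
    yield CitationEvent(id="cit-1", source_type="chunk", source_title="Philosophy notes")
    yield FinalAnswerEvent(answer="Plato taught Aristotle.", citations=["cit-1"])


summary = consume(fake_stream())
print(summary["answer"], summary["citation_ids"])
```

Falling back to the concatenated tokens when no final event arrives lets the same handler cope with a stream that is cut off early.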
Customizing RAG
R2R offers extensive customization options for its Retrieval-Augmented Generation (RAG) functionality:
- Search Settings: Customize vector and knowledge graph search parameters using VectorSearchSettings and KGSearchSettings.
- Generation Config: Fine-tune the language model’s behavior with GenerationConfig, including:
  - Temperature, top_p, top_k for controlling randomness
  - Max tokens, model selection, and streaming options
  - Advanced settings like beam search and sampling strategies
- Web Search Integration: Enable web search to supplement your knowledge base with:
  - include_web_search: Boolean flag to include web search results
  - Automatic merging of web results with your document corpus
- Multiple LLM Support: Easily switch between different language models and providers:
  - OpenAI models (default)
  - Anthropic’s Claude models
  - Local models via Ollama
  - Any provider supported by LiteLLM
This flexibility allows you to optimize RAG performance for your specific use case and leverage the strengths of various LLM providers while incorporating the latest information from the web.
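As a rough illustration of bundling generation options with a RAG request, the sketch below validates a few settings and assembles a request body. `GenerationSettings` is a hypothetical stand-in and not R2R's GenerationConfig. The field names, the numeric bounds, the placeholder model string, and the request keys are all assumptions, so consult the API reference for the real schema.

```python
from dataclasses import asdict, dataclass


@dataclass
class GenerationSettings:
    """Hypothetical stand-in for a generation config; not an R2R class."""

    model: str = "example-provider/example-model"  # placeholder model id
    temperature: float = 0.1
    top_p: float = 1.0
    max_tokens: int = 1024
    stream: bool = False

    def __post_init__(self):
        # Illustrative bounds only; actual limits depend on the provider.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")
        if not 0.0 < self.top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1]")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


def build_rag_request(query, settings, include_web_search=False):
    """Assemble a request body; the key names are assumed for this sketch."""
    return {
        "query": query,
        "generation_config": asdict(settings),
        "include_web_search": include_web_search,
    }


request = build_rag_request(
    "What changed in the latest release?",
    GenerationSettings(temperature=0.3, max_tokens=512),
    include_web_search=True,
)
print(request)
```

Validating settings before sending the request surfaces mistakes such as a zero top_p locally instead of as a provider error.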
Streaming Agent (Deep Research Mode)
R2R offers an agentic retrieval mode that performs in-depth analysis of documents through iterative research and reasoning. This mode can produce Deep Research-like results by using a variety of tools to thoroughly investigate your data and the web.
Knowledge Graph Enhanced Search
R2R provides knowledge graph integration with its search capabilities, allowing for more contextually rich results by leveraging entity and relationship information.
Behind the scenes, R2R’s RetrievalService handles RAG requests, combining the power of vector search, optional knowledge graph integration, web search integration, and language model generation.
Conclusion
R2R’s search and RAG capabilities provide flexible tools for finding and contextualizing information. Whether you need simple semantic search or complex hybrid retrieval with custom RAG generation and web search integration, the system can be configured to meet your specific needs.
For more advanced use cases:
- Explore knowledge graphs to enhance your search with entity relationships
- Learn about hybrid search settings for fine-tuning your search parameters
- Check out more advanced RAG configurations to optimize responses