Advanced RAG
Learn how to build and use advanced RAG techniques with R2R
R2R supports advanced Retrieval-Augmented Generation (RAG) techniques that can be configured at runtime. This flexibility lets you experiment with different state-of-the-art strategies and optimize your RAG pipeline for specific use cases. This cookbook covers toggling between vanilla RAG, HyDE, and RAG-Fusion.
Advanced RAG techniques are still a beta feature in R2R. They are not currently supported in agentic workflows and there may be limitations in observability and analytics when implementing them.
Are we missing an important RAG technique? If so, then please let us know at [email protected].
Supported Advanced RAG Techniques
R2R currently supports two advanced RAG techniques:
- HyDE (Hypothetical Document Embeddings): Enhances retrieval by generating and embedding hypothetical documents based on the query.
- RAG-Fusion: Improves retrieval quality by combining results from multiple search iterations.
Using Advanced RAG Techniques
You can specify which advanced RAG technique to use by setting the search_strategy parameter in your vector search settings. Below is an overview of the techniques supported by R2R.
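As a minimal sketch of selecting a strategy, the snippet below builds a plain settings dictionary keyed on search_strategy. The identifier strings ("vanilla", "hyde", "rag_fusion") and the helper name build_vector_search_settings are assumptions for illustration; this page names the techniques but not the exact values, so verify them against your R2R version.

```python
# Assumed strategy identifiers; the surrounding text names the techniques but
# not the exact strings, so check these against your R2R version.
SUPPORTED_STRATEGIES = ("vanilla", "hyde", "rag_fusion")


def build_vector_search_settings(search_strategy: str = "vanilla", **extra) -> dict:
    """Return a settings dict with a validated `search_strategy` entry.

    Hypothetical helper, not part of R2R.
    """
    if search_strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(
            f"unknown search_strategy {search_strategy!r}; "
            f"expected one of {SUPPORTED_STRATEGIES}"
        )
    return {"search_strategy": search_strategy, **extra}


hyde_settings = build_vector_search_settings("hyde")
fusion_settings = build_vector_search_settings("rag_fusion")
```

Validating the value up front keeps a typo in the strategy name from silently falling back to a default retrieval path.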
HyDE
What is HyDE?
HyDE is an approach that improves dense retrieval, especially in zero-shot scenarios. Here’s how it works:
- Query Expansion: HyDE uses a Language Model to generate hypothetical answers or documents based on the user’s query.
- Enhanced Embedding: These hypothetical documents are embedded, creating a richer semantic search space.
- Similarity Search: The embeddings are used to find the most relevant actual documents in your database.
- Informed Generation: The retrieved documents and original query are used to generate the final response.
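The four steps above can be sketched in a few lines of self-contained Python. This is a toy illustration of the idea, not R2R's implementation: the language model call is replaced by a canned stand-in (generate_hypothetical_docs), and the embedding is a simple bag-of-words vector rather than a dense encoder. All function names here are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real system would use a dense encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate_hypothetical_docs(query: str) -> list[str]:
    """Step 1 stand-in: a real pipeline would ask an LLM to answer `query`."""
    return [
        "Photosynthesis converts sunlight water and carbon dioxide "
        "into glucose and oxygen",
    ]


def hyde_search(query: str, corpus: list[str], limit: int = 2) -> list[str]:
    # Step 2: embed the hypothetical documents rather than the raw query.
    hypo_vectors = [embed(doc) for doc in generate_hypothetical_docs(query)]
    # Step 3: score each real document by its best match to any hypothetical one.
    scored = [
        (max(cosine(h, embed(doc)) for h in hypo_vectors), doc) for doc in corpus
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


corpus = [
    "Plants use sunlight to turn carbon dioxide and water into glucose",
    "The stock market closed higher on Friday",
    "Oxygen is released by plants as a byproduct",
]
top_docs = hyde_search("How do plants make food?", corpus)
# Step 4 would pass `top_docs` plus the original query to the generator.
```

Note that the query itself shares almost no vocabulary with the best document; the hypothetical answer is what bridges the gap, which is the point of the technique.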
Implementation Diagram
The following diagram illustrates the HyDE flow, which fits into the schema of the diagram above (the GraphRAG workflow is omitted for brevity):
Using HyDE in R2R
RAG-Fusion
What is RAG-Fusion?
RAG-Fusion is an advanced technique that combines Retrieval-Augmented Generation (RAG) with Reciprocal Rank Fusion (RRF) to improve the quality and relevance of retrieved information. Here’s how it works:
- Query Expansion: The original query is used to generate multiple related queries, providing different perspectives on the user’s question.
- Multiple Retrievals: Each generated query is used to retrieve relevant documents from the database.
- Reciprocal Rank Fusion: The retrieved documents are re-ranked using the RRF algorithm, which combines the rankings from multiple retrieval attempts.
- Enhanced RAG: The re-ranked documents, along with the original and generated queries, are used to generate the final response.
This approach helps capture broader context and potentially more relevant information than traditional RAG.
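To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion: each document scores the sum of 1 / (k + rank) over every ranked list it appears in. The constant k = 60 is a commonly used default and is an assumption here, as are the function name and document IDs; the text above does not say which value R2R uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One ranked list per generated query (illustrative document IDs).
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
```

Because only ranks are used, the lists can come from retrievers whose raw similarity scores are not comparable with one another.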
Implementation Diagram
Here’s a diagram illustrating the RAG-Fusion workflow (again, we omit the GraphRAG pipeline for brevity):
Using RAG-Fusion in R2R
Combining with Other Settings
You can readily combine these advanced techniques with other search and RAG settings:
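A hedged sketch of what such a combination might look like, expressed as plain dictionaries so it runs without a server. Apart from search_strategy, every key, value, and the overall payload shape (search_limit, use_hybrid_search, rag_generation_config, temperature) is an assumption for illustration; consult the R2R reference for the names your version accepts.

```python
import json

# All keys and values below are assumptions for illustration; this page only
# confirms that `search_strategy` lives in the vector search settings.
vector_search_settings = {
    "search_strategy": "rag_fusion",  # assumed identifier for RAG-Fusion
    "search_limit": 15,               # hypothetical: number of chunks to retrieve
    "use_hybrid_search": True,        # hypothetical: mix keyword and vector search
}
rag_generation_config = {
    "temperature": 0.7,               # hypothetical generation setting
}

# Hypothetical request shape combining retrieval and generation settings.
request_payload = {
    "query": "Describe the architecture of modern computers",
    "vector_search_settings": vector_search_settings,
    "rag_generation_config": rag_generation_config,
}
print(json.dumps(request_payload, indent=2))
```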
Conclusion
By leveraging these advanced RAG techniques and customizing their underlying prompts, you can significantly enhance the quality and relevance of your retrieval and generation processes. Experiment with different strategies, settings, and prompt variations to find the optimal configuration for your specific use case. The flexibility of R2R allows you to iteratively improve your system’s performance and adapt to changing requirements.