Advanced RAG
Learn how to build and use advanced RAG techniques with R2R
Advanced RAG Techniques
R2R supports advanced Retrieval-Augmented Generation (RAG) techniques that can be configured at runtime. This flexibility lets you experiment with different state-of-the-art strategies and optimize your RAG pipeline for specific use cases. This cookbook covers toggling between vanilla RAG, HyDE, and RAG-Fusion.
Advanced RAG techniques are still a beta feature in R2R. They are not currently supported in agentic workflows and there may be limitations in observability and analytics when implementing them.
Are we missing an important RAG technique? If so, then please let us know at [email protected].
Advanced RAG in R2R
R2R is designed from the ground up to make it easy to implement advanced RAG techniques. Its modular architecture, based on orchestrated pipes and pipelines, allows for easy customization and extension. A generic implementation diagram of the system is shown below:
Supported Advanced RAG Techniques
R2R currently supports two advanced RAG techniques:
- HyDE (Hypothetical Document Embeddings): Enhances retrieval by generating and embedding hypothetical documents based on the query.
- RAG-Fusion: Improves retrieval quality by combining results from multiple search iterations.
Using Advanced RAG Techniques
You can specify which advanced RAG technique to use by setting the search_strategy parameter in your vector search settings. Below is an overview of the techniques supported by R2R.
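As a minimal sketch of what selecting a strategy per request might look like: the helper below is hypothetical, and it assumes the R2R Python client exposes a rag method that accepts a vector_search_settings dictionary. The strategy names "vanilla", "hyde", and "rag_fusion" are also assumptions, since the text above does not list the exact values; check them against your R2R version.

```python
from typing import Any

# Assumed strategy identifiers; the exact accepted values are not given above.
ASSUMED_STRATEGIES = ("vanilla", "hyde", "rag_fusion")


def rag_with_strategy(client: Any, query: str, strategy: str = "vanilla") -> Any:
    """Hypothetical helper: run one RAG request with the chosen search strategy.

    `client` is any object with a `rag(query, vector_search_settings=...)`
    method, which is how this sketch assumes the R2R client is called.
    """
    if strategy not in ASSUMED_STRATEGIES:
        raise ValueError(f"unknown search strategy: {strategy!r}")
    # The strategy is passed per request, so it can change between calls.
    return client.rag(query, vector_search_settings={"search_strategy": strategy})
```

Because the strategy travels with each request, the same client can issue a vanilla query and a HyDE query back to back, which makes side-by-side comparison straightforward.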
HyDE
What is HyDE?
HyDE is an approach that improves dense retrieval, especially in zero-shot scenarios where no relevance-labeled training data is available. Here’s how it works:
- Query Expansion: HyDE uses a Language Model to generate hypothetical answers or documents based on the user’s query.
- Enhanced Embedding: These hypothetical documents are embedded, creating a richer semantic search space.
- Similarity Search: The embeddings are used to find the most relevant actual documents in your database.
- Informed Generation: The retrieved documents and original query are used to generate the final response.
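The first three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not R2R's implementation: the bag-of-words embed function stands in for a real embedding model, fake_llm stands in for the language model, and all function names are hypothetical. The final generation step is omitted.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # a Counter returns 0 for missing terms
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(
    query: str,
    corpus: Dict[str, str],
    generate: Callable[[str], str],
    k: int = 2,
) -> List[str]:
    # 1. Query expansion: ask the model for a hypothetical answer document.
    hypothetical = generate(query)
    # 2. Enhanced embedding: embed the hypothetical document, not the raw query.
    vector = embed(hypothetical)
    # 3. Similarity search: rank real documents against that embedding.
    ranked = sorted(
        corpus,
        key=lambda doc_id: cosine(vector, embed(corpus[doc_id])),
        reverse=True,
    )
    return ranked[:k]


corpus = {
    "doc1": "Photosynthesis converts sunlight, water and carbon dioxide into glucose.",
    "doc2": "The stock market closed higher on Friday.",
}


def fake_llm(query: str) -> str:
    # Stand-in for a real language model call.
    return "Plants use sunlight and carbon dioxide to make glucose through photosynthesis."


print(hyde_search("How do plants make food?", corpus, fake_llm, k=1))  # ['doc1']
```

Note that the raw query shares no words with either document, so a direct bag-of-words match would score both at zero. The hypothetical answer supplies the vocabulary that lets the search find doc1, which is the intuition behind HyDE.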
Implementation Diagram
The diagram below illustrates the HyDE flow, which fits into the general system diagram above (the GraphRAG workflow is omitted for brevity):
Using HyDE in R2R
RAG-Fusion
What is RAG-Fusion?
RAG-Fusion is an advanced technique that combines Retrieval-Augmented Generation (RAG) with Reciprocal Rank Fusion (RRF) to improve the quality and relevance of retrieved information. Here’s how it works:
- Query Expansion: The original query is used to generate multiple related queries, providing different perspectives on the user’s question.
- Multiple Retrievals: Each generated query is used to retrieve relevant documents from the database.
- Reciprocal Rank Fusion: The retrieved documents are re-ranked using the RRF algorithm, which combines the rankings from multiple retrieval attempts.
- Enhanced RAG: The re-ranked documents, along with the original and generated queries, are used to generate the final response.
This approach helps to capture a broader context and potentially more relevant information compared to traditional RAG.
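The fusion step can be sketched as follows. This is a generic Reciprocal Rank Fusion implementation, not R2R's own code: each document scores the sum of 1 / (k + rank) over every ranking it appears in. The constant k = 60 is a commonly used default and an assumption here, as is the alphabetical tie-break.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# One ranked result list per generated query.
results_per_query = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
print(reciprocal_rank_fusion(results_per_query))  # ['b', 'a', 'c', 'd']
```

Document b wins because it ranks first in two of the three lists, while d, which appears only once at rank three, lands last. RRF uses only rank positions, so it does not require the similarity scores from different queries to be comparable.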
Implementation Diagram
Here’s a diagram illustrating the RAG-Fusion workflow (again, we omit the GraphRAG pipeline for brevity):
Using RAG-Fusion in R2R
Combining with Other Settings
You can readily combine these advanced techniques with other search and RAG settings:
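A combined request might be assembled along these lines. The helper is hypothetical, and the key names search_limit, use_hybrid_search, and rag_generation_config, as well as the strategy value "hyde", are assumptions about the R2R request shape rather than confirmed API details.

```python
from typing import Any, Dict, Optional


def build_rag_request(
    query: str,
    strategy: str,
    search_limit: int = 10,
    use_hybrid_search: bool = False,
    generation_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: bundle a search strategy with other settings."""
    request: Dict[str, Any] = {
        "query": query,
        # Assumed key names for the vector search settings.
        "vector_search_settings": {
            "search_strategy": strategy,
            "search_limit": search_limit,
            "use_hybrid_search": use_hybrid_search,
        },
    }
    if generation_config:
        # Copy so later edits to the caller's dict do not leak into the request.
        request["rag_generation_config"] = dict(generation_config)
    return request


request = build_rag_request(
    "Describe the impact of climate change on biodiversity",
    strategy="hyde",
    search_limit=15,
    use_hybrid_search=True,
    generation_config={"temperature": 0.7},
)
```

If the client's rag method accepts these keyword arguments, the dictionary could be passed as client.rag(**request); verify the parameter names against your installed R2R version first.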
Customization and Server-Side Defaults
R2R allows runtime configuration of these advanced techniques, but you can also modify the server-side defaults for consistent behavior across your application. This includes updating and customizing the prompts used for techniques like HyDE and RAG-Fusion.
- For general configuration options, refer to the R2R configuration documentation.
- To learn about customizing prompts, including those used for HyDE and RAG-Fusion, see the prompt configuration documentation.
Prompts play a crucial role in shaping the behavior of these advanced RAG techniques. By modifying the HyDE and RAG-Fusion prompts, you can fine-tune the query expansion and retrieval processes to better suit your specific use case or domain.
Currently, these advanced techniques use a hard-coded multi-search configuration in the MultiSearchPipe.
This configuration will be made user-configurable in the near future, allowing for even greater flexibility in customizing these advanced RAG techniques.
Conclusion
By leveraging these advanced RAG techniques and customizing their underlying prompts, you can significantly enhance the quality and relevance of your retrieval and generation processes. Experiment with different strategies, settings, and prompt variations to find the optimal configuration for your specific use case. The flexibility of R2R allows you to iteratively improve your system’s performance and adapt to changing requirements.