Multiple LLMs
Learn how to use different language models with R2R
Introduction
This guide extends the R2R Quickstart by demonstrating how R2R supports multiple large language models (LLMs). Multi-LLM support lets you match the model to the task, balancing response quality, latency, and cost across your search and retrieval workloads.
LLMs are selected at runtime for maximum flexibility and ease of use.
Setup
This guide assumes R2R is already installed and the basic quickstart has been completed.
Using Different LLM Providers
Sample Commands
If you haven’t completed the quickstart or if your target database is empty, start by ingesting sample files:
# export OPENAI_API_KEY=...
python -m r2r.examples.quickstart ingest_files --no-media=true
Now we are ready to test RAG with different LLM providers and/or models.
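Because the model is chosen at request time rather than baked in at deployment, switching providers amounts to changing one string. The sketch below illustrates that pattern in plain Python; the MODEL_PRESETS table and pick_model helper are illustrative only and not part of the R2R API, and the model names are examples that may be outdated for your provider.

```python
# Illustrative runtime model selection: map short preset names to full
# model identifiers, then resolve whichever preset the caller requests.
# These model names are examples only; check your provider's current list.
MODEL_PRESETS = {
    "fast": "claude-3-haiku-20240307",   # Anthropic, low latency
    "quality": "gpt-4-turbo",            # OpenAI, higher quality
    "local": "ollama/llama2",            # local model served via Ollama
}

def pick_model(preset: str, default: str = "fast") -> str:
    """Resolve a preset name to a model identifier at runtime."""
    return MODEL_PRESETS.get(preset, MODEL_PRESETS[default])

print(pick_model("quality"))  # gpt-4-turbo
print(pick_model("unknown"))  # falls back to the "fast" preset
```

The resolved string would then be passed as the model field of a GenerationConfig, as shown in the sample code below.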
Sample Code
The LLM you select at runtime is propagated to the R2R rag method as part of the GenerationConfig supplied via the rag_generation_config argument. A simplified example of this logic:
from r2r import VectorSearchSettings, GenerationConfig

# Restrict vector search to a single user's documents
vector_search_settings = VectorSearchSettings(
    search_filters={"user_id": user1_id},
    ...
)

# Choose the LLM and sampling parameters at request time
rag_generation_config = GenerationConfig(
    model="claude-3-haiku-20240307",
    temperature=0.2,
    ...
)

rag_results = app.rag(
    query="Explain AI briefly",
    vector_search_settings=vector_search_settings,
    rag_generation_config=rag_generation_config
)
Refer to the LLM Deep Dive for more information on how R2R supports different LLM providers.
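Providers are typically inferred from the model identifier itself, using either an explicit "provider/model" prefix or a recognizable model-name pattern. The following is a rough, hypothetical sketch of that kind of routing, not R2R's actual implementation; the helper name infer_provider and the specific prefixes are assumptions for illustration.

```python
# Hypothetical prefix-based provider routing. Real routing logic in R2R's
# LLM layer is more complete; this only illustrates the general idea.
def infer_provider(model: str) -> str:
    if "/" in model:                 # explicit "provider/model" form
        return model.split("/", 1)[0]
    if model.startswith("claude"):   # Anthropic model naming pattern
        return "anthropic"
    if model.startswith("gpt-"):     # OpenAI model naming pattern
        return "openai"
    return "openai"                  # assume a default provider otherwise

print(infer_provider("claude-3-haiku-20240307"))  # anthropic
print(infer_provider("ollama/llama2"))            # ollama
```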
Summary
This guide demonstrates R2R’s flexibility in using multiple LLMs. By leveraging different models from providers like OpenAI, Anthropic, and local options like Ollama, you have full control over how to serve user responses. This allows you to optimize for performance, cost, or specific use case requirements in your RAG applications.
For detailed setup and basic functionality, refer back to the R2R Quickstart. For more advanced usage and customization options, join the R2R Discord community.