LLMs

Configure your LLM provider

Language Model System

R2R uses Large Language Models (LLMs) as the core reasoning engine for RAG operations, providing sophisticated text generation and analysis capabilities.

R2R uses LiteLLM to route LLM requests because of its provider flexibility. Read more about LiteLLM here.

LLM Configuration

The LLM system can be customized through the completion section in your r2r.toml file:

r2r.toml
[app]
# LLM used for internal operations, like deriving conversation names
fast_llm = "openai/gpt-4o-mini"

# LLM used for user-facing output, like RAG replies
quality_llm = "openai/gpt-4o"

# LLM used for ingesting visual inputs
vlm = "openai/gpt-4o"

# LLM used for transcription
audio_lm = "openai/whisper-1"

...

[completion]
provider = "r2r" # defaults to "r2r" with "litellm" fallback
concurrent_request_limit = 16 # defaults to 256

  [completion.generation_config]
  temperature = 0.1 # defaults to 0.1
  top_p = 1 # defaults to 1
  max_tokens_to_sample = 1_024 # defaults to 1_024
  stream = false # defaults to false
  add_generation_kwargs = {} # defaults to {}

Depending on your chosen provider, the relevant environment variables for the configuration above are OPENAI_API_KEY, ANTHROPIC_API_KEY, AZURE_API_KEY, and so on.

Advanced LLM Features in R2R

R2R leverages several advanced LLM features to provide robust text generation:

Concurrent Request Management

The system implements sophisticated request handling with rate limiting and concurrency control; a minimal sketch of this pattern follows the list below:

  1. Rate Limiting: Prevents API throttling through intelligent request scheduling
  2. Concurrent Processing: Manages multiple LLM requests efficiently
  3. Error Handling: Implements retry logic with exponential backoff
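
The following is a minimal, illustrative sketch of this pattern in Python, not R2R's internal implementation: an asyncio semaphore caps the number of in-flight requests (mirroring the concurrent_request_limit setting), and failed calls are retried with exponential backoff plus jitter. The fake_llm_call helper is a hypothetical stand-in for a real provider call.

import asyncio
import random

CONCURRENT_REQUEST_LIMIT = 16  # mirrors the concurrent_request_limit example above

async def fake_llm_call(prompt: str) -> str:
    # Hypothetical stand-in for a provider call; fails ~20% of the time to exercise retries.
    if random.random() < 0.2:
        raise RuntimeError("simulated provider error")
    await asyncio.sleep(0.1)
    return f"response to: {prompt}"

async def call_with_retry(semaphore: asyncio.Semaphore, prompt: str, max_retries: int = 3) -> str:
    async with semaphore:  # cap the number of concurrent in-flight requests
        for attempt in range(max_retries):
            try:
                return await fake_llm_call(prompt)
            except RuntimeError:
                if attempt == max_retries - 1:
                    raise
                # exponential backoff with jitter: ~1s, ~2s, ~4s, ...
                await asyncio.sleep(2 ** attempt + random.random())

async def main() -> None:
    semaphore = asyncio.Semaphore(CONCURRENT_REQUEST_LIMIT)
    answers = await asyncio.gather(*(call_with_retry(semaphore, f"query {i}") for i in range(32)))
    print(f"completed {len(answers)} requests")

asyncio.run(main())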

Performance Considerations

When configuring LLMs in R2R, consider these optimization strategies:

  1. Concurrency Management:

    • Adjust concurrent_request_limit based on provider limits
    • Monitor API usage and adjust accordingly
    • Consider implementing request caching for repeated queries
  2. Model Selection:

    • Balance model capabilities with latency requirements
    • Consider cost per token for different providers
    • Evaluate context window requirements
  3. Resource Management:

    • Monitor token usage with large responses
    • Implement appropriate error handling and retry strategies
    • Consider implementing fallback models for critical systems (see the sketch after this list)
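
As a sketch of the fallback strategy mentioned above, a request can walk an ordered preference list and move to the next model only when the current one raises. The model names and the generate callable here are assumptions for illustration, not part of the R2R API.

from typing import Callable, Optional

# Ordered preference list; the model names here are examples only.
FALLBACK_MODELS = ["openai/gpt-4o", "openai/gpt-4o-mini", "anthropic/claude-3-haiku-20240307"]

def generate_with_fallback(generate: Callable[[str, str], str], prompt: str) -> str:
    """Try each model in order and return the first successful completion.

    `generate` is any callable taking (model, prompt); in practice it could wrap an
    R2R rag call that passes the model through rag_generation_config.
    """
    last_error: Optional[Exception] = None
    for model in FALLBACK_MODELS:
        try:
            return generate(model, prompt)
        except Exception as exc:  # a real implementation would catch provider-specific errors
            last_error = exc
    raise RuntimeError("all fallback models failed") from last_error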

Serving select LLM providers

The example below shows the OpenAI provider; other providers follow the same pattern of exporting the provider's API key and referencing its models in your config.
export OPENAI_API_KEY=your_openai_key
# .. set other environment variables

# Set your `my_r2r.toml` similar to shown:
# [app]
# quality_llm = "openai/gpt-4o-mini"

Supported models include:

  • openai/gpt-4o
  • openai/gpt-4-turbo
  • openai/gpt-4
  • openai/gpt-4o-mini

For a complete list of supported OpenAI models and detailed usage instructions, please refer to the LiteLLM OpenAI documentation.

Runtime Configuration of LLM Provider

R2R supports runtime configuration of the LLM provider, allowing you to dynamically change the model or provider for each request. This flexibility enables you to use different models or providers based on specific requirements or use cases.

Combining Search and Generation

When performing a RAG query, you can dynamically set the LLM generation settings:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")

response = client.rag(
    "What are the latest advancements in quantum computing?",
    rag_generation_config={
        "stream": False,
        "model": "openai/gpt-4o-mini",
        "temperature": 0.7,
        "max_tokens": 150
    }
)
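
Because model strings are resolved through LiteLLM, the same call can target a different provider per request. For example, assuming the corresponding ANTHROPIC_API_KEY is set (the model name below is illustrative):

response = client.rag(
    "What are the latest advancements in quantum computing?",
    rag_generation_config={
        "model": "anthropic/claude-3-5-sonnet-20241022",
        "temperature": 0.3,
        "max_tokens": 150
    }
)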

For more detailed information on configuring other search and RAG settings, please refer to the RAG Configuration documentation.
