Local LLMs

R2R is a containerized Retrieval-Augmented Generation (RAG) system with a RESTful API.

Introduction

To run R2R with default local LLM settings, execute r2r serve --docker --config-name=local_llm.

R2R supports RAG with local LLMs through Ollama. Follow the instructions on the official Ollama website to install it outside the R2R Docker container.

Preparing Local LLMs

Next, make sure that all the necessary models (one LLM and one embedding model) are installed:

$ # in a separate terminal
> ollama pull llama3.1
> ollama pull mxbai-embed-large
> ollama serve

If you deploy R2R with a customized configuration, replace the model names in these commands with the models that your configuration specifies.
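Before starting R2R, it can help to confirm that both models are actually available to Ollama. The following is a minimal sketch, not part of R2R: it assumes Ollama is listening on its default address (http://localhost:11434) and uses Ollama's model listing endpoint, /api/tags. The helper names (normalize, missing_models, fetch_tags) are hypothetical and chosen for illustration.

```python
import json
from urllib.request import urlopen

# Models pulled in the commands above; adjust to match your configuration.
REQUIRED = ["llama3.1", "mxbai-embed-large"]


def normalize(name: str) -> str:
    """Append Ollama's implicit ':latest' tag when the name has no tag."""
    return name if ":" in name else f"{name}:latest"


def missing_models(required, tags_response):
    """Return the required models that are absent from an /api/tags response."""
    installed = {normalize(m["name"]) for m in tags_response.get("models", [])}
    return [name for name in required if normalize(name) not in installed]


def fetch_tags(base_url: str = "http://localhost:11434"):
    """Fetch the list of locally installed models from a running Ollama server.

    The default base_url is an assumption (Ollama's usual local address).
    """
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the Ollama server running, missing_models(REQUIRED, fetch_tags()) should return an empty list once both pulls have completed; any names it returns still need an ollama pull.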

Configuration

R2R uses a TOML configuration file for managing settings, which you can read about here. For local setup, we’ll use the default local_llm configuration. This can be customized to your needs by setting up a standalone project.

Local Configuration Details

For more information on how to configure R2R, visit here.

Summary

The steps above are all you need to get RAG up and running with local LLMs in R2R. For detailed setup and basic functionality, refer to the [R2R Quickstart](/documentation/quickstart/introduction). For more advanced usage and customization options, refer to the basic configuration or join the R2R Discord community.
