This installation guide is for Full R2R. For solo developers or teams prototyping, we *highly recommend starting with R2R Light or the containerized full installation.

R2R Local System Installation

This guide will walk you through installing and running R2R on your local system without using Docker. This method allows for more customization and control over the R2R source code.

Local installation of R2R Core is challenging due to the numerous services it integrates. We strongly recommend using Docker to get started quickly.

If you choose to proceed with a local installation, be prepared to set up and configure the following services:

  1. Postgres with pgvector: A relational database with vector storage capabilities.
  2. Unstructured.io: A complex system for file ingestion.
  3. Hatchet: A RabbitMQ-based orchestration system.

Alternatively, you can use cloud versions of these services, but you’ll be responsible for enrolling in them and providing the necessary environment variables.

Each of these components has its own requirements, potential compatibility issues, and configuration complexities. Debugging issues in a local setup can be significantly more challenging than using a pre-configured Docker environment.

Unless you have a specific need for a local installation and are comfortable with advanced system configuration, we highly recommend using the Docker setup method for a smoother experience.

Prerequisites

Before starting, ensure you have the following installed and/or available in the cloud:

  • Python 3.12 or higher
  • pip (Python package manager)
  • Git
  • Postgres + pgvector
  • Unstructured file ingestion
  • Hatchet workflow orchestration

Install the R2R CLI & Python SDK

First, install the R2R CLI and Python SDK:

$pip install 'r2r[core ingestion-bundle hatchet]'

Environment Setup

R2R requires connections to various services. Set up the following environment variables based on your needs:

Refer to the documentation here for detailed information on LLM configuration inside R2R.

$ # Set cloud LLM settings
> export OPENAI_API_KEY=sk-...
> # export ANTHROPIC_API_KEY=...
> # ...

Note, cloud providers are optional as R2R can be run entirely locally. For more information on local installation, refer here.

R2R uses Hatchet for orchestration. When building R2R locally, you will need to either (a) register for Hatchet’s cloud service or (b) install it locally following their instructions, or (c) deploy the R2R docker and connect with the internally provisioned Hatchet service.

$ # Set cloud LLM settings
> export HATCHET_CLIENT_TOKEN=...

With R2R you can connect to your own instance of Postgres+pgvector or a remote cloud instance. Refer here for detailed documentation on configuring Postgres inside R2R.

$ # Set Postgres+pgvector settings
> export R2R_POSTGRES_USER=$YOUR_POSTGRES_USER
> export R2R_POSTGRES_PASSWORD=$YOUR_POSTGRES_PASSWORD
> export R2R_POSTGRES_HOST=$YOUR_POSTGRES_HOST
> export R2R_POSTGRES_PORT=$YOUR_POSTGRES_PORT
> export R2R_POSTGRES_DBNAME=$YOUR_POSTGRES_DBNAME
> export R2R_PROJECT_NAME=$YOUR_PROJECT_NAME # see note below

The R2R_PROJECT_NAME environment variable defines the tables within your Postgres database where the selected R2R project resides. If the specified tables do not exist then they will be created by R2R during initialization.

By default, R2R uses unstructured.io to handle file ingestion. Unstructured can be:

  1. Installed locally by following their documented instructions
  2. Connected to via the cloud

For cloud connections, set the following environment variable:

$# Set Unstructured API key for cloud usage
>export UNSTRUCTURED_API_KEY=your_api_key_here

Alternatively, R2R has its own lightweight end-to-end ingestion, which may be more appropriate in some situations. This can be specified in your server configuration by choosing r2r as your ingestion provider.

Running R2R

The full R2R installation does not use the default r2r.toml, instead it provides overrides through a pre-built custom configuration, full.toml.

After installing the CLI and setting up your environment, you can start R2R using the following command:

$# requires services for unstructured, hatchet, postgres
>r2r serve --config-name=full

For local LLM usage:

$r2r serve --config-name=full_local_llm

Python Development Mode

For those looking to develop R2R locally:

  1. Install Poetry: Follow instructions on the official Poetry website.

  2. Clone and install dependencies:

    $git clone [email protected]:SciPhi-AI/R2R.git
    >cd R2R/py
    >poetry install -E "core ingestion-bundle hatchet"
  3. Setup environment: Follow the steps listed in the Environment Setup section above. Additionally, you may introduce a local .env file to make development easier, and you can customize your local r2r.toml to suit your specific needs.

  4. Start your server:

$poetry run r2r serve --config-name=core

Next Steps

After successfully installing R2R:

  1. Verify Installation: Ensure all components are running correctly by accessing the R2R API at http://localhost:7272/v3/health.

  2. Quick Start: Follow our R2R Quickstart Guide to set up your first RAG application.

  3. In-Depth Tutorial: For a more comprehensive understanding, work through our R2R Walkthrough.

  4. Customize Your Setup: Configure R2R components with the Configuration Guide.

If you encounter any issues during installation or setup, please use our Discord community or GitHub repository to seek assistance.

Built with