R2R provides a flexible, provider-agnostic approach to integrating with vector databases for storing and retrieving vector embeddings.

Supported Providers

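R2R currently supports the following vector database providers:

  • PGVector (default), which stores embeddings in a PostgreSQL database
  • Local (SQLite), intended for testing purposes only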

Configuring Vector Database Providers

To switch between vector database providers, update the vector_database section in your config.json file:

"vector_database": {
    "provider": "pgvector"
}

Change the provider value to "pgvector" or "local" to switch providers.
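
As a rough illustration of what this setting controls, the sketch below parses a config.json document and checks that the provider is one of the two values named above. The function name read_vector_provider and the validation step are hypothetical and are not part of R2R's API; the sketch also assumes the vector_database section sits at the top level of the file.

```python
import json

# The two provider values named in this guide.
SUPPORTED_PROVIDERS = {"pgvector", "local"}


def read_vector_provider(config_text: str) -> str:
    """Return the vector_database provider named in a config.json document.

    Hypothetical helper for illustration; R2R's own loader may differ.
    """
    config = json.loads(config_text)
    provider = config.get("vector_database", {}).get("provider")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported vector database provider: {provider!r}")
    return provider


# Assumed layout: the section is a top-level key of config.json.
example = '{"vector_database": {"provider": "pgvector"}}'
print(read_vector_provider(example))
```

Failing early on an unknown provider value makes a typo in config.json visible before any database connection is attempted.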

Provider Details

PGVector Implementation (Default)

The PGVectorDB class integrates with the PGVector library for storing and retrieving vector embeddings in a PostgreSQL database.

Key features:

  • Connects to a PostgreSQL database using provided connection details
  • Initializes a collection with the specified name and dimension
  • Supports upserting vector entries with associated metadata
  • Performs similarity search using PGVector’s query functionality
  • Allows filtered deletion of entries based on metadata
  • Retrieves unique values for specific metadata fields

To configure PGVector, set the following environment variables:

# note: the `demo_vecs` collection name below can be overridden freely

export POSTGRES_USER=$YOUR_POSTGRES_USER
export POSTGRES_PASSWORD=$YOUR_POSTGRES_PASSWORD
export POSTGRES_HOST=$YOUR_POSTGRES_HOST
export POSTGRES_PORT=$YOUR_POSTGRES_PORT
export POSTGRES_DBNAME=$YOUR_POSTGRES_DBNAME
export POSTGRES_VECS_COLLECTION=demo_vecs
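
To show how these variables fit together, here is a minimal sketch that assembles a standard postgresql:// connection URL from them. The helper postgres_url_from_env is hypothetical; this guide does not show how PGVectorDB builds its connection internally, so treat the URL format as an assumption about what a client would need.

```python
import os
from typing import Mapping
from urllib.parse import quote


def postgres_url_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Assemble a postgresql:// URL from the POSTGRES_* variables above.

    Hypothetical helper; a missing variable raises KeyError.
    """
    # Percent-encode values that may contain reserved characters such as '@'.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env["POSTGRES_PORT"]
    dbname = quote(env["POSTGRES_DBNAME"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"


# Illustrative values only; replace with your own settings.
sample_env = {
    "POSTGRES_USER": "alice",
    "POSTGRES_PASSWORD": "p@ss",
    "POSTGRES_HOST": "localhost",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DBNAME": "r2r",
}
url = postgres_url_from_env(sample_env)
```

POSTGRES_VECS_COLLECTION is not part of the URL; it names the collection used inside the database.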

Local Implementation (SQLite)

The local SQLite implementation is intended for testing purposes only. It is not suitable for production environments or large-scale applications.

The LocalVectorDB class uses SQLite as the underlying storage for vector entries and their metadata.

Key features:

  • Initializes a SQLite database with a table for vector entries
  • Supports upserting vector entries with associated metadata
  • Performs similarity search using cosine similarity calculation
  • Allows filtered deletion of entries based on metadata
  • Retrieves unique values for specific metadata fields

To configure the local SQLite database, set the following environment variable:

export LOCAL_DB_PATH=/path/to/your/sqlite/database.db
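
The features listed above can be illustrated with a brute-force cosine similarity search over vectors stored in SQLite. This is a sketch of the general technique, not the LocalVectorDB source: the table name entries, its columns, and the JSON encoding of vectors are all assumptions made for the example.

```python
import json
import math
import sqlite3


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search(conn, query, limit=2):
    """Score every stored vector against the query and return the best matches."""
    rows = conn.execute("SELECT id, vector FROM entries").fetchall()
    scored = [(entry_id, cosine_similarity(query, json.loads(vec))) for entry_id, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# In-memory database for the example; a real setup would open LOCAL_DB_PATH.
conn = sqlite3.connect(":memory:")
# Hypothetical schema: vectors and metadata stored as JSON text.
conn.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, vector TEXT, metadata TEXT)")
entries = [
    ("a", [1.0, 0.0], {"source": "doc1"}),
    ("b", [0.0, 1.0], {"source": "doc2"}),
    ("c", [1.0, 1.0], {"source": "doc1"}),
]
for entry_id, vector, metadata in entries:
    # INSERT OR REPLACE gives upsert behaviour keyed on the primary key.
    conn.execute(
        "INSERT OR REPLACE INTO entries VALUES (?, ?, ?)",
        (entry_id, json.dumps(vector), json.dumps(metadata)),
    )

results = search(conn, [1.0, 0.0])
```

Because every stored vector is scored on each query, the cost grows linearly with the number of entries, which is consistent with the guidance that this provider is for testing rather than production.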

Switching Providers

To switch between vector database providers:

  1. Update the vector_database section in your config.json file.
  2. Set the appropriate environment variables for the chosen provider.

Example configuration for PGVector (default):

"vector_database": {
    "provider": "pgvector",
    "collection_name": "my_vectors"
}

Make sure to include any additional provider-specific settings in your configuration file.
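
When switching providers, it can help to confirm that the environment matches the chosen provider before starting R2R. The sketch below is a hypothetical pre-flight check, not something R2R ships: it uses the variable names from this guide and assumes that each one must be set, which R2R may not strictly require.

```python
import os
from typing import List, Mapping

# Variable names taken from this guide; treating all as required is an assumption.
REQUIRED_ENV = {
    "pgvector": [
        "POSTGRES_USER",
        "POSTGRES_PASSWORD",
        "POSTGRES_HOST",
        "POSTGRES_PORT",
        "POSTGRES_DBNAME",
        "POSTGRES_VECS_COLLECTION",
    ],
    "local": ["LOCAL_DB_PATH"],
}


def missing_env_vars(provider: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """List the variables for `provider` that are unset or empty in `env`."""
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


# Example: report what is still missing for the configured provider.
missing = missing_env_vars("pgvector")
if missing:
    print("Missing environment variables:", ", ".join(missing))
```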

Conclusion

By following this guide, you can configure and switch between different vector database providers in R2R. The PGVector implementation is recommended for production use, offering better performance and scalability. The local SQLite option is available for testing and development purposes but should not be used in production environments.

For more information on customizing R2R, refer to the Customizing R2R documentation.