Learn how to build and customize R2R

Introduction

R2R uses two key components for assembling and customizing applications: the R2RBuilder and a set of factory classes, R2RFactory*. These components work together to provide a flexible and intuitive way to construct R2R instances with custom configurations, providers, pipes, and pipelines.

R2RBuilder

The R2RBuilder is the central component for assembling R2R applications. It employs the Builder pattern for simple application customization.

Key Features

  1. Flexible Configuration: Supports loading configurations from TOML files or using predefined configurations.
  2. Component Overrides: Allows overriding default providers, pipes, and pipelines with custom implementations.
  3. Factory Customization: Supports custom factory implementations for providers, pipes, and pipelines.
  4. Fluent Interface: Provides a chainable API for easy and readable setup.

Basic Usage

Here’s a simple example of how to use the R2RBuilder:

1from r2r import R2RBuilder, R2RConfig
2
3# Create an R2R instance with default configuration
4r2r = R2RBuilder().build()
5
6# Create an R2R instance with a custom configuration file
7r2r = R2RBuilder(config=R2RConfig.from_toml("path/to/config.toml")).build()
8
9# Create an R2R instance with a predefined configuration
10r2r = R2RBuilder(config_name="full").build()

Factories

R2R uses a set of factory classes to create various components of the system. These factories allow for easy customization and extension of R2R’s functionality.

Main Factory Classes

  1. R2RProviderFactory: Creates provider instances (e.g., DatabaseProvider, EmbeddingProvider).
  2. R2RPipeFactory: Creates individual pipes used in pipelines.
  3. R2RPipelineFactory: Creates complete pipelines by assembling pipes.

Factory Methods

Each factory class contains methods for creating specific components. For example, the R2RPipeFactory includes methods like:

  • create_parsing_pipe()
  • create_embedding_pipe()
  • create_vector_search_pipe()
  • create_rag_pipe()

Customizing Factories

You can customize the behavior of R2R by extending these factory classes. Here’s a simplified example of extending the R2RPipeFactory:

1from r2r import R2RPipeFactory
2from r2r.pipes import MultiSearchPipe, QueryTransformPipe
3
4class R2RPipeFactoryWithMultiSearch(R2RPipeFactory):
5 def create_vector_search_pipe(self, *args, **kwargs):
6 # Create a custom multi-search pipe
7 query_transform_pipe = QueryTransformPipe(
8 llm_provider=self.providers.llm,
9 config=QueryTransformPipe.QueryTransformConfig(
10 name="multi_search",
11 task_prompt="multi_search_task_prompt",
12 ),
13 )
14
15 inner_search_pipe = super().create_vector_search_pipe(*args, **kwargs)
16
17 return MultiSearchPipe(
18 query_transform_pipe=query_transform_pipe,
19 inner_search_pipe=inner_search_pipe,
20 config=MultiSearchPipe.PipeConfig(),
21 )

Builder + Factory in action

The R2RBuilder provides methods to override various components of the R2R system:

Provider Overrides

1from r2r.providers import CustomAuthProvider, CustomDatabaseProvider
2
3builder = R2RBuilder()
4builder.with_auth_provider(CustomAuthProvider())
5builder.with_database_provider(CustomDatabaseProvider())
6r2r = builder.build()

Available provider override methods:

  • with_auth_provider
  • with_database_provider
  • with_embedding_provider
  • with_eval_provider
  • with_llm_provider
  • with_crypto_provider

Pipe Overrides

1from r2r.pipes import CustomParsingPipe, CustomEmbeddingPipe
2
3builder = R2RBuilder()
4builder.with_parsing_pipe(CustomParsingPipe())
5builder.with_embedding_pipe(CustomEmbeddingPipe())
6r2r = builder.build()

Available pipe override methods:

  • with_parsing_pipe
  • with_embedding_pipe
  • with_vector_storage_pipe
  • with_vector_search_pipe
  • with_rag_pipe
  • with_streaming_rag_pipe
  • with_eval_pipe
  • with_kg_pipe
  • with_kg_storage_pipe
  • with_kg_search_pipe

Pipeline Overrides

1from r2r.pipelines import CustomIngestionPipeline, CustomSearchPipeline
2
3builder = R2RBuilder()
4builder.with_ingestion_pipeline(CustomIngestionPipeline())
5builder.with_search_pipeline(CustomSearchPipeline())
6r2r = builder.build()

Available pipeline override methods:

  • with_ingestion_pipeline
  • with_search_pipeline
  • with_rag_pipeline
  • with_streaming_rag_pipeline
  • with_eval_pipeline

Factory Overrides

1from r2r.factory import CustomProviderFactory, CustomPipeFactory, CustomPipelineFactory
2
3builder = R2RBuilder()
4builder.with_provider_factory(CustomProviderFactory)
5builder.with_pipe_factory(CustomPipeFactory)
6builder.with_pipeline_factory(CustomPipelineFactory)
7r2r = builder.build()

Advanced Usage

For more complex scenarios, you can chain multiple customizations:

1from r2r import R2RBuilder, R2RConfig
2from r2r.pipes import CustomRAGPipe
3
4class MyCustomAuthProvider:
5 ...
6
7class MyCustomLLMProvider
8 ...
9
10class MyCustomRAGPipe
11
12
13config = R2RConfig.from_toml("path/to/config.toml")
14
15r2r = (
16 R2RBuilder(config=config)
17 .with_auth_provider(MyCustomAuthProvider())
18 .with_llm_provider(MyCustomLLMProvider())
19 .with_rag_pipe(MyCustomRAGPipe())
20 .build()
21)

This approach allows you to create highly customized R2R instances tailored to your specific needs.

Best Practices

  1. Configuration Management: Use separate configuration files for different environments (development, staging, production).
  2. Custom Components: When creating custom providers, pipes, or pipelines, ensure they adhere to the respective interfaces defined in the R2R framework.
  3. Testability: Create factory methods or builder configurations specifically for testing to easily mock or stub components.
  4. Logging: Enable appropriate logging in your custom components to aid in debugging and monitoring.
  5. Error Handling: Implement proper error handling in custom components and provide meaningful error messages.

By leveraging the R2RBuilder and customizing factories, you can create flexible, customized, and powerful R2R applications tailored to your specific use cases and requirements.