R2R uses factories together with a builder pattern to create an R2RCore instance. This approach lets you customize or override any of the default components.

Basic Customization Example

This example overrides the default vector search pipe with a custom web search pipe:

# r2r/examples/scripts/run_web_search.py
from r2r import (
    GenerationConfig,
    R2RBuilder,
    WebSearchPipe,
    SerperClient,
)

if __name__ == "__main__":
    # Initialize a web search pipe that uses Serper to scrape Google
    web_search_pipe = WebSearchPipe(serper_client=SerperClient())

    # Build the R2R application with the custom search pipe
    app = R2RBuilder().with_search_pipe(web_search_pipe).build()

    # Run the RAG pipeline through the R2R application
    result = app.rag(
        "Who was Aristotle?",
        rag_generation_config=GenerationConfig(model="gpt-4o"),
    )

    # Print the final result
    print(result)

To run this example:

export SERPER_API_KEY=...
python -m r2r.examples.scripts.run_web_search --query="Who is Aristotle?"

Advanced Customization Example

Multi-Web Search with Factory Overrides

This example uses a factory override to customize the pipeline further by including a query transformation pipe and a multi-search pipe:

# r2r/examples/scripts/run_web_multi_search.py
from r2r import (
    GenerationConfig,
    R2RBuilder,
    R2RPipeFactoryWithMultiSearch,
    WebSearchPipe,
    SerperClient,
)

if __name__ == "__main__":
    # Initialize a web search pipe
    web_search_pipe = WebSearchPipe(serper_client=SerperClient())

    # Define a new synthetic query generation template
    synthetic_query_generation_template = {
        "template": """
            ### Instruction:
            Given the following query, write a double newline separated list of up to {num_outputs} advanced queries meant to help answer the original query.
            DO NOT generate any single query which is likely to require information from multiple distinct documents.
            EACH single query will be used to carry out a cosine similarity semantic search over distinct indexed documents.
            FOR EXAMPLE, if asked `how do the key themes of Great Gatsby compare with 1984`, the two queries would be
            `What are the key themes of Great Gatsby?` and `What are the key themes of 1984?`.
            Here is the original user query to be transformed into answers:

            ### Query:
            {message}

            ### Response:
            """,
        "input_types": {
            "num_outputs": "int",
            "message": "str"
        },
    }

    # Build the R2R application with the custom pipeline
    app = (
        R2RBuilder()
        .with_pipe_factory(R2RPipeFactoryWithMultiSearch)
        .build(
            # override inputs consumed by `R2RPipeFactoryWithMultiSearch.create_vector_search_pipe`
            multi_inner_search_pipe_override=web_search_pipe,
            query_generation_template_override=synthetic_query_generation_template,
        )
    )

    # Run the RAG pipeline through the R2R application
    result = app.rag(
        "Who was Aristotle?",
        rag_generation_config=GenerationConfig(model="gpt-4o"),
    )

    print(f"Final Result:\n\n{result}")

To run this example:

export SERPER_API_KEY=...
python -m r2r.examples.scripts.run_web_multi_search --query="Who is Aristotle?"
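
To make the two placeholders in the query generation template concrete, the sketch below fills a template of the same shape with Python's built-in `str.format` and splits a response on double newlines, as the instruction text requests. It does not use R2R's prompt provider: `render_prompt`, `parse_queries`, the shortened template, and the sample response are all hypothetical and only illustrate the format.

```python
# Illustrative only: R2R's prompt provider is not used here.
# `render_prompt` and `parse_queries` are hypothetical helpers.

template = {
    "template": (
        "### Instruction:\n"
        "Write a double newline separated list of up to {num_outputs} queries.\n\n"
        "### Query:\n"
        "{message}\n\n"
        "### Response:\n"
    ),
    "input_types": {"num_outputs": "int", "message": "str"},
}


def render_prompt(prompt: dict, **inputs) -> str:
    """Fill the placeholders after checking that every declared input is supplied."""
    missing = set(prompt["input_types"]) - set(inputs)
    if missing:
        raise ValueError(f"missing inputs: {sorted(missing)}")
    return prompt["template"].format(**inputs)


def parse_queries(response: str, num_outputs: int) -> list[str]:
    """Split a response on blank lines and keep at most `num_outputs` queries."""
    queries = [chunk.strip() for chunk in response.split("\n\n") if chunk.strip()]
    return queries[:num_outputs]


prompt_text = render_prompt(template, num_outputs=2, message="Who was Aristotle?")

# A made-up response in the requested format, standing in for an LLM call.
sample_response = (
    "Where was Aristotle born?\n\n"
    "What did Aristotle write about?\n\n"
    "Who taught Aristotle?"
)
sub_queries = parse_queries(sample_response, num_outputs=2)
print(sub_queries)
```

Each resulting sub-query would then be handed to the inner search pipe, which in the example above is the web search pipe.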

R2R App Builder

The R2RBuilder class uses factories and developer-specified overrides to create and customize an R2RCore instance.
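
The general mechanism can be sketched with a toy builder. This is not R2R's implementation: `ToyBuilder`, `ToyPipeFactory`, and `ToyCore` are hypothetical stand-ins that only mirror the method names used in the examples above (`with_search_pipe`, `with_pipe_factory`, `build`). The sketch assumes that a directly supplied component takes precedence over the factory default and that extra keyword arguments to `build` are forwarded to the factory.

```python
# A toy builder, not R2R's implementation. All class names are hypothetical.

class ToyPipeFactory:
    """Creates default components; subclasses can override individual methods."""

    def create_search_pipe(self, **kwargs):
        return "default-vector-search"

    def create_pipes(self, search_pipe_override=None, **kwargs):
        # A directly supplied component wins over the factory default.
        return {"search": search_pipe_override or self.create_search_pipe(**kwargs)}


class ToyCore:
    def __init__(self, pipes):
        self.pipes = pipes


class ToyBuilder:
    def __init__(self):
        self.pipe_factory_cls = ToyPipeFactory
        self.search_pipe_override = None

    def with_pipe_factory(self, factory_cls):
        self.pipe_factory_cls = factory_cls
        return self  # returning self enables method chaining

    def with_search_pipe(self, pipe):
        self.search_pipe_override = pipe
        return self

    def build(self, **kwargs):
        # Extra keyword arguments are forwarded to the factory methods.
        factory = self.pipe_factory_cls()
        pipes = factory.create_pipes(
            search_pipe_override=self.search_pipe_override, **kwargs
        )
        return ToyCore(pipes)


default_app = ToyBuilder().build()
custom_app = ToyBuilder().with_search_pipe("web-search").build()
print(default_app.pipes["search"], custom_app.pipes["search"])
```

Returning `self` from each `with_*` method is what allows the chained calls seen in the examples.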

R2R Factories

R2R includes a set of factories that create and customize the components required to build and run an R2RCore.

R2RProviderFactory

The R2RProviderFactory is responsible for creating various providers that the R2RCore relies on.

Key Methods

  • create_vector_db_provider
  • create_embedding_provider
  • create_eval_provider
  • create_llm_provider
  • create_prompt_provider
  • create_providers
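
As a rough illustration of how an aggregate method such as create_providers can relate to the individual create_* methods, here is a self-contained sketch. The classes, the config keys, and the string "providers" are hypothetical and do not reflect R2R's actual signatures; the point is only that overriding one method in a subclass changes one provider and leaves the others untouched.

```python
from dataclasses import dataclass


@dataclass
class ToyProviders:
    """Hypothetical container for the providers a core object would rely on."""
    vector_db: str
    embedding: str
    llm: str


class ToyProviderFactory:
    def __init__(self, config: dict):
        self.config = config

    def create_vector_db_provider(self) -> str:
        return f"vector_db:{self.config['vector_db']}"

    def create_embedding_provider(self) -> str:
        return f"embedding:{self.config['embedding']}"

    def create_llm_provider(self) -> str:
        return f"llm:{self.config['llm']}"

    def create_providers(self) -> ToyProviders:
        # The aggregate method calls each create_* method in turn.
        return ToyProviders(
            vector_db=self.create_vector_db_provider(),
            embedding=self.create_embedding_provider(),
            llm=self.create_llm_provider(),
        )


class LocalLLMProviderFactory(ToyProviderFactory):
    def create_llm_provider(self) -> str:
        # Only the LLM provider changes; the other providers keep their defaults.
        return "llm:local"


# Made-up configuration values for the sketch.
config = {"vector_db": "pgvector", "embedding": "openai", "llm": "openai"}
default_providers = ToyProviderFactory(config).create_providers()
local_providers = LocalLLMProviderFactory(config).create_providers()
print(default_providers.llm, local_providers.llm)
```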

R2RPipeFactory

The R2RPipeFactory is responsible for creating the pipes used in the R2RCore.

Key Methods

  • create_pipes
  • create_parsing_pipe
  • create_embedding_pipe
  • create_vector_storage_pipe
  • create_vector_search_pipe
  • create_rag_pipe
  • create_eval_pipe

R2RPipelineFactory

The R2RPipelineFactory is responsible for creating the various pipelines used in the R2RCore.

Key Methods

  • create_ingestion_pipeline
  • create_search_pipeline
  • create_rag_pipeline
  • create_eval_pipeline
  • create_pipelines
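
One way to picture the relationship between pipes and pipelines is a pipeline that runs its pipes in order, with a factory that decides which pipes go into which pipeline. The sketch below assumes exactly that and nothing more: `ToyPipeline`, `ToyPipelineFactory`, and the lambda "pipes" are hypothetical, and R2R's real pipelines may be composed differently.

```python
from typing import Callable, Iterable

# In this sketch a pipe is any callable that takes one value and returns one.
Pipe = Callable[[object], object]


class ToyPipeline:
    def __init__(self, pipes: Iterable[Pipe]):
        self.pipes = list(pipes)

    def run(self, value):
        # Feed the output of each pipe into the next one.
        for pipe in self.pipes:
            value = pipe(value)
        return value


class ToyPipelineFactory:
    def __init__(self, pipes: dict):
        self.pipes = pipes

    def create_ingestion_pipeline(self) -> ToyPipeline:
        return ToyPipeline(
            [self.pipes["parsing"], self.pipes["embedding"], self.pipes["storage"]]
        )

    def create_search_pipeline(self) -> ToyPipeline:
        return ToyPipeline([self.pipes["search"]])

    def create_pipelines(self) -> dict:
        return {
            "ingestion": self.create_ingestion_pipeline(),
            "search": self.create_search_pipeline(),
        }


# Stand-in pipes: split text, "embed" each word as its length, count what was stored.
toy_pipes = {
    "parsing": lambda text: text.split(),
    "embedding": lambda words: [len(word) for word in words],
    "storage": lambda vectors: {"stored": len(vectors)},
    "search": lambda query: [f"hit for {query}"],
}

pipelines = ToyPipelineFactory(toy_pipes).create_pipelines()
print(pipelines["ingestion"].run("who was aristotle"))
print(pipelines["search"].run("aristotle"))
```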

Custom Pipe Factory Example

Here’s an example of how to override the create_vector_search_pipe method of the R2RPipeFactory to use a MultiSearchPipe:

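# Assumption: these names are exported from the top-level `r2r` package, like the
# classes imported in the examples above; adjust the import path for your version.
from r2r import MultiSearchPipe, QueryTransformPipe, R2RPipeFactory


class R2RPipeFactoryWithMultiSearch(R2RPipeFactory):
    def create_vector_search_pipe(self, *args, **kwargs):
        multi_search_config = MultiSearchPipe.PipeConfig()
        # Use the caller's prompt name if given, otherwise derive one from the pipe name
        task_prompt_name = kwargs.get("task_prompt_name") or f"{multi_search_config.name}_task_prompt"

        # Pipe that expands the user query into several sub-queries
        query_transform_pipe = kwargs.get("multi_query_transform_pipe_override", None) or QueryTransformPipe(
            llm_provider=self.providers.llm,
            prompt_provider=self.providers.prompt,
            config=QueryTransformPipe.QueryTransformConfig(
                name=multi_search_config.name,
                task_prompt=task_prompt_name,
            ),
        )

        # Register the query generation prompt unless the caller named an existing one;
        # QUERY_GENERATION_TEMPLATE is an attribute of the factory not shown in this excerpt
        if kwargs.get("task_prompt_name") is None:
            self.providers.prompt.add_prompt(
                name=task_prompt_name,
                **(kwargs.get("query_generation_template_override") or self.QUERY_GENERATION_TEMPLATE),
            )

        # Search pipe run for each generated query: the override if given, else the default vector search
        inner_search_pipe = kwargs.get("multi_inner_search_pipe_override", None) or super().create_vector_search_pipe(*args, **kwargs)
        inner_search_pipe.config.name = multi_search_config.name

        return MultiSearchPipe(
            query_transform_pipe=query_transform_pipe,
            inner_search_pipe=inner_search_pipe,
            config=multi_search_config,
        )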

To use this custom factory:

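# Build the R2R application with the custom factory.
# `web_search_pipe` and `synthetic_query_generation_template` are defined
# as in the advanced customization example above.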
app = (
    R2RBuilder()
    .with_pipe_factory(R2RPipeFactoryWithMultiSearch)
    .build(
        # override inputs consumed in building the MultiSearchPipe
        multi_inner_search_pipe_override=web_search_pipe,
        query_generation_template_override=synthetic_query_generation_template,
    )
)
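
Conceptually, a multi-search step transforms one query into several sub-queries and runs an inner search for each. The following self-contained sketch shows that flow with plain functions. It is not how MultiSearchPipe is implemented: `multi_search`, `split_on_and`, `keyword_search`, and the tiny corpus are hypothetical, and the string split stands in for the LLM-based query transformation.

```python
from typing import Callable


def multi_search(
    query: str,
    transform: Callable[[str], list[str]],
    inner_search: Callable[[str], list[str]],
) -> dict[str, list[str]]:
    """Run the inner search once per transformed sub-query."""
    return {sub_query: inner_search(sub_query) for sub_query in transform(query)}


def split_on_and(query: str) -> list[str]:
    # Stand-in for the query transformation step (an LLM call in practice).
    return [part.strip() for part in query.split(" and ") if part.strip()]


# A made-up corpus mapping document titles to themes.
corpus = {
    "Great Gatsby": ["wealth", "the American Dream"],
    "1984": ["surveillance", "totalitarianism"],
}


def keyword_search(sub_query: str) -> list[str]:
    # Stand-in for the inner search pipe: match titles mentioned in the sub-query.
    return [
        theme
        for title, themes in corpus.items()
        if title in sub_query
        for theme in themes
    ]


results = multi_search(
    "themes of Great Gatsby and themes of 1984", split_on_and, keyword_search
)
for sub_query, hits in results.items():
    print(sub_query, "->", hits)
```

Because the inner search is passed in as a parameter, swapping vector search for web search does not change the surrounding logic, which is the same idea behind `multi_inner_search_pipe_override`.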

Together, factories and the builder pattern make R2R pipelines straightforward to customize and extend: you can replace any default component with your own implementation to fit specific needs.