Occasionally this SDK documentation falls out of date; cross-check it with the automatically generated API Reference documentation for the latest parameters.

Perform a basic vector search:

from r2r import R2RClient

client = R2RClient()

search_response = client.search("What was Uber's profit in 2020?")

query: str (required)
The search query.

vector_search_settings: Optional[Union[VectorSearchSettings, dict]] (default: None)
Optional settings for vector search; a dictionary, a VectorSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for any unspecified fields.

kg_search_settings: Optional[Union[KGSearchSettings, dict]] (default: None)
Optional settings for knowledge graph search; a dictionary, a KGSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for any unspecified fields.
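
Settings may be passed either as a plain dictionary or as a typed settings object. A minimal sketch of the two equivalent forms, assuming VectorSearchSettings is exported by the r2r package as the type hints above suggest:

from r2r import R2RClient, VectorSearchSettings

client = R2RClient()

# the two calls below are equivalent; unspecified fields
# fall back to server-side defaults in both cases
search_response = client.search(
    "What was Uber's profit in 2020?",
    vector_search_settings=VectorSearchSettings(search_limit=25),
)
search_response = client.search(
    "What was Uber's profit in 2020?",
    vector_search_settings={"search_limit": 25},
)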

Search custom settings

Learn more about the search API here. It allows searching with custom settings, such as bespoke document filters and larger search limits:

# return only chunks from the document titled `uber_2021.pdf`
filtered_search_response = client.search(
    "What was Uber's profit in 2020?",
    vector_search_settings={
        # restrict results to the Uber document
        "filters": {"title": {"$eq": "uber_2021.pdf"}},
        "search_limit": 100
    }
)
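
Filters can target other document fields in the same way; for example, restricting results to a single document by id (the id below is hypothetical; substitute one from your own ingested documents):

filtered_by_id_response = client.search(
    "What was Uber's profit in 2020?",
    vector_search_settings={
        # hypothetical document id, shown for illustration only
        "filters": {"document_id": {"$eq": "3e157b3a-8469-51db-90d9-52e7d896b49b"}}
    }
)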

Learn more about the hybrid search capabilities in R2R here. Combine traditional keyword-based search with vector search:

hybrid_search_response = client.search(
    "What was Uber's profit in 2020?",
    vector_search_settings={
        "use_hybrid_search": True,
        "search_limit": 20,
        "hybrid_search_settings": {
            # relative weights of the full-text and semantic rankings
            "full_text_weight": 1.0,
            "semantic_weight": 10.0,
            # number of full-text candidates fetched before fusion
            "full_text_limit": 200,
            # constant used by reciprocal rank fusion when merging rankings
            "rrf_k": 25,
        },
    }
)
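
The weights and rrf_k control how the full-text and semantic rankings are merged. As a rough illustration of reciprocal rank fusion (a sketch of the general technique, not R2R's exact server-side implementation):

def rrf_score(full_text_rank, semantic_rank,
              full_text_weight=1.0, semantic_weight=10.0, rrf_k=25):
    # each ranking contributes weight / (rrf_k + rank);
    # better (lower) ranks yield larger contributions, and a larger
    # rrf_k flattens the difference between adjacent ranks
    return (full_text_weight / (rrf_k + full_text_rank)
            + semantic_weight / (rrf_k + semantic_rank))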

Learn more about the dedicated knowledge graph capabilities in R2R here. They can be used to enhance search results, as shown below:

kg_search_response = client.search(
    "What is Airbnb?",
    vector_search_settings={"use_vector_search": False},
    kg_search_settings={
        "use_kg_search": True,
        # 'local' searches entities, relationships, and communities directly
        "kg_search_type": "local",
        # community level in the graph hierarchy to search at
        "kg_search_level": "0",
        # LLM settings used when generating knowledge graph answers
        "generation_config": {
            "model": "openai/gpt-4o-mini",
            "temperature": 0.7,
        },
        # per-type caps on the number of local search results
        "local_search_limits": {
            "__Entity__": 20,
            "__Relationship__": 20,
            "__Community__": 20,
        },
        "max_community_description_length": 65536,
        "max_llm_queries_for_global_search": 250
    }
)

Retrieval-Augmented Generation (RAG)

Basic RAG

Generate a response using RAG:

rag_response = client.rag("What was Uber's profit in 2020?")

query: str (required)
The query for RAG.

vector_search_settings: Optional[Union[VectorSearchSettings, dict]] (default: None)
Optional settings for vector search; a dictionary, a VectorSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for any unspecified fields.

kg_search_settings: Optional[Union[KGSearchSettings, dict]] (default: None)
Optional settings for knowledge graph search; a dictionary, a KGSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for any unspecified fields.

rag_generation_config: Optional[Union[GenerationConfig, dict]] (default: None)
Optional configuration for the LLM used during RAG generation, including model selection and parameters. Defaults to the values specified in r2r.toml.

task_prompt_override: Optional[str] (default: None)
Optional custom prompt to override the default task prompt.

include_title_if_available: Optional[bool] (default: True)
Whether to augment document chunks with their respective document titles.
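
For example, title augmentation can be disabled for a single call:

rag_response = client.rag(
    "What was Uber's profit in 2020?",
    include_title_if_available=False
)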

RAG with custom search settings

Learn more about the RAG API here. It allows performing RAG with custom settings, such as hybrid search:

hybrid_rag_response = client.rag(
    "Who is Jon Snow?",
    vector_search_settings={"use_hybrid_search": True}
)

RAG with custom completion LLM

R2R supports configuration both server-side and at runtime, which you can read about here. The example below selects an Anthropic model at runtime:

anthropic_rag_response = client.rag(
    "What is R2R?",
    rag_generation_config={"model":"anthropic/claude-3-opus-20240229"}
)

Streaming RAG

R2R supports streaming RAG responses for real-time applications:

stream_response = client.rag(
    "Who was Aristotle?",
    rag_generation_config={"stream": True}
)
for chunk in stream_response:
    print(chunk, end='', flush=True)

Advanced RAG Techniques

R2R supports advanced Retrieval-Augmented Generation (RAG) techniques that can be easily configured at runtime. These techniques include Hypothetical Document Embeddings (HyDE) and RAG-Fusion, which can significantly enhance the quality and relevance of retrieved information.

To use an advanced RAG technique, you can specify the search_strategy parameter in your vector search settings:

from r2r import R2RClient

client = R2RClient()

# Using HyDE
hyde_response = client.rag(
    "What are the main themes in Shakespeare's plays?",
    vector_search_settings={
        "search_strategy": "hyde",
        "search_limit": 10
    }
)

# Using RAG-Fusion
rag_fusion_response = client.rag(
    "Explain the theory of relativity",
    vector_search_settings={
        "search_strategy": "rag_fusion",
        "search_limit": 20
    }
)
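
Conceptually, HyDE asks an LLM to draft a hypothetical answer document and retrieves against that document's embedding, while RAG-Fusion issues several reformulations of the query and fuses their result lists. A rough sketch of the HyDE idea, using hypothetical llm, embed, and vector_index helpers (an illustration, not R2R's internal implementation):

def hyde_retrieve(query, llm, embed, vector_index, limit=10):
    # 1. ask the LLM to write a passage that would plausibly answer the query
    hypothetical_doc = llm(f"Write a short passage answering: {query}")
    # 2. embed the hypothetical passage rather than the raw query
    doc_embedding = embed(hypothetical_doc)
    # 3. retrieve real chunks whose embeddings lie close to it
    return vector_index.search(doc_embedding, limit=limit)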

For a comprehensive guide on implementing and optimizing advanced RAG techniques in R2R, including HyDE and RAG-Fusion, please refer to our Advanced RAG Cookbook.

Customizing RAG

Putting everything together for highly customized RAG functionality at runtime:


custom_rag_response = client.rag(
    "Who was Aristotle?",
    vector_search_settings={
        "use_hybrid_search": True,
        "search_limit": 20,
        "hybrid_search_settings": {
            "full_text_weight": 1.0,
            "semantic_weight": 10.0,
            "full_text_limit": 200,
            "rrf_k": 25,
        },
    },
    kg_search_settings={
        "use_kg_search": True,
        "kg_search_type": "local",
    },
    rag_generation_config={
        "model": "anthropic/claude-3-haiku-20240307",
        "temperature": 0.7,
        "stream": True
    },
    task_prompt_override="Only answer the question if the context is SUPER relevant!!\n\nQuery:\n{query}\n\nContext:\n{context}"
)
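
Because stream is set to True here, the returned response is an iterator and can be consumed just as in the streaming example above:

for chunk in custom_rag_response:
    print(chunk, end='', flush=True)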

Agents

Multi-turn agentic RAG

The R2R application includes agents that come equipped with a search tool, enabling them to perform RAG. Using the R2R agent for multi-turn conversations:

messages = [
    {"role": "user", "content": "What was Aristotle's main contribution to philosophy?"},
    {"role": "assistant", "content": "Aristotle made numerous significant contributions to philosophy, but one of his main contributions was in the field of logic and reasoning. He developed a system of formal logic, which is considered the first comprehensive system of its kind in Western philosophy. This system, often referred to as Aristotelian logic or term logic, provided a framework for deductive reasoning and laid the groundwork for scientific thinking."},
    {"role": "user", "content": "Can you elaborate on how this influenced later thinkers?"}
]

rag_agent_response = client.agent(
    messages,
    vector_search_settings={"use_hybrid_search": True},
)

Note that any of the customizations shown in the AI-powered search and RAG documentation above can also be applied here.

messages: list[Messages] (required)
The list of messages to pass to the RAG agent.

vector_search_settings: Optional[Union[VectorSearchSettings, dict]] (default: None)
Optional settings for vector search; a dictionary, a VectorSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for any unspecified fields.

kg_search_settings: Optional[Union[KGSearchSettings, dict]] (default: None)
Optional settings for knowledge graph search; a dictionary, a KGSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for any unspecified fields.

rag_generation_config: Optional[Union[GenerationConfig, dict]] (default: None)
Optional configuration for the LLM used during RAG generation, including model selection and parameters. Defaults to the values specified in r2r.toml.

task_prompt_override: Optional[str] (default: None)
Optional custom prompt to override the default task prompt.
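
As noted above, the same retrieval customizations apply to the agent; for instance, knowledge graph search can be enabled per call:

rag_agent_response = client.agent(
    messages,
    vector_search_settings={"use_hybrid_search": True},
    kg_search_settings={"use_kg_search": True, "kg_search_type": "local"},
)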

Multi-turn agentic RAG with streaming

The response from the RAG agent may be streamed directly back:

messages = [
    {"role": "user", "content": "What was Aristotle's main contribution to philosophy?"},
    {"role": "assistant", "content": "Aristotle made numerous significant contributions to philosophy, but one of his main contributions was in the field of logic and reasoning. He developed a system of formal logic, which is considered the first comprehensive system of its kind in Western philosophy. This system, often referred to as Aristotelian logic or term logic, provided a framework for deductive reasoning and laid the groundwork for scientific thinking."},
    {"role": "user", "content": "Can you elaborate on how this influenced later thinkers?"}
]

rag_agent_response = client.agent(
    messages,
    vector_search_settings={"use_hybrid_search": True},
    rag_generation_config={"stream": True}
)
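
The streamed output can then be consumed incrementally, as with streaming RAG:

for chunk in rag_agent_response:
    print(chunk, end='', flush=True)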