Search & RAG
Search and Retrieval-Augmented Generation with R2R.
Occasionally this SDK documentation falls out of date; cross-check with the automatically generated API Reference documentation for the latest parameters.
AI Powered Search
Search
Perform a basic vector search:
search_response = client.search("What was Uber's profit in 2020?")
query: The search query.
vector_search_settings: Optional settings for vector search; a dictionary, a VectorSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for non-specified fields.
kg_search_settings: Optional settings for knowledge graph search; a dictionary, a KGSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for non-specified fields.
Search custom settings
Learn more about the search API here. It allows searching with custom settings, such as bespoke document filters and larger search limits:
# returns only chunks from documents whose title matches the filter
filtered_search_response = client.search(
"What was Uber's profit in 2020?",
vector_search_settings={
# restrict results to the Uber document
"filters": {"title": {"$eq": "uber_2021.pdf"}},
"search_limit": 100
}
)
Hybrid Search
Learn more about hybrid search in R2R here. It combines traditional keyword-based search with vector search:
hybrid_search_response = client.search(
"What was Uber's profit in 2020?",
vector_search_settings={
"use_hybrid_search": True,
"search_limit": 20,
"hybrid_search_settings": {
"full_text_weight": 1.0,
"semantic_weight": 10.0,
"full_text_limit": 200,
"rrf_k": 25,
},
}
)
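The hybrid settings above fuse the full-text and semantic rankings with reciprocal rank fusion (RRF). The exact server-side formula is an assumption, but a minimal sketch illustrates the role of the weights and rrf_k:

```python
def rrf_score(rank_full_text, rank_semantic,
              full_text_weight=1.0, semantic_weight=10.0, rrf_k=25):
    """Reciprocal rank fusion: each ranking contributes 1/(k + rank),
    scaled by its weight. Better (lower) ranks yield higher scores."""
    return (full_text_weight / (rrf_k + rank_full_text)
            + semantic_weight / (rrf_k + rank_semantic))

# With semantic_weight=10.0, a chunk ranked 1st semantically but only
# 50th by keywords still scores well:
score = rrf_score(rank_full_text=50, rank_semantic=1)
```

A larger rrf_k flattens the contribution of rank differences, while the two weights control how much each retrieval method dominates the fused ordering.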
Knowledge Graph Search
Learn more about the dedicated knowledge graph capabilities in R2R here. They can be used to enhance search results, as shown below:
kg_search_response = client.search(
"What is Airbnb?",
vector_search_settings={"use_vector_search": False},
kg_search_settings={
"use_kg_search": True,
"kg_search_type": "local",
"kg_search_level": "0",
"generation_config": {
"model": "openai/gpt-4o-mini",
"temperature": 0.7,
},
"local_search_limits": {
"__Entity__": 20,
"__Relationship__": 20,
"__Community__": 20,
},
"max_community_description_length": 65536,
"max_llm_queries_for_global_search": 250
}
)
Retrieval-Augmented Generation (RAG)
Basic RAG
Generate a response using RAG:
rag_response = client.rag("What was Uber's profit in 2020?")
query: The query for RAG.
vector_search_settings: Optional settings for vector search; a dictionary, a VectorSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for non-specified fields.
kg_search_settings: Optional settings for knowledge graph search; a dictionary, a KGSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for non-specified fields.
rag_generation_config: Optional configuration for the LLM used during RAG generation, including model selection and parameters. Defaults to the values specified in r2r.toml.
task_prompt_override: Optional custom prompt to override the default task prompt.
include_title_if_available: Whether to augment document chunks with their respective document titles.
RAG with custom search settings
Learn more about the RAG API here. It allows performing RAG with custom settings, such as hybrid search:
hybrid_rag_response = client.rag(
"Who is Jon Snow?",
vector_search_settings={"use_hybrid_search": True}
)
RAG with custom completion LLM
R2R supports configuration both server-side and at runtime, which you can read about here. The example below selects an Anthropic model at runtime:
anthropic_rag_response = client.rag(
"What is R2R?",
rag_generation_config={"model":"anthropic/claude-3-opus-20240229"}
)
Streaming RAG
R2R supports streaming RAG responses for real-time applications:
stream_response = client.rag(
"Who was Aristotle?",
rag_generation_config={"stream": True}
)
for chunk in stream_response:
print(chunk, end='', flush=True)
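The streamed response is an iterator of text chunks, so you can accumulate the full answer while printing incrementally. A sketch of the bookkeeping, using a stand-in generator in place of a live client:

```python
def fake_stream():
    """Stand-in for client.rag(..., rag_generation_config={"stream": True})."""
    yield from ["Aristotle was ", "a Greek philosopher."]

chunks = []
for chunk in fake_stream():
    print(chunk, end='', flush=True)  # display each chunk as it arrives
    chunks.append(chunk)              # keep chunks for later use

full_answer = "".join(chunks)
```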
Advanced RAG Techniques
R2R supports advanced Retrieval-Augmented Generation (RAG) techniques that can be easily configured at runtime. These techniques include Hypothetical Document Embeddings (HyDE) and RAG-Fusion, which can significantly enhance the quality and relevance of retrieved information.
To use an advanced RAG technique, you can specify the search_strategy
parameter in your vector search settings:
from r2r import R2RClient
client = R2RClient()
# Using HyDE
hyde_response = client.rag(
"What are the main themes in Shakespeare's plays?",
vector_search_settings={
"search_strategy": "hyde",
"search_limit": 10
}
)
# Using RAG-Fusion
rag_fusion_response = client.rag(
"Explain the theory of relativity",
vector_search_settings={
"search_strategy": "rag_fusion",
"search_limit": 20
}
)
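Conceptually, HyDE first asks an LLM to write a hypothetical answer document, then embeds that document, rather than the raw query, for retrieval. R2R performs this server-side when search_strategy is "hyde"; the toy sketch below only illustrates the idea, with stub functions standing in for the LLM and embedder:

```python
def generate_hypothetical_document(query: str) -> str:
    # Stub: a real implementation would call an LLM here.
    return f"A detailed passage answering: {query}"

def embed(text: str) -> list[float]:
    # Stub embedding: character-frequency vector, for illustration only.
    return [text.count(c) / max(len(text), 1) for c in "etaoin"]

query = "What are the main themes in Shakespeare's plays?"
hypothetical_doc = generate_hypothetical_document(query)
query_vector = embed(hypothetical_doc)  # the index is searched with this vector
```

Because the hypothetical document resembles real answer passages more than the bare question does, its embedding tends to land closer to relevant chunks in vector space.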
For a comprehensive guide on implementing and optimizing advanced RAG techniques in R2R, including HyDE and RAG-Fusion, please refer to our Advanced RAG Cookbook.
Customizing RAG
Putting everything together for highly customized RAG functionality at runtime:
custom_rag_response = client.rag(
"Who was Aristotle?",
vector_search_settings={
"use_hybrid_search": True,
"search_limit": 20,
"hybrid_search_settings": {
"full_text_weight": 1.0,
"semantic_weight": 10.0,
"full_text_limit": 200,
"rrf_k": 25,
},
},
kg_search_settings={
"use_kg_search": True,
"kg_search_type": "local",
},
rag_generation_config={
"model": "anthropic/claude-3-haiku-20240307",
"temperature": 0.7,
"stream": True
},
task_prompt_override="Only answer the question if the context is SUPER relevant!!\n\nQuery:\n{query}\n\nContext:\n{context}"
)
Agents
Multi-turn agentic RAG
The R2R application includes agents which come equipped with a search tool, enabling them to perform RAG. Using the R2R Agent for multi-turn conversations:
messages = [
{"role": "user", "content": "What was Aristotle's main contribution to philosophy?"},
{"role": "assistant", "content": "Aristotle made numerous significant contributions to philosophy, but one of his main contributions was in the field of logic and reasoning. He developed a system of formal logic, which is considered the first comprehensive system of its kind in Western philosophy. This system, often referred to as Aristotelian logic or term logic, provided a framework for deductive reasoning and laid the groundwork for scientific thinking."},
{"role": "user", "content": "Can you elaborate on how this influenced later thinkers?"}
]
rag_agent_response = client.agent(
messages,
vector_search_settings={"use_hybrid_search":True},
)
Note that any of the customization shown in the AI-powered search and RAG documentation above can be applied here.
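To continue a multi-turn conversation, append the agent's reply to the message list before the next call. A sketch of the bookkeeping (agent_reply stands in for the text extracted from the response object; the exact response shape depends on your client version):

```python
messages = [
    {"role": "user", "content": "What was Aristotle's main contribution to philosophy?"},
]

# After each client.agent(...) call, append the assistant's reply so the
# next turn carries the full conversation history.
agent_reply = "Aristotle developed the first formal system of logic."
messages.append({"role": "assistant", "content": agent_reply})
messages.append({"role": "user", "content": "How did this influence later thinkers?"})
```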
messages: The list of messages to pass to the RAG agent.
vector_search_settings: Optional settings for vector search; a dictionary, a VectorSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for non-specified fields.
kg_search_settings: Optional settings for knowledge graph search; a dictionary, a KGSearchSettings object, or None may be passed. If a dictionary or None is passed, R2R uses server-side defaults for non-specified fields.
rag_generation_config: Optional configuration for the LLM used during RAG generation, including model selection and parameters. Defaults to the values specified in r2r.toml.
task_prompt_override: Optional custom prompt to override the default task prompt.
Multi-turn agentic RAG with streaming
The response from the RAG agent may be streamed directly back:
messages = [
{"role": "user", "content": "What was Aristotle's main contribution to philosophy?"},
{"role": "assistant", "content": "Aristotle made numerous significant contributions to philosophy, but one of his main contributions was in the field of logic and reasoning. He developed a system of formal logic, which is considered the first comprehensive system of its kind in Western philosophy. This system, often referred to as Aristotelian logic or term logic, provided a framework for deductive reasoning and laid the groundwork for scientific thinking."},
{"role": "user", "content": "Can you elaborate on how this influenced later thinkers?"}
]
rag_agent_response = client.agent(
messages,
vector_search_settings={"use_hybrid_search":True},
rag_generation_config={"stream":True}
)