Agent
Build advanced, context-aware chatbots
R2R’s Agent system orchestrates Retrieval-Augmented Generation (RAG) to provide intelligent, multi-step reasoning over your data. By pairing large language models with R2R’s search capabilities, the agent can query your documents (and optionally the web), process queries in context, and return rich, cited answers.
Key Features
- Conversation Integration: Context is tracked through Conversations, allowing follow-up questions
- Hybrid Retrieval: Combines vector search, full-text search, and (optionally) knowledge graph retrieval
- Streaming Responses: Optionally stream token-by-token outputs
- Tool Usage: Potential for extended functionality, including local and web search
Note: The agent system is in active development. Future updates will introduce more tools, deeper conversation threading, and enhanced orchestration.
Basic Usage
Here’s a simple agent query:
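A minimal sketch using the Python SDK. The client method name, endpoint URL, and parameter names below are assumptions based on R2R's Python client; verify them against your installed SDK version.

```python
def ask_agent(query: str):
    """One-shot agent query (sketch; the client method name is an assumption)."""
    from r2r import R2RClient  # deferred import keeps this sketch self-contained
    client = R2RClient("http://localhost:7272")  # assumes a local R2R server
    return client.agent(
        message={"role": "user", "content": query},
        rag_generation_config={"model": "openai/gpt-4o-mini", "temperature": 0.7},
    )

# With a server running and documents ingested:
#   answer = ask_agent("What are the main topics in my documents?")
```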
Follow-Up with Conversations
To maintain context, store the `conversation_id` and pass it on subsequent calls:
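A sketch of a follow-up call, assuming the agent accepts a `conversation_id` keyword and the first response carries the ID back (both field names are assumptions; check your SDK version):

```python
def follow_up(client, conversation_id: str, query: str):
    """Continue an existing conversation (sketch; parameter names are assumed)."""
    return client.agent(
        message={"role": "user", "content": query},
        conversation_id=conversation_id,  # links this turn to prior messages
    )

# First call returns a conversation_id; reuse it for follow-ups, e.g.:
#   first = client.agent(message={"role": "user", "content": "Who wrote doc X?"})
#   cid = first["results"]["conversation_id"]  # field path is an assumption
#   follow_up(client, cid, "When was it published?")
```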
Tip: Use the Conversations API directly to manage messages, e.g., listing all user queries or archiving complete sessions.
Streaming Responses
Enable `stream` in `rag_generation_config` for real-time token-by-token output:
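A sketch of consuming a streamed response, assuming that with `stream: True` the call returns an iterable of text chunks (the flag name and return shape are assumptions):

```python
def stream_answer(client, query: str):
    """Print the agent's answer as it arrives (sketch; flag name is assumed)."""
    for chunk in client.agent(
        message={"role": "user", "content": query},
        rag_generation_config={"stream": True},  # enables incremental output
    ):
        print(chunk, end="", flush=True)
    print()  # final newline once the stream ends
```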
Advanced RAG Search
Customize the underlying retrieval with `search_settings`. For example, to require semantic search and filter by `document_id`:
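One possible shape for such settings, assuming Mongo-style filter operators; the key names and the placeholder document ID are illustrative, not authoritative:

```python
# Hypothetical settings: semantic search only, scoped to a single document.
search_settings = {
    "use_semantic_search": True,   # require vector (semantic) retrieval
    "filters": {
        # Placeholder UUID -- substitute a real document_id from your store.
        "document_id": {"$eq": "00000000-0000-0000-0000-000000000000"},
    },
    "limit": 10,  # cap the number of retrieved chunks
}

# With a client in hand, the call might look like:
#   client.agent(message={"role": "user", "content": "Summarize this document."},
#                search_settings=search_settings)
```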
Multiple Tools (Beta)
You can enable external search tools in the `r2r.toml` under `[agent]`:
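A sketch of what that section might look like; the key and tool names are assumptions, so consult the configuration reference for your R2R version:

```toml
# Hypothetical [agent] section -- key and tool names may differ by version.
[agent]
tools = ["local_search", "web_search"]
```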
When enabled, the agent can:
- Search your local ingestion store (default)
- Perform web searches for broader context (requires a valid Serper or other search API key)
Integrations and Observability
- Document Management: Your agent interacts with documents ingested via the Documents API.
- Conversations: Manage context and user interactions via Conversations.
- Logs & Analytics: Monitor agent usage through R2R’s logging and analytics.
Best Practices
- Keep `conversation_id`: Passing it ensures the agent sees prior messages.
- Tune `search_settings`: Fine-tune filters and semantic/hybrid search options.
- Use `generation_config`: Adjust `model`, `temperature`, and `max_tokens` to match your desired style.
- Streaming: Stream large responses for better UX.
- Memory & Cleanup: Clear or delete old conversations if context is no longer needed.
Troubleshooting
- Empty or irrelevant responses: Review `search_settings` filters, increase `limit`, or check your document ingestion.
- Conversation confusion: Ensure the correct `conversation_id` is passed.
- Timeouts: For very large documents or models, consider increasing your server's timeouts or using streaming output.
Conclusion
R2R’s Agent feature transforms your document collections into an interactive knowledge base. With conversation context, advanced search integration, and streaming outputs, you can build robust, user-friendly AI applications. To dive deeper, explore the Documents and Conversations APIs referenced above.
Harness the agent for research, enterprise Q&A, or any scenario where context-driven, intelligent responses are needed.