RAG Query

Execute a RAG (Retrieval-Augmented Generation) query.

This endpoint combines search results with language model generation to produce accurate, contextually relevant responses based on your document corpus.

Features:

  • Combines vector search, optional knowledge graph integration, and LLM generation
  • Automatically cites sources with unique citation identifiers
  • Supports both streaming and non-streaming responses
  • Compatible with various LLM providers (OpenAI, Anthropic, etc.)
  • Web search integration for up-to-date information

Search Configuration: All search parameters from the search endpoint apply here, including filters, hybrid search, and graph-enhanced search.

Generation Configuration: Adjust the language model’s behavior with rag_generation_config:

{
  "model": "openai/gpt-4o-mini",  // Model to use
  "temperature": 0.7,             // Control randomness (0-1)
  "max_tokens": 1500,             // Maximum output length
  "stream": true                  // Enable token streaming
}
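
As a rough sketch, the configuration above could be assembled into a request body in Python as follows. The helper name build_rag_payload is hypothetical and not part of any SDK; the field names and default values are the ones shown above.

```python
import json


def build_rag_payload(query, model="openai/gpt-4o-mini", temperature=0.7,
                      max_tokens=1500, stream=False):
    """Assemble a request body for the RAG endpoint (hypothetical helper)."""
    # The reference above documents temperature as a value in the range 0-1.
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    return {
        "query": query,
        "rag_generation_config": {
            "model": model,
            "temperature": temperature,
            "max_tokens": max_tokens,
            "stream": stream,
        },
    }


payload = build_rag_payload("What is DeepSeek-R1?", stream=True)
body = json.dumps(payload)  # JSON text to send as the request body
```

Keeping the generation settings in one nested object mirrors the request shape described on this page, so the same helper works for streaming and non-streaming calls.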

Model Support:

  • OpenAI models (default)
  • Anthropic Claude models (requires ANTHROPIC_API_KEY)
  • Local models via Ollama
  • Any provider supported by LiteLLM

Streaming Responses: When stream: true is set, the endpoint returns Server-Sent Events with the following types:

  • search_results: Initial search results from your documents
  • message: Partial tokens as they’re generated
  • citation: Citation metadata when sources are referenced
  • final_answer: Complete answer with structured citations
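
A minimal sketch of how a client might split such a stream into typed events is shown below. It follows the general Server-Sent Events line format (event: and data: fields, with a blank line ending each event). The function name parse_sse and the sample payloads are illustrative assumptions; the actual data payload of each event type is not specified here.

```python
import json


def _field_value(line, prefix):
    """Return the text after 'prefix', dropping one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines):
    """Yield (event_type, data) pairs from an iterable of SSE text lines."""
    event, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event or "message", "\n".join(data_lines)
            event, data_lines = None, []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = _field_value(line, "event:")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data:"))
    if data_lines:
        # Flush a trailing event that was not followed by a blank line.
        yield event or "message", "\n".join(data_lines)


# Made-up sample stream; the payload fields are placeholders, not the real schema.
sample = [
    "event: message\n",
    'data: {"placeholder": "partial token"}\n',
    "\n",
    "event: final_answer\n",
    'data: {"placeholder": "complete answer"}\n',
    "\n",
]
events = list(parse_sse(sample))
event_types = [name for name, _ in events]
```

A caller would typically branch on the event type, for example appending message data to a display buffer and treating final_answer as the end of the response.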

Example Response:

{
  "generated_answer": "DeepSeek-R1 is a model that demonstrates impressive performance...[1]",
  "search_results": { ... },
  "citations": [
    {
      "id": "123456",
      "object": "citation",
      "payload": { ... }
    }
  ]
}
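
The sketch below shows one way to read the citations out of a response shaped like the example above. The empty dictionaries stand in for the elided search_results and payload contents, and the helper names are hypothetical. How the bracketed markers in generated_answer correspond to citation ids is not specified here, so the code only extracts both sides without linking them.

```python
import re

# Stand-in response mirroring the example above; {} replaces the elided parts.
response = {
    "generated_answer": "DeepSeek-R1 is a model that demonstrates impressive performance...[1]",
    "search_results": {},
    "citations": [
        {"id": "123456", "object": "citation", "payload": {}},
    ],
}


def citation_index(resp):
    """Map each citation id to its payload."""
    return {c["id"]: c.get("payload") for c in resp.get("citations", [])}


def citation_markers(answer):
    """Return the bracketed markers found in the answer text, in order."""
    return re.findall(r"\[([^\[\]]+)\]", answer)


by_id = citation_index(response)
markers = citation_markers(response["generated_answer"])
```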

Headers

Authorization string Required

Bearer authentication of the form Bearer <token>, where token is your auth token.

X-API-Key string Required
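
As a small illustration, the two headers listed above could be built like this. The helper name build_headers is hypothetical, and the Content-Type header is an assumption based on the endpoint expecting an object in the request body.

```python
def build_headers(token, api_key):
    """Return request headers; both values are marked Required above."""
    if not token or not api_key:
        raise ValueError("both token and api_key are required")
    return {
        "Authorization": f"Bearer {token}",
        "X-API-Key": api_key,
        "Content-Type": "application/json",  # assumption: body is sent as JSON
    }


headers = build_headers("YOUR_TOKEN", "YOUR_API_KEY")
```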

Request

This endpoint expects an object.
query string Required
search_mode enum Optional

Defaults to custom, which allows full control over search settings.

Pre-configured search modes:

  • basic: A simple semantic-based search.
  • advanced: A more powerful hybrid search combining semantic and full-text.
  • custom: Full control via search_settings.

If filters or limit are provided alongside basic or advanced, they will override the default settings for that mode.

Allowed values: basic, advanced, custom
search_settings object Optional

The search configuration object. If search_mode is custom, these settings are used as-is. For basic or advanced, these settings will override the default mode configuration.

Common overrides include filters to narrow results and limit to control how many results are returned.
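
The following sketch shows how search_mode and the common search_settings overrides might be combined in a request body. The helper name build_search_options is hypothetical, and the filter dictionary in the usage line is only a placeholder, since the filter syntax is defined by the search endpoint rather than here.

```python
ALLOWED_SEARCH_MODES = {"basic", "advanced", "custom"}


def build_search_options(search_mode="custom", limit=None, filters=None):
    """Build the search-related part of a RAG request body (hypothetical helper)."""
    if search_mode not in ALLOWED_SEARCH_MODES:
        raise ValueError(f"unknown search_mode: {search_mode!r}")
    settings = {}
    if limit is not None:
        settings["limit"] = limit      # how many results are returned
    if filters is not None:
        settings["filters"] = filters  # passed through unchanged
    options = {"search_mode": search_mode}
    if settings:
        # For basic/advanced, these override the mode's default settings.
        options["search_settings"] = settings
    return options


# Placeholder filter; consult the search endpoint for the real filter syntax.
options = build_search_options("advanced", limit=5, filters={"placeholder_field": "value"})
```

Omitting search_settings entirely when no overrides are given leaves the chosen mode's defaults in effect.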

rag_generation_config object Optional

Configuration for RAG generation

task_prompt string Optional

Custom prompt that overrides the default task prompt

include_title_if_available boolean Optional. Defaults to false

Include document titles in responses when available

Response

Successful Response

Errors