Create a new document

Creates a new Document object from an input file, text content, or chunks. The chosen ingestion_mode determines how the ingestion process is configured:

Ingestion Modes:

hi-res: Comprehensive parsing and enrichment, including summaries.
ocr: OCR via Mistral, with full summaries.
fast: Speed-focused ingestion that skips enrichment steps such as summaries.
custom: Provide a full ingestion_config to customize the entire ingestion process.

Exactly one of file, raw_text, or chunks must be provided. Documents are shared through Collections, which allow tightly specified cross-user interactions.

By default, ingestion runs asynchronously, and its progress can be tracked using the returned task_id.
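The exactly-one-input rule above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the R2R SDK: the helper name pick_input_field is hypothetical, and it only assumes the three field names documented on this page.

```python
from typing import Any, Optional


def pick_input_field(
    file: Optional[Any] = None,
    raw_text: Optional[str] = None,
    chunks: Optional[list] = None,
) -> str:
    """Return the name of the single input that was supplied.

    Hypothetical client-side helper: it mirrors the documented rule that
    exactly one of file, raw_text, or chunks must be provided.
    """
    candidates = (("file", file), ("raw_text", raw_text), ("chunks", chunks))
    provided = [name for name, value in candidates if value is not None]
    if len(provided) != 1:
        raise ValueError(
            "Exactly one of file, raw_text, or chunks must be provided; "
            f"got {provided or 'none'}."
        )
    return provided[0]
```

Calling pick_input_field(raw_text="hello") returns "raw_text", while passing both a file and raw_text, or nothing at all, raises ValueError before any network call is made.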

Request

This endpoint expects a multipart form.

file (string, optional, format: "binary")

The file to ingest. Exactly one of file, raw_text, or chunks must be provided.

raw_text (string, optional)

Raw text content to ingest. Exactly one of file, raw_text, or chunks must be provided.

chunks (string, optional)

Pre-processed text chunks to ingest. Exactly one of file, raw_text, or chunks must be provided.

id (string, optional, format: "uuid")

The ID of the document. If not provided, a new ID will be generated.

collection_ids (string, optional)

Collection IDs to associate with the document. If none are provided, the document will be assigned to the user's default collection.

metadata (string, optional)

Metadata to associate with the document, such as title, description, or custom fields.

ingestion_mode (enum, optional)

Ingestion modes:

hi-res: Thorough ingestion with full summaries and enrichment.
ocr: OCR via Mistral and full summaries.
fast: Quick ingestion with minimal enrichment and no summaries.
custom: Full control via ingestion_config.

If filters or limit (in ingestion_config) are provided alongside hi-res or fast, they will override the default settings for that mode.

Allowed values: hi-res, ocr, fast, custom

ingestion_config (string, optional)

An optional dictionary to override the default chunking configuration for the ingestion process. If not provided, the system will use the default server-side chunking configuration.
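To illustrate how ingestion_mode and ingestion_config fit together, the sketch below assembles the text fields of the multipart form. It rests on assumptions: because the schema lists metadata, collection_ids, and ingestion_config as strings, the sketch JSON-encodes them; the helper name build_form_fields and the key hypothetical_option are invented for illustration; and rejecting custom without a config is a client-side choice, not documented server behavior.

```python
import json
from typing import Any, Dict, List, Optional

# Mode names as listed in this reference.
ALLOWED_MODES = {"hi-res", "ocr", "fast", "custom"}


def build_form_fields(
    ingestion_mode: str,
    ingestion_config: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    collection_ids: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Build the non-file form fields for a create-document request.

    Assumption: structured values are sent as JSON-encoded strings.
    """
    if ingestion_mode not in ALLOWED_MODES:
        raise ValueError(f"Unknown ingestion_mode: {ingestion_mode!r}")
    if ingestion_mode == "custom" and ingestion_config is None:
        # Client-side guard only: custom mode is described as needing a
        # full ingestion_config.
        raise ValueError("custom mode requires an ingestion_config")

    fields: Dict[str, str] = {"ingestion_mode": ingestion_mode}
    if ingestion_config is not None:
        fields["ingestion_config"] = json.dumps(ingestion_config)
    if metadata is not None:
        fields["metadata"] = json.dumps(metadata)
    if collection_ids is not None:
        fields["collection_ids"] = json.dumps(collection_ids)
    return fields


# Example: custom mode with a placeholder config key (hypothetical).
example_fields = build_form_fields(
    "custom",
    ingestion_config={"hypothetical_option": 1},
    metadata={"title": "Example"},
)
```

Fields left unset are omitted entirely, so the server-side defaults described above still apply to them.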

run_with_orchestration (boolean, optional)

Whether ingestion runs with orchestration. Defaults to True. When set to False, the ingestion process runs synchronously and returns the result directly.

Response

Successful Response

results (object)

from r2r import R2RClient

client = R2RClient()
# When using auth, call client.login(...) first.

response = client.documents.create(
    file_path="pg_essay_1.html",
    metadata={"metadata_1": "some random metadata"},
    id=None,  # no ID supplied, so a new one is generated
)

{
  "results": {
    "message": "Ingestion task queued successfully.",
    "document_id": "9fbe403b-c11c-5aae-8ade-ef22980c3ad1",
    "task_id": "c68dc72e-fc23-5452-8f49-d7bd46088a96"
  }
}
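A response shaped like the sample above can be unpacked into typed identifiers for later status tracking. This is a small sketch over a plain dictionary, not an SDK feature: the function name parse_create_response is hypothetical, and treating task_id as possibly absent (for example, when ingestion runs synchronously) is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Optional, Tuple
from uuid import UUID


def parse_create_response(payload: Dict[str, Any]) -> Tuple[UUID, Optional[UUID]]:
    """Extract (document_id, task_id) from a create-document response.

    Assumes the JSON layout of the sample response. task_id is returned
    as None if the key is missing or empty.
    """
    results = payload["results"]
    document_id = UUID(results["document_id"])
    raw_task_id = results.get("task_id")
    task_id = UUID(raw_task_id) if raw_task_id else None
    return document_id, task_id


# The sample response from this page, as a Python dictionary.
sample = {
    "results": {
        "message": "Ingestion task queued successfully.",
        "document_id": "9fbe403b-c11c-5aae-8ade-ef22980c3ad1",
        "task_id": "c68dc72e-fc23-5452-8f49-d7bd46088a96",
    }
}
document_id, task_id = parse_create_response(sample)
```

Parsing the strings into UUID objects surfaces malformed identifiers immediately instead of at the next API call.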
