Collections

A comprehensive guide to creating collections in R2R

Introduction

A collection in R2R is a logical grouping of users and documents that allows for efficient access control and organization. Collections enable you to manage permissions and access to documents at a group level, rather than individually.

R2R provides robust document collection management, allowing developers to implement efficient access control and organization of users and documents. This cookbook will guide you through the collection capabilities in R2R.

For user authentication, please refer to the User Auth Cookbook.

Collection permissioning in R2R is still under development, so the API is likely to continue evolving in future releases.

A diagram showing user and collection management across r2r

Basic Usage

Collections currently follow a flat hierarchy wherein superusers are responsible for management operations. This functionality will expand as development on R2R continues.

Collection CRUD operations

Let’s start by creating a new collection:

from r2r import R2RClient

client = R2RClient("http://localhost:7272")  # Replace with your R2R deployment URL

# Assuming you're logged in as an admin or a user with appropriate permissions
# For testing, the default R2R implementation will grant superuser privileges to anon api calls
collection_result = client.collections.create("Marketing Team", "Collection for marketing department")

print(f"Collection creation result: {collection_result}")
# {'results': {'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Marketing Team', 'description': 'Collection for marketing department', 'created_at': '2024-07-16T22:53:47.524794Z', 'updated_at': '2024-07-16T22:53:47.524794Z'}}

To retrieve details about a specific collection:

collection_id = '123e4567-e89b-12d3-a456-426614174000'  # Use the collection_id from the creation result
collection_details = client.collections.retrieve(collection_id)

print(f"Collection details: {collection_details}")
# {'results': {'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Marketing Team', 'description': 'Collection for marketing department', 'created_at': '2024-07-16T22:53:47.524794Z', 'updated_at': '2024-07-16T22:53:47.524794Z'}}

You can update a collection’s name or description:

update_result = client.collections.update(
    collection_id,
    name="Updated Marketing Team",
    description="New description for marketing team"
)

print(f"Collection update result: {update_result}")
# {'results': {'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Updated Marketing Team', 'description': 'New description for marketing team', 'created_at': '2024-07-16T22:53:47.524794Z', 'updated_at': '2024-07-16T23:15:30.123456Z'}}

Lastly, you can delete a collection:

client.collections.delete(collection_id)

Listing Collections

To get a list of all collections:

collections_list = client.collections.list()

print(f"Collections list: {collections_list}")
# {'results': [{'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Updated Marketing Team', 'description': 'New description for marketing team', 'created_at': '2024-07-16T22:53:47.524794Z', 'updated_at': '2024-07-16T23:15:30.123456Z'}, ...]}

User Management in Collections

Adding a User to a Collection

To add a user to a collection, you need both the user’s ID and the collection’s ID:

user_id = '456e789f-g01h-34i5-j678-901234567890'  # This should be a valid user ID
collection_id = '123e4567-e89b-12d3-a456-426614174000'  # this should be a collection I own
add_user_result = client.collections.add_user(user_id, collection_id)

print(f"Add user to collection result: {add_user_result}")
# {'results': {'message': 'User successfully added to the collection'}}

Removing a User from a Collection

Similarly, to remove a user from a collection:

remove_user_result = client.collections.remove_user(user_id, collection_id)

print(f"Remove user from collection result: {remove_user_result}")
# {'results': None}

Listing Users in a Collection

To get a list of all users in a specific collection:

users_in_collection = client.collections.list_users(collection_id)

print(f"Users in collection: {users_in_collection}")
# {'results': [{'user_id': '456e789f-g01h-34i5-j678-901234567890', 'email': '[email protected]', 'name': 'John Doe', ...}, ...]}

Getting Collections for a User

To get all collections that a user is a member of:

user_collections = client.user.list_collections(user_id)

print(f"User's collections: {user_collections}")
# {'results': [{'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Updated Marketing Team', ...}, ...]}

Document Management in Collections

Assigning a Document to a Collection

To assign a document to a collection:

document_id = '789g012j-k34l-56m7-n890-123456789012'  # This should be a valid document ID
assign_doc_result = client.collections.add_document(collection_id, document_id)

print(f"Assign document to collection result: {assign_doc_result}")
# {'results': {'message': 'Document successfully assigned to the collection'}}

Removing a Document from a Collection

To remove a document from a collection:

remove_doc_result = client.collections.remove_document(collection_id, document_id)

print(f"Remove document from collection result: {remove_doc_result}")
# {'results': {'message': 'Document successfully removed from the collection'}}

Listing Documents in a Collection

To get a list of all documents in a specific collection:

docs_in_collection = client.collections.list_documents(collection_id)

print(f"Documents in collection: {docs_in_collection}")
# {'results': [{'document_id': '789g012j-k34l-56m7-n890-123456789012', 'title': 'Marketing Strategy 2024', ...}, ...]}

Getting Collections for a Document

To get all collections that a document is assigned to:

document_collections = client.documents.list_collections(document_id)

print(f"Document's collections: {document_collections}")
# {'results': [{'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Updated Marketing Team', ...}, ...]}

Advanced Collection Management

Generating Synthetic Descriptions

To have an LLM generate a description for a collection, you can run:

update_result = client.collections.update(
    collection_id,
    generate_description=True
)

print(f"Collection update result: {update_result}")
# {'results': {'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Updated Marketing Team', 'description': 'A rich description generated over the summaries of the documents in the collection', 'created_at': '2024-07-16T22:53:47.524794Z', 'updated_at': '2024-07-16T23:15:30.123456Z'}}

This is particularly helpful when building graphs as the summary provides high-quality context in the prompt, resulting in better descriptions.

Collection Overview

To get an overview of your collections, including user and document counts:

collections_overview = client.collections.list()

print(f"Collections overview: {collections_overview}")
# {'results': [{'collection_id': '123e4567-e89b-12d3-a456-426614174000', 'name': 'Updated Marketing Team', 'description': 'New description for marketing team', 'user_count': 5, 'document_count': 10, ...}, ...]}

Deleting a Collection

To delete a collection:

delete_result = client.collections.delete(collection_id)

print(f"Delete collection result: {delete_result}")
# {'results': {'message': 'Collection successfully deleted'}}

Pagination and Filtering

Many of the collection-related methods support pagination and filtering. Here are some examples:

# List collections with pagination
paginated_collections = client.collections.list(offset=10, limit=20)

# Get users in a collection with pagination
paginated_users = client.collections.list_users(collection_id, offset=5, limit=10)

# Get documents in a collection with pagination
paginated_docs = client.collections.list_documents(collection_id, offset=0, limit=50)

# Get collections overview with specific collection IDs
specific_collections = client.collections.list(collection_ids=['id1', 'id2', 'id3'])
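
When a collection holds more items than fit in one page, it can help to wrap the offset/limit pattern in a small generator. The sketch below is not part of the R2R SDK: `iter_all` and `fake_list` are hypothetical names, and it assumes only that the wrapped call accepts `offset` and `limit` keyword arguments and returns a dict with a `results` list, as in the sample outputs above.

```python
from typing import Any, Callable, Dict, Iterator


def iter_all(
    fetch_page: Callable[..., Dict[str, Any]],
    page_size: int = 100,
) -> Iterator[Any]:
    """Yield every item from an offset/limit-paginated call, one page at a time."""
    offset = 0
    while True:
        response = fetch_page(offset=offset, limit=page_size)
        items = response.get("results") or []
        yield from items
        # A short (or empty) page means there is nothing left to fetch.
        if len(items) < page_size:
            break
        offset += page_size


# Hypothetical stand-in for a paginated listing call, used here for illustration only.
def fake_list(offset: int = 0, limit: int = 100) -> Dict[str, Any]:
    data = [{"collection_id": f"id{i}"} for i in range(7)]
    return {"results": data[offset : offset + limit]}


all_ids = [c["collection_id"] for c in iter_all(fake_list, page_size=3)]
print(all_ids)
```

Against a real deployment, the same helper could be called as `iter_all(client.collections.list, page_size=50)`, or with a lambda such as `lambda **kw: client.collections.list_users(collection_id, **kw)` for the per-collection listings, provided those methods accept the keyword arguments shown in the examples above.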

Security Considerations

When implementing collection permissions, consider the following security best practices:

  1. Least Privilege Principle: Assign the minimum necessary permissions to users and collections.
  2. Regular Audits: Periodically review collection memberships and document assignments.
  3. Access Control: Ensure that only authorized users (e.g., admins) can perform collection management operations.
  4. Logging and Monitoring: Implement comprehensive logging for all collection-related actions.
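
One way to approach the logging recommendation is to wrap each collection management call so that every attempt is recorded with the acting user and the outcome. The following is a minimal sketch using only the Python standard library; `audited`, the `collection_audit` logger name, and the `actor` label are assumptions made for illustration, not R2R features.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("collection_audit")


def audited(func: Callable[..., Any], action: str, actor: str) -> Callable[..., Any]:
    """Return a wrapper around func that logs each call and whether it succeeded."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the failure (with traceback) and let the error propagate.
            logger.exception(
                "action=%s actor=%s args=%r status=failed", action, actor, args
            )
            raise
        logger.info("action=%s actor=%s args=%r status=ok", action, actor, args)
        return result

    return wrapper


# Hypothetical stand-in for a collection call, so the sketch runs without a server.
def add_user_stub(user_id: str, collection_id: str) -> dict:
    return {"results": {"message": "User successfully added to the collection"}}


logging.basicConfig(level=logging.INFO)
add_user = audited(add_user_stub, action="add_user", actor="admin@example.com")
add_user("user-1", "collection-1")
```

In an application, the stub would be replaced by the real client method, for example `audited(client.collections.add_user, action="add_user", actor=current_user_email)`, where `current_user_email` is whatever identifier your application uses for the caller.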

Customizing Collection Permissions

While R2R’s current collection system follows a flat hierarchy, you can build more complex permission structures on top of it:

  1. Custom Roles: Implement application-level roles within collections (e.g., collection admin, editor, viewer).
  2. Hierarchical Collections: Create a hierarchy by establishing parent-child relationships between collections in your application logic.
  3. Permission Inheritance: Implement rules for permission inheritance based on collection memberships.
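
The three ideas above can be combined in a small application-level layer that sits next to R2R and stores roles and parent links itself. The sketch below is purely illustrative: `Role`, `CollectionPermissions`, and the inheritance rule (a role granted on a parent collection also applies to its descendants) are assumptions, and none of this state is stored in or enforced by R2R.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, Optional, Set, Tuple


class Role(IntEnum):
    """Application-level roles, ordered from least to most privileged."""
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


@dataclass
class CollectionPermissions:
    # child collection ID -> parent collection ID
    parents: Dict[str, str] = field(default_factory=dict)
    # (collection ID, user ID) -> role granted directly on that collection
    grants: Dict[Tuple[str, str], Role] = field(default_factory=dict)

    def set_parent(self, child_id: str, parent_id: str) -> None:
        self.parents[child_id] = parent_id

    def grant(self, collection_id: str, user_id: str, role: Role) -> None:
        self.grants[(collection_id, user_id)] = role

    def effective_role(self, collection_id: str, user_id: str) -> Optional[Role]:
        """Highest role granted on the collection or any of its ancestors."""
        best: Optional[Role] = None
        seen: Set[str] = set()
        current: Optional[str] = collection_id
        # Walk up the parent chain; 'seen' guards against accidental cycles.
        while current is not None and current not in seen:
            seen.add(current)
            role = self.grants.get((current, user_id))
            if role is not None and (best is None or role > best):
                best = role
            current = self.parents.get(current)
        return best

    def can(self, collection_id: str, user_id: str, required: Role) -> bool:
        role = self.effective_role(collection_id, user_id)
        return role is not None and role >= required


perms = CollectionPermissions()
perms.set_parent("marketing", "company")
perms.grant("company", "alice", Role.ADMIN)
perms.grant("marketing", "bob", Role.VIEWER)

print(perms.can("marketing", "alice", Role.EDITOR))  # inherited from "company"
print(perms.can("marketing", "bob", Role.EDITOR))
```

A check such as `perms.can(collection_id, user_id, Role.EDITOR)` would then run in your application before it issues the corresponding R2R call.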

Troubleshooting

Here are some common issues and their solutions:

  1. Unable to Create/Modify Collections: Ensure the user has superuser privileges.
  2. User Not Seeing Collection Content: Verify that the user is correctly added to the collection and that documents are properly assigned.
  3. Performance Issues with Large Collections: Use pagination when retrieving users or documents in large collections.
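
For the second issue, a quick check is to compare the collection's member list and document list against the user and document in question. The helper below is a hedged sketch: `diagnose_access` is a hypothetical function, and it assumes the responses have the shape shown in the sample outputs earlier (a `results` list whose entries carry `user_id` or `document_id` keys), which may differ between R2R versions.

```python
from typing import Any, Dict, List


def diagnose_access(
    user_id: str,
    document_id: str,
    users_response: Dict[str, Any],
    documents_response: Dict[str, Any],
) -> List[str]:
    """Return a list of reasons why a user might not see a document in a collection."""
    member_ids = {u.get("user_id") for u in users_response.get("results") or []}
    doc_ids = {d.get("document_id") for d in documents_response.get("results") or []}

    problems: List[str] = []
    if user_id not in member_ids:
        problems.append("user is not a member of the collection")
    if document_id not in doc_ids:
        problems.append("document is not assigned to the collection")
    return problems


# Sample responses in the assumed shape, used here instead of live API calls.
users_response = {"results": [{"user_id": "u1", "name": "John Doe"}]}
documents_response = {"results": [{"document_id": "d1", "title": "Marketing Strategy 2024"}]}

print(diagnose_access("u2", "d1", users_response, documents_response))
```

With a live client, the two responses would come from `client.collections.list_users(collection_id)` and `client.collections.list_documents(collection_id)`; for large collections, page through the results before running the check so that no members or documents are missed.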

Conclusion

R2R’s collection permissioning system provides a foundation for implementing sophisticated access control in your applications. As the feature set evolves, more advanced capabilities will become available. Stay tuned to the R2R documentation for updates and new features related to collection permissions.

For user authentication and individual user management, refer to the User Auth Cookbook. For more advanced use cases or custom implementations, consult the R2R documentation or reach out to the community for support.
