

LLM Evaluation Integration Guide

This guide provides an overview of integrating LLM evaluation using DeepEval and Parea. DeepEval is an open-source framework for evaluating LLM applications, while Parea is a platform and SDK for AI engineers that provides tools for LLM evaluation, observability, and prompt playgrounds.

Key Features

  • DeepEval offers LLM evaluation metrics and bulk evaluation of datasets.
  • Parea provides debugging, testing, evaluating, and monitoring tools for LLM applications.
  • Parea is the default evaluation provider, with DeepEval as an alternative.

Integrations

Parea and DeepEval integrate with deployed applications. The sampling rate, which controls how much traffic is evaluated, is set in config.json.
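For illustration only, a sampling-rate entry in config.json might look like the sketch below. The key names ("evals", "provider", "samplingRate") are hypothetical and should be checked against the application's actual configuration schema:

    {
      "evals": {
        "provider": "parea",    // hypothetical key: which evaluation provider to use
        "samplingRate": 0.1     // hypothetical key: evaluate roughly 10% of requests
      }
    }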

QuickStart

  1. Install Parea (default) or DeepEval using pip.
  2. Set your OPENAI_API_KEY as an environment variable.
  3. Write and run your test case.
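As a minimal sketch of step 3 using DeepEval (the file name test_example.py and the sample strings are illustrative; the Parea SDK follows a similar install-and-test flow):

    # test_example.py
    # Prerequisites: pip install -U deepeval (or pip install parea-ai for the
    # default provider) and export OPENAI_API_KEY before running.
    from deepeval import assert_test
    from deepeval.metrics import AnswerRelevancyMetric
    from deepeval.test_case import LLMTestCase

    def test_answer_relevancy():
        # Judge whether the output is relevant to the input; pass at a 0.7 threshold.
        metric = AnswerRelevancyMetric(threshold=0.7)
        test_case = LLMTestCase(
            input="What if these shoes don't fit?",
            # Replace with the actual output produced by your LLM application.
            actual_output="We offer a 30-day full refund at no extra cost.",
        )
        assert_test(test_case, [metric])

Run it with deepeval test run test_example.py to see pass/fail results per metric.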

For more information, refer to the Parea GitHub repository and the DeepEval documentation.

Summary

Parea and DeepEval provide solutions for evaluating and monitoring LLM applications, helping developers make informed decisions to improve performance.