The open-source LLM evaluation framework

Delivered by Confident AI

Regression Testing for LLMs

LLM evaluation metrics to unit test LLM outputs in Python
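
As a rough illustration of what unit testing an LLM output with a metric can look like, here is a minimal sketch in plain Python. It does not use the framework's own API: `KeywordCoverageMetric`, `fake_llm`, and the threshold values are hypothetical names invented for this example, and the model call is stubbed so the test runs offline.

```python
from dataclasses import dataclass


@dataclass
class KeywordCoverageMetric:
    """Hypothetical metric: fraction of expected keywords found in the output."""

    keywords: list[str]
    threshold: float = 0.5

    def measure(self, output: str) -> float:
        # Case-insensitive substring match for each expected keyword.
        text = output.lower()
        hits = sum(1 for kw in self.keywords if kw.lower() in text)
        return hits / len(self.keywords) if self.keywords else 0.0

    def is_successful(self, output: str) -> bool:
        # The test passes when the score reaches the configured threshold.
        return self.measure(output) >= self.threshold


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call (assumption: replace with your application).
    return "You can return shoes within 30 days for a full refund."


def test_refund_answer():
    # A regression test: fails if a later prompt or model change drops key facts.
    metric = KeywordCoverageMetric(keywords=["30 days", "refund"], threshold=1.0)
    output = fake_llm("What is your return policy for shoes?")
    assert metric.is_successful(output), (
        f"score {metric.measure(output):.2f} is below threshold {metric.threshold}"
    )
```

Written this way, the test can be collected by a runner such as pytest and rerun after each prompt or model change to catch regressions.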

Hyperparameter Discovery

Gain insights to quickly iterate towards optimal hyperparameters

Integrate with Popular Frameworks

Evaluate existing LLM applications built with other frameworks