📚 Core Concepts
-
Learn how to systematically evaluate your AI applications using experiments.
Track changes, measure improvements, and compare results across different versions of your application.
-
Understand how to create, manage, and use evaluation datasets.
Learn about dataset structure, storage backends, and best practices for maintaining your test data.
-
Use our library of available metrics or create custom metrics tailored to your use case.
Metrics are available for evaluating RAG, agentic workflows, and more.
-
Generate high-quality datasets for comprehensive testing.
Includes algorithms for synthesizing data to test RAG and agentic workflows.
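
To see how these concepts relate, here is a minimal sketch in plain Python. It deliberately does not use this library's API: `exact_match`, `run_experiment`, `app_v1`, and `app_v2` are hypothetical names invented for illustration, and the dataset is written by hand rather than generated.

```python
import operator
from statistics import mean
from typing import Callable

# Dataset: each sample pairs an input with the expected answer.
dataset = [
    {"question": "2 + 2", "expected": "4"},
    {"question": "3 * 3", "expected": "9"},
    {"question": "10 - 7", "expected": "3"},
]


def exact_match(response: str, expected: str) -> float:
    """Custom metric (hypothetical): 1.0 on an exact match, else 0.0."""
    return 1.0 if response.strip() == expected.strip() else 0.0


def run_experiment(app: Callable[[str], str], samples: list[dict]) -> float:
    """Run the app on every sample and return the mean metric score."""
    return mean(exact_match(app(s["question"]), s["expected"]) for s in samples)


# Two hypothetical versions of the application under test.
def app_v1(question: str) -> str:
    # Naive baseline: always gives the same answer.
    return "4"


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def app_v2(question: str) -> str:
    # Improved version: parses "<int> <op> <int>" and computes the result.
    left, op, right = question.split()
    return str(OPS[op](int(left), int(right)))


# Experiment: score both versions on the same dataset and compare.
scores = {
    "v1": run_experiment(app_v1, dataset),
    "v2": run_experiment(app_v2, dataset),
}

for version, score in scores.items():
    print(f"{version}: {score:.2f}")
```

Holding the dataset and metric fixed while swapping the application version is what makes the two scores comparable.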