Athina: Ragas Metrics on your Production Logs

Athina is a production monitoring and evaluation platform, and it offers a sandbox for trying it out.

You can use Athina with Ragas metrics to run evals on your production logs and get granular model performance metrics on your production data.

[Figure: Athina Performance Metrics]

For example, you can visually explore questions like these:

  • What is my AnswerRelevancy score for queries related to refunds for customer id nike-usa?

  • What is my Faithfulness score for product catalog queries using prompt catalog_answerer/v3 with model gpt-3.5-turbo?
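To make the idea of a segmented metric concrete, here is a minimal sketch in plain Python. It does not use Athina's API: the `rows` records, their field names, and the `segment_mean` helper are all hypothetical, chosen only to mirror the two questions above.

```python
from statistics import mean

# Hypothetical eval results, one record per logged inference.
# Field names are illustrative and are not Athina's schema.
rows = [
    {"metric": "AnswerRelevancy", "customer_id": "nike-usa", "topic": "refunds", "score": 1.0},
    {"metric": "AnswerRelevancy", "customer_id": "nike-usa", "topic": "refunds", "score": 0.5},
    {"metric": "AnswerRelevancy", "customer_id": "nike-usa", "topic": "shipping", "score": 0.25},
    {"metric": "Faithfulness", "prompt": "catalog_answerer/v3", "model": "gpt-3.5-turbo",
     "topic": "product catalog", "score": 0.5},
]


def segment_mean(rows, metric, **filters):
    """Return the mean score of `metric` over rows matching every filter, or None if no row matches."""
    scores = [
        row["score"]
        for row in rows
        if row["metric"] == metric
        and all(row.get(key) == value for key, value in filters.items())
    ]
    return mean(scores) if scores else None


# AnswerRelevancy for refund queries from customer nike-usa
refund_relevancy = segment_mean(rows, "AnswerRelevancy", customer_id="nike-usa", topic="refunds")

# Faithfulness for a specific prompt version and model
catalog_faithfulness = segment_mean(
    rows, "Faithfulness", prompt="catalog_answerer/v3", model="gpt-3.5-turbo"
)

print(refund_relevancy, catalog_faithfulness)
```

The platform answers the same kind of question through its dashboard filters. The sketch only shows what "a score for a segment" means: filter the logged inferences, then average one metric over what remains.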

▷ Running Athina Programmatically

When you use Athina to run Ragas evals programmatically, you will be able to view the results on Athina’s UI like this 👇

[Figure: View RAGAS Metrics on Athina]

  1. Install Athina’s Python SDK:

pip install athina

  2. Create an Athina account. After signing up, you will receive an API key.

A sample notebook is also available to follow along.

  3. Run the code:

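import os
from athina.evals import (
    # Eval classes for the metrics mentioned above; the names assume
    # the Athina SDK's Ragas wrappers, so verify them against your version
    RagasAnswerRelevancy,
    RagasFaithfulness,
)
from athina.loaders import RagasLoader
from athina.keys import AthinaApiKey, OpenAiApiKey
from athina.runner.run import EvalRunner  # assumed module path; verify against your SDK version
import pandas as pd

# Set your API keys (the environment variable names here are assumptions)
OpenAiApiKey.set_key(os.getenv("OPENAI_API_KEY"))
AthinaApiKey.set_key(os.getenv("ATHINA_API_KEY"))

# Load your dataset from a dictionary, JSON, or CSV
dataset = RagasLoader().load_json("raw_data.json")

# Configure the eval suite
eval_model = "gpt-3.5-turbo"
eval_suite = [
    RagasAnswerRelevancy(),
    RagasFaithfulness(),
]

# Run the evaluation suite
batch_eval_result = EvalRunner.run_suite(
    evals=eval_suite,
    data=dataset,
    max_parallel_evals=1,  # If you increase this, you may run into rate limits
)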
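
The code above loads `raw_data.json` but does not show what the file contains. The following sketch writes a tiny dataset with the standard library only. The field names (`query`, `context`, `response`, `expected_response`) are an assumption about what `RagasLoader` expects and should be checked against Athina's data-loading documentation. The `write_dataset` helper and the sample record are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumed field names; confirm them against Athina's loader documentation.
REQUIRED_FIELDS = ("query", "context", "response", "expected_response")

# A single hypothetical record: the user query, the retrieved context,
# the model's response, and a reference answer.
records = [
    {
        "query": "How do I request a refund?",
        "context": ["Refunds can be requested within 30 days of purchase."],
        "response": "You can request a refund within 30 days of purchase.",
        "expected_response": "Refunds are available within 30 days of purchase.",
    },
]


def write_dataset(records, path):
    """Check that every record has the assumed fields, then write a JSON array to `path`."""
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
    path = Path(path)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return path


# Written to a temp directory here; point it at raw_data.json in your project instead.
dataset_path = write_dataset(records, Path(tempfile.gettempdir()) / "raw_data.json")
loaded = json.loads(dataset_path.read_text(encoding="utf-8"))
print(f"wrote {len(loaded)} record(s) to {dataset_path}")
```

Validating the fields before writing surfaces a malformed record early, before any paid model calls are made by the eval run.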


▷ Configure Ragas to run automatically on your production logs

If you are logging your production inferences to Athina, you can configure Ragas metrics to run automatically against your production logs.

  1. Navigate to the Athina Dashboard

  2. Open the Evals page (lightning icon on the left)

  3. Click the “New Eval” button on the top right

  4. Select the Ragas tab

  5. Select the eval you want to configure

[Figure: Set up Ragas on Athina UI]

Learn more about Athina