Write custom prompts with ragas
This is a tutorial notebook that shows how to create and use custom prompts with the metrics in an evaluation task, using the Ragas Prompt class. It guides you through replacing a metric's default prompts with your own.
Dataset
Here I’m using the amnesty_qa dataset from Hugging Face.
from datasets import load_dataset, Dataset
amnesty_dataset = load_dataset("explodinggradients/amnesty_qa", "english")
amnesty_dataset
DatasetDict({
    train: Dataset({
        features: ['question', 'ground_truth', 'answer', 'contexts'],
        num_rows: 20
    })
})
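Before writing a custom prompt, you can peek at a single row to confirm the column names the metrics will read:
row = amnesty_dataset["train"][0]
print(row.keys())  # dict_keys(['question', 'ground_truth', 'answer', 'contexts'])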
Create a Custom Prompt Object
Create a new Prompt object to be used in the metric during evaluation. For this, I will instantiate an object of the Ragas Prompt class.
from ragas.llms.prompt import Prompt
long_form_answer_prompt_new = Prompt(
    name="long_form_answer_new_v1",
    instruction="Create one or more statements from each sentence in the given answer.",
    examples=[
        {
            "question": "Which is the only planet in the solar system that has life on it?",
            "answer": "earth",
            "statements": {
                "statements": [
                    "Earth is the only planet in the solar system that has life on it."
                ]
            },
        },
        {
            "question": "Were Hitler and Benito Mussolini of the same nationality?",
            "answer": "Sorry, I can't provide an answer to that question.",
            "statements": {
                "statements": []
            },
        },
    ],
    input_keys=["question", "answer"],
    output_key="statements",
    output_type="json",
)
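Before wiring the new prompt into a metric, you can render it with concrete inputs. This is a minimal sketch assuming the legacy ragas.llms.prompt interface, where format() accepts the input keys as keyword arguments and returns a langchain PromptValue:
# Sketch: format() fills the input_keys; to_string() renders the PromptValue.
# (Assumes the legacy ragas.llms.prompt.Prompt interface.)
filled = long_form_answer_prompt_new.format(
    question="Which is the only planet in the solar system that has life on it?",
    answer="earth",
)
print(filled.to_string())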
Using the Custom Prompt in Evaluations
I will be using the faithfulness metric for my evaluation task. Faithfulness uses two default prompts, long_form_answer_prompt and nli_statements_message. I will replace the default long_form_answer_prompt used in this metric with the newly created prompt object.
from ragas.metrics import faithfulness
faithfulness.long_form_answer_prompt = long_form_answer_prompt_new
print(faithfulness.long_form_answer_prompt.to_string())
Create one or more statements from each sentence in the given answer.
question: "Which is the only planet in the solar system that has life on it?"
answer: "earth"
statements: {{"statements": ["Earth is the only planet in the solar system that has life on it."]}}
question: "Were Hitler and Benito Mussolini of the same nationality?"
answer: "Sorry, I can't provide an answer to that question."
statements: {{"statements": []}}
question: {question}
answer: {answer}
statements:
The rendered template shows our two examples followed by the {question} and {answer} placeholders (literal JSON braces in the examples are escaped as {{ }}). The custom prompt we created is now used by the faithfulness metric, and we can evaluate the dataset against it.
from ragas import evaluate
result = evaluate(
    amnesty_dataset["train"].select(range(3)),  # selecting only 3 rows
    metrics=[faithfulness],
)
result
evaluating with [faithfulness]
100%|██████████| 1/1 [02:31<00:00, 151.79s/it]
{'faithfulness': 0.7879}
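The dictionary above holds the aggregate score. For per-sample scores, the result object can be exported to a pandas DataFrame, where the metric appears as a column alongside the original dataset columns:
# Per-row scores: to_pandas() returns one row per evaluated sample
df = result.to_pandas()
print(df[["question", "faithfulness"]])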