Faithfulness

This measures the factual consistency of the generated answer against the given context. It is calculated from the answer and the retrieved context. The score is scaled to lie between 0 and 1; higher is better.

The generated answer is regarded as faithful if all the claims made in the answer can be inferred from the given context. To calculate this, a set of claims is first identified from the generated answer. Each claim is then cross-checked against the given context to determine whether it can be inferred from it. The faithfulness score is the number of claims that can be inferred from the context divided by the total number of claims in the answer:

\[\text{Faithfulness score} = {|\text{Number of claims in the generated answer that can be inferred from given context}| \over |\text{Total number of claims in the generated answer}|} \tag{1}\]
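As a rough illustration of the ratio in equation (1), the score can be computed from a list of per-claim verdicts. This is a minimal sketch, not the library's implementation: the function name `faithfulness_score` and the decision to raise an error when no claims are supplied are assumptions made here for illustration.

```python
from typing import Sequence


def faithfulness_score(verdicts: Sequence[bool]) -> float:
    """Fraction of claims supported by the context (hypothetical helper).

    Each entry in `verdicts` is True if the corresponding claim can be
    inferred from the context and False otherwise.
    """
    if not verdicts:
        # Assumption: the ratio is treated as undefined with zero claims.
        raise ValueError("at least one claim is required")
    supported = sum(1 for verdict in verdicts if verdict)
    return supported / len(verdicts)


# One supported claim out of two.
print(faithfulness_score([True, False]))
```

How the verdicts are produced is deliberately left out of this sketch; only the final division is shown.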

Hint

Question: Where and when was Einstein born?

Context: Albert Einstein (born 14 March 1879) was a German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time.

High faithfulness answer: Einstein was born in Germany on 14th March 1879.

Low faithfulness answer: Einstein was born in Germany on 20th March 1879.

How was this calculated?

Let’s examine how faithfulness was calculated using the low faithfulness answer:

  • Step 1: Break the generated answer into individual statements.

    • Statements:

      • Statement 1: “Einstein was born in Germany.”

      • Statement 2: “Einstein was born on 20th March 1879.”

  • Step 2: For each of the generated statements, verify if it can be inferred from the given context.

    • Statement 1: Yes

    • Statement 2: No

  • Step 3: Use the formula depicted above to calculate faithfulness.

    \[\text{Faithfulness} = { \text{1} \over \text{2} } = 0.5\]
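The three steps above can be sketched in code. Steps 1 and 2 would normally be carried out by a language model; in this sketch both are replaced by hand-written stand-ins for the low faithfulness answer so that the overall flow is visible. All names (`split_into_statements`, `is_supported`, `score_answer`) are hypothetical and are not part of the ragas API.

```python
from typing import Callable, List

CONTEXT = (
    "Albert Einstein (born 14 March 1879) was a German-born theoretical "
    "physicist, widely held to be one of the greatest and most influential "
    "scientists of all time"
)
ANSWER = "Einstein was born in Germany on 20th March 1879."


def split_into_statements(answer: str) -> List[str]:
    # Stand-in for step 1. The statements are hard-coded for the
    # low faithfulness answer rather than extracted from `answer`.
    return [
        "Einstein was born in Germany.",
        "Einstein was born on 20th March 1879.",
    ]


def is_supported(statement: str, context: str) -> bool:
    # Stand-in for step 2. The verdicts are copied from the worked
    # example; this function does not actually inspect `context`.
    verdicts = {
        "Einstein was born in Germany.": True,
        "Einstein was born on 20th March 1879.": False,
    }
    return verdicts[statement]


def score_answer(
    answer: str,
    context: str,
    extract: Callable[[str], List[str]],
    verify: Callable[[str, str], bool],
) -> float:
    statements = extract(answer)                            # step 1
    verdicts = [verify(s, context) for s in statements]     # step 2
    return sum(verdicts) / len(statements)                  # step 3


score = score_answer(ANSWER, CONTEXT, split_into_statements, is_supported)
print(score)  # 0.5
```

Passing the extraction and verification steps in as callables keeps the scoring logic separate from whatever model is used to produce the statements and verdicts.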

Example

Faithfulness metric with batch size 10
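from datasets import Dataset
from ragas.metrics.faithfulness import Faithfulness

# Score the dataset in batches of 10 rows.
faithfulness = Faithfulness(
    batch_size=10
)

# `dataset` is expected to be a Hugging Face Dataset with the columns
# shown below; it must be loaded or built before scoring.
# Dataset({
#     features: ['question','contexts','answer'],
#     num_rows: 25
# })
dataset: Dataset

results = faithfulness.score(dataset)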