Hallucination evaluation for summarisation models
Overview
The NLI consistency test measures the logical consistency between an input text (or document) and a model-generated summary. A summarization model is given a set of input documents, and a Natural Language Inference (NLI) model classifies each (document, summary) pair as entailing, contradicting, or neutral. Entailing indicates that the summary follows logically from the document, contradicting indicates that the summary conflicts with the document, and neutral indicates that no logical relationship can be drawn. The NLI consistency test uses the entailment probability as its raw score.
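As a rough illustration of how an entailment probability can be read off an NLI model's output, the sketch below applies a softmax to three class logits. The label order in LABELS and the helper names are assumptions made for this example; real NLI checkpoints order their classes differently, so the model's own label mapping should be checked.

```python
import math

# Assumed label order for illustration only; check the actual model's mapping.
LABELS = ("entailment", "neutral", "contradiction")


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entailment_probability(logits, labels=LABELS):
    """Return the probability assigned to the entailment class."""
    probs = softmax(logits)
    return probs[labels.index("entailment")]


def nli_label(logits, labels=LABELS):
    """Return the most probable of the three NLI labels."""
    probs = softmax(logits)
    return labels[probs.index(max(probs))]
```

Here `entailment_probability` gives the per-pair raw score, while `nli_label` gives the three-way classification described above.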
The UniEval factuality evaluation test measures how well an input text (or document) factually supports a model-generated summary. The UniEval factuality score is calculated by providing a summarization model with a set of input documents and scoring each (document, summary) pair using an LLM as an evaluator. The evaluator LLM has been specifically trained on a set of boolean question-answer prompts for detecting factual consistency, and is reported to significantly outperform various state-of-the-art evaluators.
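One common way for a boolean question-answer evaluator to produce a score between 0 and 1 is to compare how strongly it favours a "Yes" answer over a "No" answer. The sketch below assumes that scheme. This document does not confirm the exact formula used by the evaluator, and the function name and inputs are hypothetical.

```python
import math


def boolean_qa_score(yes_logit, no_logit):
    """Normalize 'Yes' and 'No' answer logits into a score in [0, 1].

    Assumption: the score is P(Yes) / (P(Yes) + P(No)); this is an
    illustrative scheme, not a confirmed description of the evaluator.
    """
    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
    p_yes = math.exp(yes_logit - m)
    p_no = math.exp(no_logit - m)
    return p_yes / (p_yes + p_no)
```

Under this assumption, a score near 1 means the evaluator strongly answers "Yes" to the factual-consistency question, and a score near 0 means it strongly answers "No".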
Metrics
NLI Consistency Score: A score between 0 and 1, representing the average degree to which the summaries are entailed by the input text. 0 indicates a low degree of entailment and a high degree of hallucination, while 1 indicates a high degree of entailment and a low degree of hallucination.
UniEval Factuality Score: A score between 0 and 1 representing factual consistency. 0 indicates a low degree of factuality and a high degree of hallucination, while 1 indicates a high degree of factuality and a low degree of hallucination.
Example 1:
- Original Document
The children, who had previously lived in areas controlled by Boko Haram, were held in a military barracks in the north-eastern city of Maiduguri, a UN spokesman told the BBC. Details of the children's ages and the length of their detention have not been given. The army has not made any comment. Human rights groups argue that there is no proper legal process for civilians, including children, who are detained by the army as part of their counter-insurgency operations. "We fear that there are still kids who are being at least temporarily detained because they are being released from Boko Haram areas by the army but then kept for a while," Manuel Fontaine, regional director for the UN Children's fund (Unicef) in Central and West Africa, told Reuters news agency. Earlier this month, 21 of the more than 200 Chibok schoolgirls abducted by Boko Haram two years ago were released and reunited with their families. Nigeria has been fighting a seven-year insurgency against Boko Haram, with the army retaking much of the territory under the Islamist militants' control in the past 20 months.
- Summary
Children who were previously living in areas controlled by Boko Haram were held in a military barracks in Maiduguri. The ages of the children and the length of their detention have not been disclosed. Human rights groups are concerned about the lack of proper legal process for civilians, including children, detained by the army during insurgency operations.
- NLI Consistency Score:
0.9813
- UniEval Factuality Score:
0.9777
Example 2:
- Original Document
The children, who had previously lived in areas controlled by Boko Haram, were held in a military barracks in the north-eastern city of Maiduguri, a UN spokesman told the BBC. Details of the children's ages and the length of their detention have not been given. The army has not made any comment. Human rights groups argue that there is no proper legal process for civilians, including children, who are detained by the army as part of their counter-insurgency operations. "We fear that there are still kids who are being at least temporarily detained because they are being released from Boko Haram areas by the army but then kept for a while," Manuel Fontaine, regional director for the UN Children's fund (Unicef) in Central and West Africa, told Reuters news agency. Earlier this month, 21 of the more than 200 Chibok schoolgirls abducted by Boko Haram two years ago were released and reunited with their families. Nigeria has been fighting a seven-year insurgency against Boko Haram, with the army retaking much of the territory under the Islamist militants' control in the past 20 months.
- Summary
Boko Haram last December released many abducted students, in return for a control over the the north-eastern region of Maiduguri. Manuel Fontain told Reuters that there are many more children who are still abducted, and he expressed strong determination for the release of the children with official statements from Unicef. Nigeria has been fighting a ten-year war against Boko Haram.
- NLI Consistency Score:
0.0002
- UniEval Factuality Score:
0.2691
We threshold each raw score to obtain a binary label of good / bad for each (document, summary) pair. The final performance metric, reported separately for each test, is the fraction of data points labelled good. For instance, if 100 data points are provided, of which 53 have good NLI consistency scores and 49 have good UniEval factuality scores, the final performance score would be 0.53 for the NLI consistency test and 0.49 for the UniEval factuality test.
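The thresholding and rate calculation can be sketched as follows. The threshold value of 0.5, the use of >= for the comparison, and the function name are assumptions made for illustration; this document does not state the actual threshold.

```python
def pass_rate(scores, threshold=0.5):
    """Fraction of raw scores labelled good.

    Assumption: a score is 'good' when it is at or above the threshold;
    0.5 is a placeholder, not the threshold used by the test.
    """
    if not scores:
        raise ValueError("at least one score is required")
    good = sum(1 for s in scores if s >= threshold)
    return good / len(scores)


# Raw scores from Example 1 and Example 2 above.
nli_scores = [0.9813, 0.0002]
unieval_scores = [0.9777, 0.2691]

print(pass_rate(nli_scores))      # prints 0.5
print(pass_rate(unieval_scores))  # prints 0.5
```

Each test is thresholded and averaged on its own, so the two reported rates can differ even when they are computed over the same set of (document, summary) pairs.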