
Hallucination Policies

Overview

Hallucination policies enable the detection of hallucinated model responses for summarization and RAG systems.

  • Response Summarization Consistency: Evaluate and detect if the model’s response summaries are consistent with user-provided source text in the prompt input. Can be used with non-RAG systems.
  • Input Relevance: Evaluate and detect if the user-provided prompt inputs are aligned with the retrieved context for RAG systems. Inputs that are not related to the context will be categorized as “off topic”.
  • Response Relevance: Evaluate the quality of the response with respect to the (retrieved) context and query. It combines two submetrics:
    • Response Faithfulness: Evaluate and detect whether the model's response is faithful to, and adheres to, the provided context for RAG systems.
    • Response Relevance: Evaluate and detect whether the model's response is aligned with the user-provided prompt for RAG and non-RAG systems.
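To make the two submetrics concrete, the toy sketch below scores a response against both the retrieved context (faithfulness) and the user prompt (relevance) using simple word overlap. This is only an illustration of what each submetric measures; the function names, the overlap heuristic, and the threshold are assumptions for demonstration, not the product's actual detection method.

```python
def token_overlap(a: str, b: str) -> float:
    """Fraction of words in `a` that also appear in `b` (case-insensitive)."""
    a_tokens = set(a.lower().split())
    b_tokens = set(b.lower().split())
    if not a_tokens:
        return 0.0
    return len(a_tokens & b_tokens) / len(a_tokens)

def evaluate_response(prompt: str, context: str, response: str,
                      threshold: float = 0.5) -> dict:
    """Toy illustration of the two submetrics (names/threshold are hypothetical)."""
    # Response Faithfulness: does the response adhere to the retrieved context?
    faithfulness = token_overlap(response, context)
    # Response Relevance: is the response aligned with the user's prompt?
    relevance = token_overlap(response, prompt)
    return {
        "faithfulness": faithfulness,
        "relevance": relevance,
        # A low faithfulness score suggests hallucinated content.
        "hallucination_flagged": faithfulness < threshold,
    }
```

A real detector would use a learned model rather than word overlap, but the inputs and outputs shown here mirror the structure of the metrics described above.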

Hallucination Policy Actions

You can control what happens to outputs that violate the hallucination policies with the actions below:

  • Flag: flag content for moderator review
  • Block: block user inputs or model outputs containing hallucinated content
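A minimal sketch of how a detection result might be routed through these two actions. The function and field names here are hypothetical, assumed only for illustration; the actual policy configuration is managed through the product's own interface.

```python
def apply_policy_action(hallucination_flagged: bool, action: str) -> str:
    """Route a hallucination detection result through a policy action.

    `action` is either "flag" (queue for moderator review) or "block"
    (reject the content). Names are illustrative, not the product API.
    """
    if not hallucination_flagged:
        return "pass"            # no violation detected; deliver as-is
    if action == "flag":
        return "flagged"         # deliver, but queue for moderator review
    if action == "block":
        return "blocked"         # withhold the hallucinated output
    raise ValueError(f"unknown policy action: {action}")
```

The key design point is that "flag" is non-disruptive (content still flows, with review), while "block" stops the input or output outright.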