Thresholding

Thresholding allows you to fine-tune the behavior of policies in DynamoGuard by adjusting classification thresholds. These thresholds help balance the trade-off between metrics such as recall and false positive rate, enabling you to align policy performance with your enterprise requirements. Note: setting thresholds is optional.
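To make the recall versus false positive rate trade-off concrete, here is a minimal sketch in plain Python. It is not DynamoGuard code: the function name, the scores, and the labels are invented for illustration, and it assumes that content is flagged when its score is greater than the threshold.

```python
def recall_and_fpr(scores, labels, threshold):
    """Return (recall, false positive rate) for one threshold.

    scores: classifier scores (illustrative values, not real DynamoGuard output)
    labels: True if the item is actually non-compliant
    Assumption: an item is flagged when score > threshold.
    """
    flagged = [s > threshold for s in scores]
    true_pos = sum(f and y for f, y in zip(flagged, labels))
    false_pos = sum(f and not y for f, y in zip(flagged, labels))
    positives = sum(labels)
    negatives = len(labels) - positives
    recall = true_pos / positives if positives else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    return recall, fpr


# Made-up evaluation set: six items with hand-assigned ground truth.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

for threshold in (0.2, 0.5, 0.7):
    recall, fpr = recall_and_fpr(scores, labels, threshold)
    print(f"threshold={threshold:.1f} recall={recall:.2f} fpr={fpr:.2f}")
```

On this toy data, lowering the threshold catches every non-compliant item but also flags more compliant ones, while raising it removes the false positives at the cost of missing one non-compliant item.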

Thresholding for Content Policies

For Content Policies, a threshold is the classification score cutoff that determines whether content is flagged or blocked. Adjusting it changes which content is classified as compliant and which as non-compliant.

  • For input content policies, prompts with scores greater than the set threshold are marked as non-compliant
  • For output content policies, responses with scores less than the set threshold are marked as non-compliant
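The two rules above can be sketched as a small helper. This is an illustration only, not the DynamoGuard API: the function name and the `"input"` / `"output"` labels are hypothetical, and the behavior when a score equals the threshold is not specified above, so the sketch treats that case as compliant.

```python
def is_non_compliant(policy_kind: str, score: float, threshold: float) -> bool:
    """Apply the comparison rules stated above for content policies.

    policy_kind: "input" or "output" (hypothetical labels for this sketch).
    Assumption: a score exactly equal to the threshold counts as compliant.
    """
    if policy_kind == "input":
        # Input policies: scores greater than the threshold are non-compliant.
        return score > threshold
    if policy_kind == "output":
        # Output policies: scores less than the threshold are non-compliant.
        return score < threshold
    raise ValueError(f"unknown policy kind: {policy_kind!r}")


print(is_non_compliant("input", 0.9, 0.5))   # prompt scored above the threshold
print(is_non_compliant("output", 0.9, 0.5))  # response scored above the threshold
```

Because the comparison direction differs between the two kinds of policy, the same score and threshold can produce opposite outcomes for a prompt and a response.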

Thresholding for Hallucination Policies

For Hallucination Policies, thresholds are applied to the hallucination scores generated by the evaluation model, where higher scores indicate better compliance. For each metric, the score being thresholded represents the following:

  1. Summarization Consistency: The probability that a response is consistent with the prompt
  2. RAG Hallucination - Input Relevance: The probability that the input is relevant to the context
  3. RAG Hallucination - Response Relevance: The probability that the model response is relevant to the context or input
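As a rough sketch of how such thresholds might be applied, the snippet below checks each metric's score against its threshold. Everything here is an assumption made for illustration: the metric keys, the function name, and the rule that a score below the threshold is flagged (inferred from "higher scores indicate better compliance"). Metrics without a configured threshold are skipped in this sketch because setting thresholds is optional; how DynamoGuard itself treats that case is not stated above.

```python
def failing_metrics(scores: dict, thresholds: dict) -> list:
    """Return the metrics whose score falls below the configured threshold.

    Assumptions (not confirmed by the documentation):
      - a score lower than its threshold is treated as non-compliant
      - a metric with no threshold configured is never flagged
    """
    failing = []
    for metric, score in scores.items():
        threshold = thresholds.get(metric)
        if threshold is not None and score < threshold:
            failing.append(metric)
    return failing


# Hypothetical metric keys and made-up scores for one response.
scores = {
    "summarization_consistency": 0.82,
    "rag_input_relevance": 0.35,
    "rag_response_relevance": 0.91,
}
thresholds = {
    "summarization_consistency": 0.70,
    "rag_input_relevance": 0.50,
    # no threshold configured for rag_response_relevance
}

print(failing_metrics(scores, thresholds))
```

With these made-up values, only the input relevance score (0.35 against a threshold of 0.50) would be flagged.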