PII Detection

Overview

PII detection policies detect, redact, or sanitize sensitive Personally Identifiable Information (PII) in user-provided inputs and model responses. These policies help protect against the leakage of sensitive information and can help your organization comply with regulations such as GDPR.

PII Entities

Personally Identifiable Information (PII) is information that can uniquely identify a specific individual or organization. DynamoGuard currently supports the following PII entities by default. Note: this list can be extended with custom regex entities (see below).

| Class Name | Description | Example(s) |
|---|---|---|
| CREDIT_CARD | Credit card information, including credit card number, expiration date, and CVV. | "6504 8764 7593 8248" |
| EMAIL_ADDRESS | Any email address to which email can be delivered. | "[email protected]", "jane_doe [at] org [dot] com" |
| IBAN_CODE | International Bank Account Number (IBAN). | "FR650154264610QJGP3UHAJDJ02" |
| LOC | Location reference, including full and partial street addresses, city, state, and country names, coordinates, and landmarks. | "The United States", "Central Park", "123 Main St", "JFK" |
| ORG | Name of an organization, including companies and institutions. | "OpenAI", "OPEC", "SEC" |
| PASSPORT | Passport number issued by any country. | "604876475", "Q24219489" |
| PERSON | Person’s full or partial name, including titles. | "Eric", "Jane Doe", "Parker" |
| PHONE_NUMBER | Telephone or fax numbers. | "961-770-7727" |
| US_SSN | US Social Security Numbers. | "865-50-6891" |

PII Policy Actions

You can control what happens to inputs and outputs when PII is detected using the actions below:

  • Flag: flag content containing PII for moderator review, without blocking or modifying the data.
  • Block: block content containing PII.
  • Redaction: redact PII before passing user inputs to the model or before displaying outputs to users.
    • example:
      • input: ‘My name is John Doe’
      • redacted input (passed to model): ‘My name is <PERSON>’
  • Sanitization: anonymize PII before passing user inputs to the model and de-anonymize PII before displaying outputs to users.
    • example:
      • input: ‘My name is John Doe and my brother’s name is Jacob Doe’
      • sanitized input (passed to model): ‘My name is <PERSON-1> and my brother’s name is <PERSON-2>’
      • raw model response: ‘Hello <PERSON-1>, I hope you and <PERSON-2> are doing well’
      • de-sanitized response (given to user): ‘Hello John Doe, I hope you and Jacob Doe are doing well’
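
The difference between redaction and sanitization can be sketched in a few lines of Python. This is an illustration only, not DynamoGuard's implementation or API: the function names (`redact`, `sanitize`, `desanitize`, `span_of`) are hypothetical, and the sketch assumes detection has already produced non-overlapping character spans for each entity. Redaction discards the original value, while sanitization keeps a placeholder-to-value mapping so the model response can be restored.

```python
from typing import Dict, List, Tuple

# Hypothetical span type: (start, end, entity label), end exclusive.
Span = Tuple[int, int, str]


def span_of(text: str, value: str, label: str) -> Span:
    """Helper for the demo: locate the first occurrence of a known value."""
    start = text.index(value)
    return (start, start + len(value), label)


def redact(text: str, spans: List[Span]) -> str:
    """Replace each detected span with its entity label, e.g. <PERSON>."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"<{label}>" + text[end:]
    return text


def sanitize(text: str, spans: List[Span]) -> Tuple[str, Dict[str, str]]:
    """Replace spans with numbered placeholders and return the mapping."""
    mapping: Dict[str, str] = {}            # placeholder -> original value
    seen: Dict[Tuple[str, str], str] = {}   # (label, value) -> placeholder
    counters: Dict[str, int] = {}           # label -> last number used
    pieces: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        value = text[start:end]
        key = (label, value)
        if key not in seen:
            # First time this value appears: allocate the next placeholder.
            counters[label] = counters.get(label, 0) + 1
            seen[key] = f"<{label}-{counters[label]}>"
            mapping[seen[key]] = value
        pieces.append(text[cursor:start])
        pieces.append(seen[key])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


def desanitize(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back into a model response."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


# Redaction example from the list above.
short = "My name is John Doe"
redacted = redact(short, [span_of(short, "John Doe", "PERSON")])

# Sanitization round trip from the list above.
user_input = "My name is John Doe and my brother's name is Jacob Doe"
spans = [
    span_of(user_input, "John Doe", "PERSON"),
    span_of(user_input, "Jacob Doe", "PERSON"),
]
sanitized, mapping = sanitize(user_input, spans)
raw_response = "Hello <PERSON-1>, I hope you and <PERSON-2> are doing well"
restored = desanitize(raw_response, mapping)
```

In this sketch a repeated value reuses its placeholder, so the same name always maps to the same `<PERSON-n>` token within one request. Whether DynamoGuard behaves the same way is not stated here.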

Custom Regex PII

Extend DynamoGuard's default PII detection policy with custom regular expressions. To add a PII entity, provide a name for the entity and the regex used to detect it.
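
As a rough sketch of what a name-plus-regex entity amounts to, the Python below pairs a hypothetical entity name with a pattern and reports where it matches. The entity name `EMPLOYEE_ID`, its pattern, and the `find_custom_pii` function are assumptions made for illustration; they are not part of DynamoGuard, and the way a custom entity is actually registered in the product may differ.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical custom entities: entity name -> regular expression.
CUSTOM_ENTITIES: Dict[str, str] = {
    "EMPLOYEE_ID": r"\bEMP-\d{6}\b",  # e.g. EMP-004217 (made-up format)
}


def find_custom_pii(text: str, entities: Dict[str, str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, entity name) for every regex match in the text."""
    spans = []
    for name, pattern in entities.items():
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), name))
    return sorted(spans)


sample = "Ticket opened by EMP-004217 about a badge issue."
matches = find_custom_pii(sample, CUSTOM_ENTITIES)
matched_text = [sample[start:end] for start, end, _ in matches]
```

Anchoring the pattern with `\b` keeps it from matching inside longer tokens such as `TEMP-0042171`, which is worth considering for any custom entity to limit false positives.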