PII Detection and Data Masking

Want to ensure that (1) users don't enter Personally Identifiable Information (PII) into the chatbot and (2) PII is excluded from your knowledge sources? On the kapa.ai platform you can enable PII detection and data masking so that detected PII is never stored or shared.

(1) PII Detection in User Messages

If PII is detected in a message, the chatbot does not generate an answer. Instead, it notifies the user that their message contains PII and suggests that they try again.

Features

  • Real-time PII Scanning: As soon as a message is received, it's scanned for any PII.
  • No Data Storage: If PII is detected, not only is the message rejected, but it is also not stored, ensuring user data privacy.
  • Support for Multiple PII Types: kapa.ai can detect a wide range of PII types.
  • Integration with API, Slack, Discord and more: We've extended our PII detection capabilities to support all integration types.
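
The flow described above (scan the incoming message, reject it and skip storage if PII is found) can be sketched as follows. This is a minimal illustration, not kapa.ai's implementation: the regular expressions, the `detect_pii` and `handle_message` names, and the shape of the returned dictionary are all assumptions made for the example, and real detectors are considerably more robust than two simple patterns.

```python
import re

# Hypothetical, deliberately simple patterns for two PII types.
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def detect_pii(message: str) -> list[str]:
    """Return the sorted names of the PII types found in the message."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(message)
    )


def handle_message(message: str) -> dict:
    """Scan a message first; reject it without storing it if PII is found."""
    found = detect_pii(message)
    if found:
        # No answer is generated and nothing is persisted.
        return {
            "answered": False,
            "stored": False,
            "notice": "Your message appears to contain PII ("
            + ", ".join(found)
            + "). Please remove it and try again.",
        }
    # Only messages without detected PII continue to answer generation.
    return {"answered": True, "stored": True, "notice": None}
```

The point of the sketch is the ordering: the scan runs before any storage or answer generation, so a rejected message never reaches either step.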

Enabling PII Detection

To enable the PII detection feature:

  1. Reach out to the kapa team.
  2. Specify which types of PII you'd like to detect.
  3. The kapa team will configure the feature for you.

(2) PII Masking of Knowledge Sources

If PII is detected in the content of a document crawled by kapa, you can choose to anonymize that information. This is configurable at the source level.

Features

  • No Data Storage: kapa.ai replaces ("masks") detected PII with an anonymized label. The sensitive data is therefore never visible and is never directly mentioned in an answer generated by kapa.
  • Support for Multiple PII Types: kapa.ai can detect a wide range of PII types.
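
To illustrate what masking with an anonymized label means, here is a minimal sketch. It is not kapa.ai's implementation: the patterns, the `mask_pii` function, and the `<TYPE>` label format are assumptions chosen for the example.

```python
import re

# Hypothetical, deliberately simple patterns for two PII types.
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def mask_pii(text: str, enabled_types: set[str]) -> str:
    """Replace each match of an enabled PII type with a label such as <EMAIL_ADDRESS>."""
    for name, pattern in PII_PATTERNS.items():
        if name in enabled_types:
            text = pattern.sub(f"<{name}>", text)
    return text


doc = "Contact ops@example.com from 10.0.0.12."
print(mask_pii(doc, {"EMAIL_ADDRESS", "IP_ADDRESS"}))
# Contact <EMAIL_ADDRESS> from <IP_ADDRESS>.
```

Because the original value is replaced before the content is indexed, only the label remains available to appear in a generated answer.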

Enabling PII Masking

To enable the PII Masking feature for a specific source:

  1. Navigate to the Sources screen on the kapa.ai dashboard.
  2. Click the three-dot menu at the far right of a given source, then click Configure.
  3. Specify which types of PII you'd like to mask.
  4. Save your changes.

Note that PII masking is only applied once the source is refreshed.

FAQ: Supported PII Types

You can enable or disable detection of specific PII types according to your requirements.

  • Phone Numbers
  • Credit Cards
  • Email Addresses

Other Supported PII Types

  • IBAN Code
  • IP Address
  • US Bank Number
  • US Driver License
  • US ITIN
  • US Passport
  • US SSN
  • Location
  • Person
  • Medical License
  • URL
  • Nationality, religious or political group
  • Bitcoin Addresses
  • Date and Time

URL Scanning

Scanning for URLs can be a part of PII detection, but it's optional. Depending on your product's use case, you might want to disable this option. For instance, if your users often share legitimate URLs as part of their queries, disabling this might be beneficial.
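
The effect of switching the URL type on or off can be sketched as below. This is only an illustration of the trade-off: the `DETECTORS` table, the `find_pii_types` function, and the type names are hypothetical and do not reflect how kapa.ai stores or applies its configuration.

```python
import re

# Hypothetical, deliberately simple patterns for two PII types.
DETECTORS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "URL": re.compile(r"https?://[^\s<>\"']+"),
}


def find_pii_types(message: str, enabled_types: set[str]) -> list[str]:
    """Return the enabled PII types that match the message, sorted by name."""
    return sorted(
        name
        for name in enabled_types
        if name in DETECTORS and DETECTORS[name].search(message)
    )


question = "Why does https://docs.example.com/setup return a 404?"

# With URL scanning enabled, a legitimate support question is flagged.
print(find_pii_types(question, {"EMAIL_ADDRESS", "URL"}))  # ['URL']

# With URL scanning disabled, the same question passes through.
print(find_pii_types(question, {"EMAIL_ADDRESS"}))  # []
```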