PII Detection and Data Masking
Want to ensure that (1) users don't enter Personally Identifiable Information (PII) into the chatbot and (2) PII is excluded from your knowledge sources? On the kapa.ai platform you can now enable PII detection and data masking so that PII is never stored or shared.
(1) PII Detection in User Messages
If PII is detected in a message, the chatbot does not generate an answer. Instead, it notifies the user that their message contains PII and suggests that they try again.
Features
- Real-time PII Scanning: As soon as a message is received, it's scanned for any PII.
- No Data Storage: If PII is detected, not only is the message rejected, but it is also not stored, ensuring user data privacy.
- Support for Multiple PII Types: kapa.ai can detect a wide range of PII types.
- Integration with API, Slack, Discord and more: We've extended our PII detection capabilities to support all integration types.
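The behavior described above (scan on receipt, reject, and do not store) can be sketched as follows. This is a minimal illustration, not kapa.ai's implementation: the regular expressions, the `Chatbot` class, the `detect_pii` helper, and the type names are all hypothetical, and real PII detectors are considerably more robust than these patterns.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, deliberately simple patterns (assumption: not kapa.ai's detectors).
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "CREDIT_CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def detect_pii(text, enabled_types):
    """Return the sorted names of the enabled PII types found in the text."""
    return sorted(t for t in enabled_types if PII_PATTERNS[t].search(text))


@dataclass
class Chatbot:
    """Hypothetical message handler that scans before answering or storing."""
    enabled_types: set
    stored_messages: list = field(default_factory=list)

    def handle(self, message):
        found = detect_pii(message, self.enabled_types)
        if found:
            # Reject the message: no answer is generated and nothing is stored.
            return (
                "Your message appears to contain PII ("
                + ", ".join(found)
                + "). Please remove it and try again."
            )
        # Only messages without detected PII are stored and answered.
        self.stored_messages.append(message)
        return "ANSWER: ..."  # placeholder for the generated answer


bot = Chatbot(enabled_types={"EMAIL_ADDRESS", "PHONE_NUMBER"})
rejected = bot.handle("Contact me at jane@example.com")
accepted = bot.handle("How do I configure a source?")
```

In this sketch the scan runs before any storage step, so a rejected message never reaches `stored_messages`.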
Enabling PII Detection
To enable the PII detection feature:
- Reach out to the kapa team.
- Specify which types of PII you'd like to detect.
- The kapa team will configure the feature for you.
(2) PII Masking of Knowledge Sources
If PII is detected in the content of a document that is crawled by kapa, users have the option to anonymize that information. This is configurable at the source level.
Features
- No Data Storage: kapa.ai will "mask" detected PII with an anonymized label. That way sensitive data will never be visible, which also ensures that this information is never directly mentioned in an answer generated by kapa.
- Support for Multiple PII Types: kapa.ai can detect a wide range of PII types.
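Masking with an anonymized label can be illustrated with a short sketch. The label format (`<EMAIL_ADDRESS>`), the patterns, and the `mask_pii` function are assumptions made for this example; the actual labels and detection logic used by kapa.ai are not specified here.

```python
import re

# Hypothetical patterns for two PII types (assumption: illustrative only).
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def mask_pii(text, enabled_types):
    """Replace each match of an enabled PII type with a label such as <EMAIL_ADDRESS>."""
    for pii_type in sorted(enabled_types):
        text = PII_PATTERNS[pii_type].sub(f"<{pii_type}>", text)
    return text


document = "Mail admin@example.com from 10.0.0.1"
masked_all = mask_pii(document, {"EMAIL_ADDRESS", "IP_ADDRESS"})
masked_email_only = mask_pii(document, {"EMAIL_ADDRESS"})
```

Because only the masked text would be kept in this sketch, the original value cannot surface in a generated answer; types that are not enabled for the source are left untouched.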
Enabling PII Masking
To enable the PII Masking feature for a specific source:
- Navigate to the Sources screen on the kapa.ai dashboard.
- Click on the three-dot menu at the far right of a given source, then click Configure.
- Specify which types of PII you'd like to mask.
- Save your changes.
Note that PII masking settings only take effect once the source is refreshed.
FAQ: Supported PII Types
You have the flexibility to enable or disable specific types of PII detection as per your requirements.
Recommended PII Types
- Phone Numbers
- Credit Cards
- Email Addresses
Other Supported PII Types
- IBAN Code
- IP Address
- US Bank Number
- US Driver License
- US ITIN
- US Passport
- US SSN
- Location
- Person
- Medical License
- URL
- Nationality, religious or political group
- Bitcoin Addresses
- Date and Time
Scanning for URLs can be a part of PII detection, but it's optional. Depending on your product's use case, you might want to disable this option. For instance, if your users often share legitimate URLs as part of their queries, disabling this might be beneficial.
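The effect of toggling a single PII type can be sketched as follows. The detector names, patterns, and the `is_rejected` helper are hypothetical and only illustrate why disabling URL scanning may suit products whose users routinely paste links.

```python
import re

# Hypothetical detectors keyed by PII type (assumption: illustrative patterns only).
DETECTORS = {
    "URL": re.compile(r"https?://\S+"),
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def is_rejected(message, enabled_types):
    """Return True if any enabled detector matches the message."""
    return any(DETECTORS[t].search(message) for t in enabled_types)


question = "Why does https://docs.example.com/setup return a 404?"

# With URL scanning enabled, a legitimate question containing a link is rejected.
with_url_scanning = is_rejected(question, {"URL", "EMAIL_ADDRESS"})

# With URL scanning disabled, the same question passes while emails are still caught.
without_url_scanning = is_rejected(question, {"EMAIL_ADDRESS"})
```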