bug: PII Filter Incorrectly Masks the Word 'individual' as Sensitive Data #818

mohilmakwana3107 · 2024-10-23T10:53:36Z

Did you check docs and existing issues?

I have read all the NeMo-Guardrails docs
I have updated the package to the latest version before submitting this issue
I have searched the existing issues of NeMo-Guardrails
(optional) I have used the develop branch

Python version (python --version)

Python 3.11.0

Operating system/version

Ubuntu 20.04.6 LTS

NeMo-Guardrails version (if you must use a specific version and not the latest)

0.10.1

Describe the bug

I am currently testing the PII filter functionality for a project. Initially, everything was working fine. However, I recently noticed that the PII filter is masking the word "individual" with X (as per my code's configuration to mask sensitive data with the X character).

I reviewed the logs, but I couldn't find any specific information to explain why the word "individual" is being masked as sensitive data.

YAML Configuration

Below is the YAML configuration I'm using:

models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

instructions:
  - type: general
    content: |
      You are a helpful assistant that can answer given questions.

rails:
  config: 
    jailbreak_detection:
      length_per_perplexity_threshold: 89.79
      prefix_suffix_perplexity_threshold: 1845.65   
    sensitive_data_detection:
      input:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR
      output:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR

  input:
    flows:
      - jailbreak detection heuristics
      - self check input
      - mask sensitive data on input
      - user query

  output:
    flows:
      - self check output
      - mask sensitive data on output

  dialog:
    single_call:
      enabled: False

prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with the policy for talking with the AI Enterprise bot.
      Policy for the user messages:      
      
      - should not contain hateful speech
      - Should not contain armed weapons related information.
      - Should not talk about cooking related information.

      Treat the above conditions as strict rules. If any of them are met, you should block the user input by saying "yes".
      
      User message: "{{ user_input }}"

      Question: Should the user message be blocked (Yes or No)?
      Answer:

  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with the policy.

      Policies for the bot:     
      
      - message should not ask the bot to impersonate someone
      - message should not ask the bot to impersonate someone in a sexual manner.
      - message Should not contain armed weapons related information.
      - message Should not talk about cooking related information.
      - if a message is a refusal, it should be polite

      Bot message: "{{ bot_response }}"

      Question: Should the message be blocked (Yes or No)?
      Answer:

I also checked the RAG output, but there doesn't seem to be any issue there.

Version Specifications

Name	Version
presidio_analyzer	2.2.355
presidio_analyzer	2.2.355

Let me know if you need further details or clarification!

Steps To Reproduce

Steps to Reproduce:

1. YAML Configuration: Use the attached YAML config for PII filtering.
2. PDF Creation: Create and ingest a PDF into PG Vector containing sample PII data:

   Record 1
   • Name: John A. Doe
   • Address: 123 Elm Street, Springfield, IL 62704
   • Phone: (217) 555-0123
   • Social Security Number: 123-45-6789
   • Email: [email protected]
   • Date of Birth: 05/12/1980

   Record 2
   • Name: Emily D. Davis
   • Address: 321 Birch Lane, Phoenix, AZ 85003
   • Phone: (602) 555-0123
   • Social Security Number: 321-54-9876
   • Email: [email protected]
   • Date of Birth: 07/30/1985

3. Run Test: Use the PII filter and observe that the word "individual" is incorrectly masked with X.
4. RAG Setup:
   - LLM: GPT-4o
   - RAG: llama-index
Note: The PII data is fictional, generated using LLM.

Expected Behavior

The word "individual" should not be masked.

RAG response :

Logs :

Actual Behavior

Answer from NeMo-Guardrails :


mohilmakwana3107 · 2024-10-25T04:59:08Z

Hi @Pouyanpi
Is there any update on this?

Pouyanpi · 2024-11-06T22:44:57Z

Hi @mohilmakwana3107 , sorry for getting back to you late.

I can confirm this issue. Note that, in general, this behavior is expected.

Presidio tags the word "individual" with type IN_PAN, but with a low score of 0.05.

So we should set a proper threshold in the AnalyzerEngine, i.e., default_score_threshold.

I'll open a PR to fix this bug soon.
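To illustrate the idea behind a score threshold, here is a minimal, standard-library-only sketch. It does not call Presidio or NeMo-Guardrails: the `Detection` class and the `keep_confident` and `mask` helpers are hypothetical stand-ins, and the 0.75 phone-number score and the 0.4 threshold are illustrative values. Only the IN_PAN score of 0.05 for "individual" comes from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for an analyzer result (not Presidio's own class)."""
    entity_type: str
    start: int
    end: int
    score: float


def keep_confident(detections, threshold):
    """Drop detections whose score is below the threshold."""
    return [d for d in detections if d.score >= threshold]


def mask(text, detections, char="X"):
    """Replace each detected span with the mask character, preserving length."""
    chars = list(text)
    for d in detections:
        chars[d.start:d.end] = [char] * (d.end - d.start)
    return "".join(chars)


text = "Each individual can call 2175550123."
word_start = text.index("individual")
phone_start = text.index("2175550123")

detections = [
    # Low-confidence match, as reported in this thread (IN_PAN, score 0.05).
    Detection("IN_PAN", word_start, word_start + len("individual"), 0.05),
    # Illustrative high-confidence match; the 0.75 score is an assumption.
    Detection("PHONE_NUMBER", phone_start, phone_start + len("2175550123"), 0.75),
]

# With no threshold, every detection is masked, including "individual".
unfiltered = mask(text, keep_confident(detections, 0.0))
# With an (assumed) threshold of 0.4, the 0.05 match is dropped first.
filtered = mask(text, keep_confident(detections, 0.4))

print(unfiltered)
print(filtered)
```

With the threshold applied, "individual" stays readable while the phone number is still masked. Choosing the actual threshold is a trade-off: too high and real PII with modest scores may slip through.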

Pouyanpi · 2024-11-07T11:43:05Z

@mohilmakwana3107, a draft PR is available. Feel free to test it on your side.

✏ Anyone who is willing to contribute can have a look at the draft PR and continue from there.

mohilmakwana3107 · 2024-11-08T11:04:38Z

Thank you, @Pouyanpi, for jumping on this issue and providing a fix so quickly! I really appreciate your help and the detailed explanation. I’ll check out the draft PR on my end. Thanks again for your support!

Pouyanpi · 2024-12-03T08:36:21Z

@mohilmakwana3107 did you find time to verify the fix?

If you or anyone else is interested in verifying it:

To test the PR locally:

Clone the repository:

git clone https://github.com/NVIDIA/NeMo-Guardrails.git

Navigate to the repository directory:
```
cd NeMo-Guardrails
```
Fetch the pull request:
```
git fetch origin pull/845/head:pr-845
```
Check out the PR branch:
```
git checkout pr-845
```
Install dependencies and activate the virtual environment:
```
poetry install && poetry shell
```

mohilmakwana3107 · 2024-12-03T09:01:29Z

@Pouyanpi
Thank you so much for the clarification.
I will test it on my end and will let you know.

mohilmakwana3107 added the "bug" (Something isn't working) label on Oct 23, 2024

Pouyanpi self-assigned this on Oct 23, 2024

Pouyanpi added the "good first issue" (Good for newcomers) and "status: help wanted" (Issues where external contributions are encouraged.) labels on Nov 7, 2024

Pouyanpi linked a pull request on Nov 7, 2024 that will close this issue: "feat: add score threshold to AnalyzerEngine" #845 (Open, 4 tasks)

Pouyanpi added the "enhancement" (New feature or request) label and removed the "bug" (Something isn't working) label on Nov 7, 2024

Pouyanpi added the "status: waiting confirmation" (Issue is waiting confirmation whether the proposed solution/workaround works.) label and removed the "good first issue" (Good for newcomers) and "status: help wanted" (Issues where external contributions are encouraged.) labels on Nov 18, 2024
