bug: PII Filter Incorrectly Masks the Word 'individual' as Sensitive Data #818

Open
3 of 4 tasks
mohilmakwana3107 opened this issue Oct 23, 2024 · 6 comments · May be fixed by #845
Assignees
Labels
enhancement New feature or request status: waiting confirmation Issue is waiting confirmation whether the proposed solution/workaround works.

Comments

@mohilmakwana3107

mohilmakwana3107 commented Oct 23, 2024

Did you check docs and existing issues?

  • I have read all the NeMo-Guardrails docs
  • I have updated the package to the latest version before submitting this issue
  • I have searched the existing issues of NeMo-Guardrails
  • (optional) I have used the develop branch

Python version (python --version)

Python 3.11.0

Operating system/version

Ubuntu 20.04.6 LTS

NeMo-Guardrails version (if you must use a specific version and not the latest)

0.10.1

Describe the bug

I am currently testing the PII filter functionality for a project. Initially, everything was working fine. However, I recently noticed that the PII filter is masking the word "individual" with X (as per my code's configuration to mask sensitive data with the X character).

I reviewed the logs, but I couldn't find any specific information to explain why the word "individual" is being masked as sensitive data.

YAML Configuration

Below is the YAML configuration I'm using:

models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

instructions:
  - type: general
    content: |
      You are a helpful assistant that can answer given questions.

rails:
  config: 
    jailbreak_detection:
      length_per_perplexity_threshold: 89.79
      prefix_suffix_perplexity_threshold: 1845.65   
    sensitive_data_detection:
      input:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR
      output:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR

  input:
    flows:
      - jailbreak detection heuristics
      - self check input
      - mask sensitive data on input
      - user query

  output:
    flows:
      - self check output
      - mask sensitive data on output

  dialog:
    single_call:
      enabled: False

prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with the policy for talking with the AI Enterprise bot.
      Policy for the user messages:      
      
      - should not contain hateful speech
      - Should not contain armed weapons related information.
      - Should not talk about cooking related information.

      Treat the above conditions as strict rules. If any of them are met, you should block the user input by saying "yes".
      
      User message: "{{ user_input }}"

      Question: Should the user message be blocked (Yes or No)?
      Answer:

  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with the policy.

      Policies for the bot:     
      
      - message should not ask the bot to impersonate someone
      - message should not ask the bot to impersonate someone in a sexual manner.
      - message Should not contain armed weapons related information.
      - message Should not talk about cooking related information.
      - if a message is a refusal, it should be polite

      Bot message: "{{ bot_response }}"

      Question: Should the message be blocked (Yes or No)?
      Answer:

I also checked the RAG output, but there doesn't seem to be any issue there.

Version Specifications

Name Version
presidio_analyzer 2.2.355
presidio_analyzer 2.2.355

Let me know if you need further details or clarification!

Steps To Reproduce

Steps to Reproduce:

  1. YAML Configuration:
    Use the attached YAML config for PII filtering.

  2. PDF Creation:
    Create and ingest a PDF into PG Vector containing sample PII data:

Record 1
• Name: John A. Doe
• Address: 123 Elm Street, Springfield, IL 62704
• Phone: (217) 555-0123
• Social Security Number: 123-45-6789
• Email: [email protected]
• Date of Birth: 05/12/1980

Record 2
• Name: Emily D. Davis
• Address: 321 Birch Lane, Phoenix, AZ 85003
• Phone: (602) 555-0123
• Social Security Number: 321-54-9876
• Email: [email protected]
• Date of Birth: 07/30/1985
  3. Run Test:
    Use the PII filter and observe that the word "individual" is incorrectly masked with X.

  4. RAG Setup:

    • LLM: GPT-4o
    • RAG: llama-index

Note: The PII data is fictional, generated using LLM.

Expected Behavior

The PII filter should not mask the word "individual".

RAG response :
Image


Logs :

Image

Image

Actual Behavior

Answer from NeMo-Guardrails :
Image

@mohilmakwana3107 mohilmakwana3107 added the bug Something isn't working label Oct 23, 2024
@Pouyanpi Pouyanpi self-assigned this Oct 23, 2024
@mohilmakwana3107
Author

Hi @Pouyanpi
Is there any update on this?

@Pouyanpi
Collaborator

Pouyanpi commented Nov 6, 2024

Hi @mohilmakwana3107, sorry for getting back to you late.

I can confirm this issue. Note that this behavior is expected in general.

Presidio tags the word "individual" with type IN_PAN, but with a low score of 0.05.

So we should set a proper threshold in the AnalyzerEngine, i.e. default_score_threshold.

I'll open a PR to fix this bug soon.
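
To illustrate why a score threshold resolves this, here is a minimal sketch in plain Python. It does not call Presidio or NeMo-Guardrails: the `Detection` class, the helper names, the 0.75 phone-number score, and the 0.4 threshold are all hypothetical and chosen only for illustration. Only the IN_PAN score of 0.05 comes from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for one recognizer result (not Presidio's own class)."""
    entity_type: str
    start: int
    end: int
    score: float


def filter_by_score(detections, threshold):
    """Keep only detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d.score >= threshold]


def mask(text, detections, mask_char="X"):
    """Replace every detected span with `mask_char`, preserving the text length."""
    chars = list(text)
    for d in detections:
        chars[d.start:d.end] = mask_char * (d.end - d.start)
    return "".join(chars)


text = "Each individual can be reached at (217) 555-0123."
word_start = text.index("individual")
phone_start = text.index("(217)")

detections = [
    # Low-confidence hit on an ordinary word, as reported in this issue.
    Detection("IN_PAN", word_start, word_start + len("individual"), 0.05),
    # Illustrative score for a genuine phone number (assumed value).
    Detection("PHONE_NUMBER", phone_start, phone_start + len("(217) 555-0123"), 0.75),
]

# Without a threshold, both spans are masked.
unfiltered = mask(text, detections)

# With a threshold (0.4 is an assumed value), the 0.05 hit is dropped.
filtered = mask(text, filter_by_score(detections, threshold=0.4))

print(unfiltered)
print(filtered)
```

With no threshold, the word "individual" is masked along with the phone number. Once low-scoring detections are filtered out, only the phone number is masked. The right threshold value depends on the recognizers in use and would need to be tuned.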

@Pouyanpi Pouyanpi added good first issue Good for newcomers status: help wanted Issues where external contributions are encouraged. labels Nov 7, 2024
@Pouyanpi Pouyanpi linked a pull request Nov 7, 2024 that will close this issue
4 tasks
@Pouyanpi
Collaborator

Pouyanpi commented Nov 7, 2024

@mohilmakwana3107, A draft PR is available. Feel free to test it on your side.

✏ Anyone who is willing to contribute can have a look at the draft PR and continue from there.

@Pouyanpi Pouyanpi added enhancement New feature or request and removed bug Something isn't working labels Nov 7, 2024
@mohilmakwana3107
Author

Thank you, @Pouyanpi, for jumping on this issue and providing a fix so quickly! I really appreciate your help and the detailed explanation. I’ll check out the draft PR on my end. Thanks again for your support!

@Pouyanpi Pouyanpi added status: waiting confirmation Issue is waiting confirmation whether the proposed solution/workaround works. and removed good first issue Good for newcomers status: help wanted Issues where external contributions are encouraged. labels Nov 18, 2024
@Pouyanpi
Collaborator

Pouyanpi commented Dec 3, 2024

@mohilmakwana3107 did you find time to verify the fix?

If you or anyone else is interested in verifying it:

To test the PR locally:

  1. Clone the repository:
    git clone https://github.com/NVIDIA/NeMo-Guardrails.git
  2. Navigate to the repository directory:
    cd NeMo-Guardrails
  3. Fetch the pull request:
    git fetch origin pull/845/head:pr-845
  4. Check out the PR branch:
    git checkout pr-845
  5. Install dependencies and activate the virtual environment:
    poetry install && poetry shell

@mohilmakwana3107
Author

@Pouyanpi
Thank you so much for the clarification.
I will test it on my end and will let you know.
