
Neither Private Nor Fair: Impact of Data Imbalance on Utility and Fairness in Differential Privacy (CCS'20 Privacy-Preserving ML in Practice Workshop)

The paper studies how differential privacy (specifically DPSGD, introduced by Abadi et al.) impacts model performance for underrepresented groups. We examine how different levels of imbalance in the data affect the accuracy and fairness of the model's decisions under different levels of privacy, and demonstrate that even small imbalances and loose privacy guarantees can cause disparate impacts.
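
The fairness effect the paper measures can be summarized as a per-group accuracy gap. As a concrete illustration (not code from this repo), a minimal sketch with hypothetical prediction, label, and group arrays:

    import numpy as np

    def accuracy_gap(preds, labels, group):
        """Difference between majority-group (group == 0) and
        minority-group (group == 1) accuracy; a larger gap means a
        larger disparate impact. All inputs are hypothetical arrays."""
        preds, labels, group = map(np.asarray, (preds, labels, group))
        acc_major = (preds[group == 0] == labels[group == 0]).mean()
        acc_minor = (preds[group == 1] == labels[group == 1]).mean()
        return acc_major - acc_minor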

Usage

Configure the environment by running: pip install -r requirements.txt
We use Python 3.7 and an Nvidia Titan X GPU.
The file playing.py is the entry point for the code; it reads utils/params.yaml to set the parameters from the paper and logs the training graph to TensorBoard.
For sentiment prediction, use playing_nlp.py.
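
Assuming the scripts pick up utils/params.yaml automatically (as described above) and TensorBoard logs land in a default directory, a typical run might look like the following; the logdir is a guess, so check the repo's output path:

    pip install -r requirements.txt
    python playing.py              # image tasks; reads utils/params.yaml
    tensorboard --logdir=runs      # logdir is an assumption, not confirmed by the repo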

Datasets

  1. MNIST (part of PyTorch; see the loading sketch after this list)
  2. Diversity in Faces (obtained from IBM)
  3. iNaturalist (downloaded separately)
  4. UTKFace (downloaded separately)
  5. AAE Twitter corpus (downloaded separately)
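
Since MNIST ships with PyTorch, a minimal loading sketch using torchvision (the normalization constants and batch size are illustrative, not necessarily the values in utils/params.yaml):

    import torch
    from torchvision import datasets, transforms

    # Standard MNIST normalization constants; illustrative only.
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,)),
    ])

    train_set = datasets.MNIST("./data", train=True, download=True, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)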

Code Sources

We use compute_dp_sgd_privacy.py, copied from the public TensorFlow Privacy repository.
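
If the local copy matches the TF Privacy original, it exposes a compute_dp_sgd_privacy(...) function that converts the training hyperparameters into an (epsilon, delta) guarantee. A usage sketch with illustrative hyperparameter values (the import assumes the script is on the import path):

    # Assumes the copied script keeps the TF Privacy signature:
    # compute_dp_sgd_privacy(n, batch_size, noise_multiplier, epochs, delta)
    # -> (epsilon, optimal RDP order).
    from compute_dp_sgd_privacy import compute_dp_sgd_privacy

    eps, opt_order = compute_dp_sgd_privacy(
        n=60000,               # training-set size (MNIST, as an example)
        batch_size=256,
        noise_multiplier=1.1,  # sigma of the Gaussian noise in DPSGD
        epochs=60,
        delta=1e-5,
    )
    print(f"DPSGD satisfies ({eps:.2f}, 1e-5)-DP for these settings")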

The DP-FedAvg implementation is taken from a public repo.

The implementation of DPSGD is based on the TF Privacy repo and its accompanying papers.
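
For orientation, the core DPSGD step from those papers clips each example's gradient and adds Gaussian noise before the parameter update. A minimal PyTorch sketch of one update, illustrative rather than this repo's actual implementation:

    import torch

    def dpsgd_step(model, loss_fn, xs, ys, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
        """One DPSGD update: per-example gradients are clipped to clip_norm,
        summed, perturbed with Gaussian noise of std noise_multiplier * clip_norm,
        then averaged over the batch (Abadi et al.'s recipe)."""
        params = [p for p in model.parameters() if p.requires_grad]
        summed = [torch.zeros_like(p) for p in params]
        for x, y in zip(xs, ys):  # explicit per-example loop: clear but slow
            model.zero_grad()
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
            grads = [p.grad.detach().clone() for p in params]
            total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
            scale = (clip_norm / (total_norm + 1e-12)).clamp(max=1.0)
            for s, g in zip(summed, grads):
                s.add_(g * scale)
        with torch.no_grad():
            for p, s in zip(params, summed):
                noise = torch.randn_like(s) * noise_multiplier * clip_norm
                p.add_(-(lr / len(xs)) * (s + noise))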

Paper

https://arxiv.org/pdf/2009.06389.pdf

Citation

@article{farrand2020neither,
  title={Neither Private Nor Fair: Impact of Data Imbalance on Utility and Fairness in Differential Privacy},
  author={Farrand, Tom and Mireshghallah, Fatemehsadat and Singh, Sahib and Trask, Andrew},
  journal={arXiv preprint arXiv:2009.06389},
  year={2020}
}
