SyntheticDataFairness

This is the official repository for the paper "The Problem of Fairness in Synthetic Healthcare Data". It contains the code for metrics that evaluate the fairness of synthetic healthcare datasets, and shows how those metrics are applied to quantify fairness.

Repository structure

  • data: This folder contains data for two datasets. Atus is the American Time Use Survey dataset and includes both the derived real and synthetic data files. Mimic is derived from the MIMIC-III dataset, following a past study on the impact of race on mortality, and includes only the synthetic dataset. The synthetic datasets are generated with a Generative Adversarial Network (GAN) model called HealthGAN and are intended not to release any private information from the real datasets.
  • scripts: Code that is shared across multiple files and notebooks, written as functions so that it can be imported.
  • notebooks: Code for plotting figures and calculating metrics on the datasets.
  • results: The results for the log disparity metric on the synthetic datasets, compiled into CSV files.
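
As a rough illustration of the kind of quantity stored in the results folder, here is a minimal sketch of a log disparity calculation for a single subgroup. It assumes the metric is the log of the odds ratio of subgroup membership in the synthetic data versus the real data. That definition is an assumption made for this sketch, and the function name and parameters are hypothetical. Refer to the paper and the code in scripts for the exact definition used in this repository.

```python
import math


def log_disparity(real_count, real_total, synth_count, synth_total):
    """Log odds ratio of subgroup membership, synthetic vs. real.

    Assumed definition for illustration only; not taken from this repository.
    """
    p_real = real_count / real_total      # subgroup share in the real data
    p_synth = synth_count / synth_total   # subgroup share in the synthetic data
    if not (0 < p_real < 1 and 0 < p_synth < 1):
        raise ValueError("subgroup share must be strictly between 0 and 1")
    odds_real = p_real / (1 - p_real)
    odds_synth = p_synth / (1 - p_synth)
    return math.log(odds_synth / odds_real)


# Hypothetical counts: the subgroup is 20% of the real data and 10% of the synthetic data.
value = log_disparity(real_count=200, real_total=1000, synth_count=100, synth_total=1000)
print(f"log disparity: {value:.3f}")
```

Under this assumed definition, a value near zero means the subgroup appears in the synthetic data at about the same rate as in the real data, a negative value means it is under-represented, and a positive value means it is over-represented.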

Data files description

Contacts

For questions, please reach out to Karan Bhanot (bhanok@rpi.edu or bhanotkaran22@gmail.com).
