## Membership Inference Attacks From First Principles

This directory contains code to reproduce our paper:

**"Membership Inference Attacks From First Principles"** <br>
https://arxiv.org/abs/2112.03570 <br>
by Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramèr.

### INSTALLING

You will need to install a few standard dependencies

`pip install scipy scikit-learn numpy matplotlib`

and also a machine learning framework to train the models. We train our models
with JAX + ObJAX, so you will need to follow the installation instructions for it: <br>
https://github.com/google/objax <br>
https://objax.readthedocs.io/en/latest/installation_setup.html

### RUNNING THE CODE

#### 1. Train the models

The first step in our attack is to train shadow models. As a baseline that
should give most of the gains in our attack, you should start by training 16
shadow models with the command

> bash scripts/train_demo.sh

or if you have multiple GPUs on your machine and want to train these models in
parallel, then modify and run

> bash scripts/train_demo_multigpu.sh

This will train several CIFAR-10 wide ResNet models to ~91% accuracy each, and
will output a bunch of files under the directory exp/cifar10 with structure:

```
exp/cifar10/
- experiment_N_of_16
-- hparams.json
-- keep.npy
-- ckpt/
--- 0000000100.npz
-- tb/
```
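
In the paper's setup, each shadow model is trained on a random half of the
dataset, arranged so that every example is a member of exactly half of the
models. The sketch below shows one way to draw such membership masks. It
assumes that `keep.npy` holds one boolean mask of this kind per experiment,
which this README does not spell out; `sample_keep_masks` and its parameters
are illustrative names, not part of the scripts.

```python
import numpy as np


def sample_keep_masks(num_models, num_examples, pkeep=0.5, seed=0):
    """Return a (num_models, num_examples) boolean array of training membership.

    Hypothetical helper: assumes keep.npy stores one row of such an array.
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(size=(num_models, num_examples))
    # argsort along the model axis turns every column into a random permutation
    # of 0..num_models-1, so exactly `num_in` entries per column are < num_in.
    order = noise.argsort(axis=0)
    num_in = int(pkeep * num_models)
    return order < num_in


masks = sample_keep_masks(num_models=16, num_examples=1000)
per_example_members = masks.sum(axis=0)  # how many models saw each example
```

Balancing the masks this way means every example has the same number of "in"
and "out" shadow models available when the attack is evaluated later.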

#### 2. Perform inference

Once the models are trained, run inference with each model and save its output
features for every training example in the dataset.

> python3 inference.py --logdir=exp/cifar10/

This will add to the experiment directory a new set of files

```
exp/cifar10/
- experiment_N_of_16
-- logits/
--- 0000000100.npy
```

where this new file has shape (50000, 10) and stores the model's output features
for each example.
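
As a rough illustration of this step, the sketch below runs a model over a
dataset in batches and stacks the outputs into a single array with one row per
example. `collect_logits`, `predict_fn`, and the toy linear model are
stand-ins chosen for this example; they are not the interface of
`inference.py`.

```python
import numpy as np


def collect_logits(predict_fn, images, batch_size=50):
    """Apply predict_fn batch by batch and return an (N, num_classes) array."""
    outputs = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        outputs.append(np.asarray(predict_fn(batch)))
    return np.concatenate(outputs, axis=0)


# Toy stand-in for a trained model: a random linear map to 10 classes.
rng = np.random.default_rng(0)
weights = rng.normal(size=(32 * 32 * 3, 10))


def toy_predict(batch):
    return batch.reshape(len(batch), -1) @ weights


images = rng.normal(size=(120, 32, 32, 3))  # 120 fake CIFAR-sized images
logits = collect_logits(toy_predict, images)
# np.save("logits.npy", logits) would then write the array to disk.
```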

#### 3. Compute membership inference scores

Next, we take the output features and generate our logit-scaled membership
inference scores for each example for each model.

> python3 score.py exp/cifar10/

And this in turn generates a new directory

```
exp/cifar10/
- experiment_N_of_16
-- scores/
--- 0000000100.npy
```

where each score file has shape (50000,) and stores just our scores.
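
The logit scaling described in the paper maps the model's confidence `p` on
the correct label to `log(p / (1 - p))`. A minimal sketch of that computation
follows, assuming a `(N, 10)` array of output features and integer labels;
`logit_scaled_scores` is an illustrative name, and `score.py` may differ in
its details.

```python
import numpy as np


def logit_scaled_scores(logits, labels):
    """Return log(p / (1 - p)), where p is the softmax probability of the true class."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    # Subtract the row-wise max so the exponentials cannot overflow.
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    rows = np.arange(len(labels))
    true_mass = exp[rows, labels]
    # Sum the wrong-class mass directly rather than computing 1 - p,
    # which loses precision when p is very close to 1.
    wrong = exp.copy()
    wrong[rows, labels] = 0.0
    wrong_mass = wrong.sum(axis=1)
    eps = 1e-45  # keeps the logs finite if either mass underflows to zero
    # The softmax normalizer cancels in the ratio, so it is never computed.
    return np.log(true_mass + eps) - np.log(wrong_mass + eps)


logits = np.array([[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
labels = np.array([0, 1])
scores = logit_scaled_scores(logits, labels)
```

The resulting score is unbounded in both directions and, according to the
paper, is closer to normally distributed than the raw confidence, which is
what makes a Gaussian fit reasonable in the attack.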

### PLOTTING THE RESULTS

Finally, we can run the plotting code to generate pretty pictures

> python3 plot.py

which should give (something like) the following output

![Log-log ROC Curve for all attacks](fprtpr.png "Log-log ROC Curve")

```
Attack Ours (online)
   AUC 0.6676, Accuracy 0.6077, TPR@0.1%FPR of 0.0169
Attack Ours (online, fixed variance)
   AUC 0.6856, Accuracy 0.6137, TPR@0.1%FPR of 0.0593
Attack Ours (offline)
   AUC 0.5488, Accuracy 0.5500, TPR@0.1%FPR of 0.0130
Attack Ours (offline, fixed variance)
   AUC 0.5549, Accuracy 0.5537, TPR@0.1%FPR of 0.0299
Attack Global threshold
   AUC 0.5921, Accuracy 0.6044, TPR@0.1%FPR of 0.0009
```

where the global threshold attack is the baseline, and our online,
online-with-fixed-variance, offline, and offline-with-fixed-variance attack
variants are the four other curves. Note that because we only train a few
models, the fixed variance variants perform best.
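
To make the variants concrete, here is a hedged sketch of the online attack
with fixed variance, together with the TPR at a fixed FPR metric. It fits a
per-example mean for the "in" and "out" shadow scores, shares one standard
deviation per side across all examples, and scores the target model by a
Gaussian log-likelihood ratio. All function names are illustrative, the pooled
standard deviation is this sketch's own choice and `plot.py` may estimate it
differently, and the data is synthetic, so the printed numbers above will not
be reproduced.

```python
import numpy as np


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution, written out to avoid extra dependencies."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - (x - mean) ** 2 / (2.0 * std**2)


def lira_online_fixed_variance(shadow_scores, shadow_keep, target_scores):
    """Per-example log-likelihood ratio; larger values suggest membership.

    shadow_scores: (num_shadow, num_examples) scores from the shadow models.
    shadow_keep:   (num_shadow, num_examples) bool, True where the example was trained on.
    target_scores: (num_examples,) scores from the model under attack.
    Every example needs at least one "in" and one "out" shadow model.
    """
    n_in = shadow_keep.sum(axis=0)
    n_out = (~shadow_keep).sum(axis=0)
    # Per-example means of the "in" and "out" score distributions.
    mean_in = np.where(shadow_keep, shadow_scores, 0.0).sum(axis=0) / n_in
    mean_out = np.where(~shadow_keep, shadow_scores, 0.0).sum(axis=0) / n_out
    # Fixed variance: one spread per side, pooled over all examples
    # (assumption: measured around each example's own mean).
    std_in = np.sqrt(np.mean((shadow_scores - mean_in)[shadow_keep] ** 2))
    std_out = np.sqrt(np.mean((shadow_scores - mean_out)[~shadow_keep] ** 2))
    return (gaussian_logpdf(target_scores, mean_in, std_in)
            - gaussian_logpdf(target_scores, mean_out, std_out))


def tpr_at_fpr(member_scores, nonmember_scores, fpr):
    """Fraction of members above the threshold that about `fpr` of non-members exceed."""
    threshold = np.quantile(nonmember_scores, 1.0 - fpr)
    return float(np.mean(member_scores > threshold))


# Synthetic data: each example has its own baseline score, and being a
# training member shifts the score up by 2 (purely illustrative numbers).
rng = np.random.default_rng(0)
num_shadow, num_examples = 16, 2000
baseline = rng.normal(0.0, 2.0, size=num_examples)


def simulate(keep):
    return baseline + 2.0 * keep + rng.normal(size=keep.shape)


# Each example is "in" for exactly 8 of the 16 shadow models.
shadow_keep = rng.uniform(size=(num_shadow, num_examples)).argsort(axis=0) < 8
target_keep = rng.uniform(size=num_examples) < 0.5
shadow_scores = simulate(shadow_keep)
target_scores = simulate(target_keep)

llr = lira_online_fixed_variance(shadow_scores, shadow_keep, target_scores)
tpr = tpr_at_fpr(llr[target_keep], llr[~target_keep], fpr=0.01)
print(f"TPR@1%FPR on synthetic data: {tpr:.3f}")
```

With only 8 "in" and 8 "out" scores per example, a per-example variance
estimate would be noisy, which is the intuition behind the fixed-variance
variants doing better when few shadow models are available. An offline variant
would use only the "out" statistics.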

### Citation

You can cite this paper with

```
@article{carlini2021membership,
  title={Membership Inference Attacks From First Principles},
  author={Carlini, Nicholas and Chien, Steve and Nasr, Milad and Song, Shuang and Terzis, Andreas and Tramer, Florian},
  journal={arXiv preprint arXiv:2112.03570},
  year={2021}
}
```