Commit graph

858 commits

Author SHA1 Message Date
Michael Reneer
5b21aad36e Update the scipy dependency to 1.9.
PiperOrigin-RevId: 543452894
2023-06-26 08:59:44 -07:00
Vadym Doroshenko
45da453410 Implement possibility to return slice indices.
PiperOrigin-RevId: 540885025
2023-06-16 08:22:43 -07:00
Edoardo Debenedetti
6301f3ffef Merge branch 'tensorflow:master' into master
2023-06-13 15:09:21 +02:00
Edoardo Debenedetti
ab4cb09399 Fix LiRA inference
2023-06-13 09:57:46 +02:00
Zheng Xu
a4bdb05b62 zCDP to epsilon for tree aggregation accounting.
PiperOrigin-RevId: 539706770
2023-06-12 11:09:14 -07:00
Walid Krichene
18c43b351b Support weighted losses in DPModel.
PiperOrigin-RevId: 538011437
2023-06-05 16:27:19 -07:00
Steve Chien
60d237be83 Update tensorflow-probability version to 0.20.0
PiperOrigin-RevId: 533550592
2023-05-19 14:22:24 -07:00
A. Unique TensorFlower
0f5acf868e Add additional tests and checks on the passed loss function.
PiperOrigin-RevId: 532225904
2023-05-15 14:27:46 -07:00
Shuang Song
8fdac5f833 Test DPModel in distributed training.
PiperOrigin-RevId: 528039164
2023-04-28 18:57:29 -07:00
Walid Krichene
e65e14b2d6 Fix bug in DPModel that shows up in distributed training.
PiperOrigin-RevId: 528026372
2023-04-28 17:31:18 -07:00
Galen Andrew
9710a4acc7 Bump version and update dependencies for pypi release.
PiperOrigin-RevId: 527377853
2023-04-26 14:35:24 -07:00
A. Unique TensorFlower
33bbc87ff2 Use better group privacy bound in computing user level privacy [TF Privacy]
PiperOrigin-RevId: 526852999
2023-04-24 22:17:24 -07:00
Michael Reneer
60cb0dd2fb Update tensorflow privacy to use NamedTuple instead of attrs.
This allows these objects to be traversed when nested in tree-like structures more easily.

PiperOrigin-RevId: 525532511
2023-04-19 13:18:25 -07:00
Shuang Song
e362f51773 Supports slicing for multi-label data.
PiperOrigin-RevId: 523846333
2023-04-12 17:14:11 -07:00
Galen Andrew
d5e41e20ad More detailed description of arguments in compute_dp_sgd_privacy.
PiperOrigin-RevId: 522693217
2023-04-07 15:07:35 -07:00
Shuang Song
c4628d5dbc Skips adding noise when noise_multiplier is 0 for fast clipping.
PiperOrigin-RevId: 522396275
2023-04-06 11:54:55 -07:00
Shuang Song
de9836883d Skips noise addition when noise_multiplier is 0. Fix a typo.
PiperOrigin-RevId: 521912964
2023-04-04 17:48:24 -07:00
A. Unique TensorFlower
ee1abe6930 Generalize generate_model_outputs_using_core_keras_layers().
This change adds the following two new features to the above function:
(i) it supports nested custom layers of depth >2;
(ii) it allows the caller to exclude certain layers from the expansion.

Feature (ii) will be needed for the development of DP models that use
Transformer or BERT-type layers.

PiperOrigin-RevId: 520919934
2023-03-31 07:41:16 -07:00
Galen Andrew
abb0c3f9f6 Migrates compute_dp_sgd_privacy to print new privacy statement from compute_dp_sgd_privacy_lib.
PiperOrigin-RevId: 520147633
2023-03-28 15:14:12 -07:00
A. Unique TensorFlower
781483d1f2 Make compute_dp_sgd_privacy_statement visible.
PiperOrigin-RevId: 520105385
2023-03-28 12:43:21 -07:00
Shuang Song
e125951c9b Sets training set as positive class for sklearn.metrics.roc_curve.
sklearn.metrics.roc_curve uses classification rules in the form "score >= threshold ==> predict positive".
When calling roc_curve, we used to label test data as positive class. This way, TPR = % test examples classified as test, FPR = % training examples classified as test. The classification rule is "loss >= threshold ==> predict test".

For membership inference, TPR is usually defined as % training examples classified as training, and FPR is % test examples classified as training.
As training samples usually have lower loss, we usually use rules in the form of "loss <= threshold ==> predict training".

Therefore, TPR in the 2nd case is actually (1 - FPR) in the 1st case, FPR in the 2nd case is (1 - TPR) in the 1st case.
This mismatch does not affect attacker advantage or AUC, but it can cause problems for PPV.

Now, we:
- set training set as positive class.
- for threshold and entropy attacks, set score to be -loss, so that higher score corresponds to training data.
- negate the thresholds (computed based on -loss) so that they correspond to loss.

PiperOrigin-RevId: 519880043
2023-03-27 18:00:25 -07:00
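
The relationship described in the commit above can be checked on a toy example. The following is a minimal sketch in plain Python, not the library's code: the helper `rates`, the loss values, and the threshold are all hypothetical, and the rule "score >= threshold ==> predict positive" is re-implemented by hand rather than by calling sklearn.metrics.roc_curve.

```python
def rates(scores, labels, threshold):
    """Return (TPR, FPR, PPV) for the rule 'score >= threshold => positive'.

    Hypothetical helper for illustration only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    tp = sum(s >= threshold for s in pos)
    fp = sum(s >= threshold for s in neg)
    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    return tp / len(pos), fp / len(neg), ppv


# Made-up per-example losses; training examples tend to have lower loss.
train_loss = [0.1, 0.2, 0.3, 0.9]
test_loss = [0.4, 0.8, 1.0, 1.2]
losses = train_loss + test_loss
loss_threshold = 0.35  # chosen so that no loss equals the threshold

# Old convention: test set is the positive class, score = loss.
old_labels = [0] * len(train_loss) + [1] * len(test_loss)
old_tpr, old_fpr, old_ppv = rates(losses, old_labels, loss_threshold)

# New convention: training set is the positive class, score = -loss,
# so the threshold on the score is the negated loss threshold.
new_labels = [1] * len(train_loss) + [0] * len(test_loss)
neg_losses = [-l for l in losses]
new_tpr, new_fpr, new_ppv = rates(neg_losses, new_labels, -loss_threshold)

print(old_tpr, old_fpr, old_ppv)
print(new_tpr, new_fpr, new_ppv)
```

On this data the new TPR equals 1 minus the old FPR and the new FPR equals 1 minus the old TPR, while the two PPV values differ, which is the mismatch the commit addresses. The identity relies on no score lying exactly on the threshold.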
A. Unique TensorFlower
7796369d8b Support gradient norm computation with respect to a subset of variables.
PiperOrigin-RevId: 519245638
2023-03-24 14:57:54 -07:00
Galen Andrew
d5d60e2eac Adds compute_dp_sgd_privacy_statement for accurate privacy accounting report.
PiperOrigin-RevId: 518934979
2023-03-23 12:37:12 -07:00
Walid Krichene
52806ba952 In dp_optimizer_keras_sparse, update iterations to reflect the number of logical batches, rather than physical batches.
In the current behavior, when using gradient accumulation, the `iterations` variable is incremented at every physical batch, while variables are only updated at every logical batch (where logical batch = accumulation_steps many physical batches). This causes certain optimizers that explicitly depend on `iterations` (such as Adam) to behave very differently under gradient accumulation.

With this change, `iterations` is only incremented after each logical batch.

PiperOrigin-RevId: 517197044
2023-03-16 12:35:57 -07:00
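
The counting behavior described in the commit above can be illustrated with a toy optimizer. This is only a sketch under stated assumptions: the class `AccumulatingOptimizer` and its attributes are hypothetical and do not reflect the dp_optimizer_keras_sparse implementation; it shows only that `iterations` advances once per logical batch.

```python
class AccumulatingOptimizer:
    """Toy optimizer with gradient accumulation (hypothetical, for illustration)."""

    def __init__(self, accumulation_steps, learning_rate=0.1):
        self.accumulation_steps = accumulation_steps
        self.learning_rate = learning_rate
        self.iterations = 0        # counts logical batches
        self.physical_batches = 0  # counts physical batches
        self.variable = 0.0
        self._accumulated = 0.0

    def apply_gradient(self, grad):
        """Accumulate one physical-batch gradient; update on logical batches."""
        self._accumulated += grad
        self.physical_batches += 1
        if self.physical_batches % self.accumulation_steps == 0:
            # One logical batch is complete: apply the averaged gradient
            # and only now advance `iterations`.
            mean_grad = self._accumulated / self.accumulation_steps
            self.variable -= self.learning_rate * mean_grad
            self._accumulated = 0.0
            self.iterations += 1


opt = AccumulatingOptimizer(accumulation_steps=4)
for _ in range(12):
    opt.apply_gradient(1.0)

print(opt.physical_batches, opt.iterations)
```

With 12 physical batches and 4 accumulation steps, the variable is updated 3 times and `iterations` ends at 3 rather than 12, so a step-dependent optimizer such as Adam would see the same step count as it would without accumulation.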
A. Unique TensorFlower
7ae50c5ca5 Generalize model_forward_pass() to allow input models with multiple outputs.
PiperOrigin-RevId: 517145254
2023-03-16 09:36:16 -07:00
A. Unique TensorFlower
043e8b5272 Report the true loss in DPModel instead of the norm-adjusted loss.
PiperOrigin-RevId: 517112812
2023-03-16 07:15:13 -07:00
A. Unique TensorFlower
8f4ab1a8bb Allow custom per example loss functions for computing per microbatch gradient norm.
PiperOrigin-RevId: 516897864
2023-03-15 12:28:39 -07:00
Zheng Xu
d7d497bb69 Update script for pip package.
PiperOrigin-RevId: 515696284
2023-03-10 11:44:03 -08:00
Galen Andrew
c2bd4c3c6f Bump version number.
PiperOrigin-RevId: 515456888
2023-03-09 15:22:34 -08:00
Galen Andrew
701a585e1a Revert to dp-accounting 0.3.0 API.
PiperOrigin-RevId: 515432485
2023-03-09 13:56:34 -08:00
Galen Andrew
61dfbcc1f5 Adds functions for more accurate privacy accounting.
Adds function for computation of example-level DP epsilon taking into account microbatching and not assuming Poisson subsampling. Adds function for computation of user-level DP in terms of group privacy.

PiperOrigin-RevId: 515114010
2023-03-08 12:44:39 -08:00
A. Unique TensorFlower
4e1fc252e4 Add a kwargs argument to the registry API + small changes to docstrings.
This is a forward-looking change that is needed to support more complicated
layers, such as `tf.keras.layers.MultiHeadAttention`, which can take `kwargs`
as part of their `.call()` method and can generate arbitrary outputs.

PiperOrigin-RevId: 514775503
2023-03-07 10:35:04 -08:00
Steve Chien
21ee1a607a Fix unneeded dependency.
PiperOrigin-RevId: 514523996
2023-03-06 14:20:33 -08:00
Zheng Xu
0a0f377f3f Adaptive clipping in DP-FTRL with restart.
PiperOrigin-RevId: 513934548
2023-03-06 07:16:57 -08:00
A. Unique TensorFlower
8bfafdd74d Efficient DPSGD with support for microbatched losses.
PiperOrigin-RevId: 513886957
2023-03-06 07:01:03 -08:00
Walid Krichene
cbf34f2b04 Update type annotations of gradient clipping library.
PiperOrigin-RevId: 513640655
2023-03-02 14:29:17 -08:00
A. Unique TensorFlower
7436930c64 Improve documentation and logging of fast gradient clipping modules and callers.
PiperOrigin-RevId: 513283486
2023-03-01 10:56:01 -08:00
Andres Munoz MEdina
d7cd3f8af1 Add an announcement on the public README about the new fast implementation of DP-SGD.
PiperOrigin-RevId: 512930920
2023-02-28 07:45:47 -08:00
Shuang Song
a3e8a45559 Passes number of microbatches to DP model.
PiperOrigin-RevId: 512678620
2023-02-27 11:11:59 -08:00
Shuang Song
4a418e8862 Adds __init__.py for fast_gradient_clipping.
PiperOrigin-RevId: 512236191
2023-02-24 21:32:07 -08:00
A. Unique TensorFlower
dda7fa8b39 Add a tf.GradientTape argument to the layer registry functions
PiperOrigin-RevId: 512160655
2023-02-24 14:14:36 -08:00
Shuang Song
4dd8d0ffde Catches when data is not sufficient for StratifiedKFold split.
PiperOrigin-RevId: 510197242
2023-02-16 11:24:12 -08:00
Shuang Song
0c691d0b4d Returns None for getting max results when results are empty.
PiperOrigin-RevId: 510054673
2023-02-15 23:37:43 -08:00
A. Unique TensorFlower
13534e5159 Add better tests for clip_grads.py
PiperOrigin-RevId: 509529435
2023-02-14 08:01:56 -08:00
A. Unique TensorFlower
430f103354 Generalize the registry function for the embedding layer for other models.
PiperOrigin-RevId: 509528743
2023-02-14 07:59:10 -08:00
A. Unique TensorFlower
410814ec39 Generalize the internal API to allow for more general models + layers.
PiperOrigin-RevId: 509518753
2023-02-14 07:10:40 -08:00
Shuang Song
6ee988885a Fix a bug in get_flattened_attack_metrics where types, slices, and metrics do not
correspond to values because of PPV.

PiperOrigin-RevId: 509274994
2023-02-13 10:53:29 -08:00
A. Unique TensorFlower
9ed34da715 Integrate the fast gradient clipping algorithm with the DP Keras Model class.
PiperOrigin-RevId: 504931452
2023-01-26 13:45:56 -08:00
A. Unique TensorFlower
bc84ed7bfb Add fast gradient clipping tests.
PiperOrigin-RevId: 504923799
2023-01-26 13:16:19 -08:00
A. Unique TensorFlower
a3b14ae20a First implementation of the fast gradient clipping algorithm.
PiperOrigin-RevId: 504668189
2023-01-25 14:51:09 -08:00