Commit graph

804 commits

Author SHA1 Message Date
Zheng Xu
a4bdb05b62 zCDP to epsilon for tree aggregation accounting.
PiperOrigin-RevId: 539706770
2023-06-12 11:09:14 -07:00
Walid Krichene
18c43b351b Support weighted losses in DPModel.
PiperOrigin-RevId: 538011437
2023-06-05 16:27:19 -07:00
Steve Chien
60d237be83 Update tensorflow-probability version to 0.20.0
PiperOrigin-RevId: 533550592
2023-05-19 14:22:24 -07:00
A. Unique TensorFlower
0f5acf868e Add additional tests and checks on the passed loss function.
PiperOrigin-RevId: 532225904
2023-05-15 14:27:46 -07:00
Shuang Song
8fdac5f833 Test DPModel in distributed training.
PiperOrigin-RevId: 528039164
2023-04-28 18:57:29 -07:00
Walid Krichene
e65e14b2d6 Fix bug in DPModel that shows up in distributed training.
PiperOrigin-RevId: 528026372
2023-04-28 17:31:18 -07:00
Galen Andrew
9710a4acc7 Bump version and update dependencies for pypi release.
PiperOrigin-RevId: 527377853
2023-04-26 14:35:24 -07:00
A. Unique TensorFlower
33bbc87ff2 Use better group privacy bound in computing user level privacy [TF Privacy]
PiperOrigin-RevId: 526852999
2023-04-24 22:17:24 -07:00
Michael Reneer
60cb0dd2fb Update tensorflow privacy to use NamedTuple instead of attrs.
This allows these objects to be traversed when nested in tree-like structures more easily.

PiperOrigin-RevId: 525532511
2023-04-19 13:18:25 -07:00
Shuang Song
e362f51773 Supports slicing for multi-label data.
PiperOrigin-RevId: 523846333
2023-04-12 17:14:11 -07:00
Galen Andrew
d5e41e20ad More detailed description of arguments in compute_dp_sgd_privacy.
PiperOrigin-RevId: 522693217
2023-04-07 15:07:35 -07:00
Shuang Song
c4628d5dbc Skips adding noise when noise_multiplier is 0 for fast clipping.
PiperOrigin-RevId: 522396275
2023-04-06 11:54:55 -07:00
Shuang Song
de9836883d Skips noise addition when noise_multiplier is 0. Fix a typo.
PiperOrigin-RevId: 521912964
2023-04-04 17:48:24 -07:00
A. Unique TensorFlower
ee1abe6930 Generalize generate_model_outputs_using_core_keras_layers().
This change adds the following two new features to the above function:
(i) it supports nested custom layers of depth >2;
(ii) it allows the caller to exclude certain layers from the expansion.

Feature (ii) will be needed for the development of DP models that use
Transformer or BERT-type layers.

PiperOrigin-RevId: 520919934
2023-03-31 07:41:16 -07:00
Galen Andrew
abb0c3f9f6 Migrates compute_dp_sgd_privacy to print new privacy statement from compute_dp_sgd_privacy_lib.
PiperOrigin-RevId: 520147633
2023-03-28 15:14:12 -07:00
A. Unique TensorFlower
781483d1f2 Make compute_dp_sgd_privacy_statement visible.
PiperOrigin-RevId: 520105385
2023-03-28 12:43:21 -07:00
Shuang Song
e125951c9b Sets training set as positive class for sklearn.metrics.roc_curve.
sklearn.metrics.roc_curve uses classification rules in the form "score >= threshold ==> predict positive".
When calling roc_curve, we used to label test data as the positive class. This way, TPR = % test examples classified as test, FPR = % training examples classified as test. The classification rule is "loss >= threshold ==> predict test".

For membership inference, TPR is usually defined as % training examples classified as training, and FPR is % test examples classified as training.
As training samples usually have lower loss, we usually use rules in the form of "loss <= threshold ==> predict training".

Therefore, TPR in the 2nd case is actually (1 - FPR) in the 1st case, and FPR in the 2nd case is (1 - TPR) in the 1st case.
This mismatch does not affect attacker advantage or AUC, but it can cause problems for PPV.

Now, we:
- set the training set as the positive class.
- for threshold and entropy attacks, set the score to be -loss, so that a higher score corresponds to training data.
- negate the thresholds (computed based on -loss) so that they correspond to loss.

PiperOrigin-RevId: 519880043
2023-03-27 18:00:25 -07:00
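
The relationship between the two conventions described in this commit message can be checked numerically. The following is a minimal pure-Python sketch, not the library's code: the loss values, the threshold, and the helper name `rates` are all made up for illustration, and it assumes no loss is exactly equal to the threshold.

```python
# Hypothetical per-example losses; training examples tend to have lower loss.
train_losses = [0.1, 0.2, 0.3, 0.9]
test_losses = [0.4, 0.8, 1.0, 1.2, 1.5]
threshold = 0.5


def rates(positive_scores, negative_scores, score_threshold):
    """Returns (TPR, FPR) for the rule "score >= threshold ==> predict positive"."""
    tpr = sum(s >= score_threshold for s in positive_scores) / len(positive_scores)
    fpr = sum(s >= score_threshold for s in negative_scores) / len(negative_scores)
    return tpr, fpr


# Old convention: test data is the positive class and the score is the loss.
old_tpr, old_fpr = rates(test_losses, train_losses, threshold)

# New convention: training data is the positive class and the score is -loss,
# so "-loss >= -threshold" is the same rule as "loss <= threshold".
new_tpr, new_fpr = rates(
    [-loss for loss in train_losses],
    [-loss for loss in test_losses],
    -threshold,
)

print(old_tpr, old_fpr)  # 0.8 0.25
print(new_tpr, new_fpr)  # 0.75 0.2

# Attacker advantage |TPR - FPR| is the same under both conventions.
old_advantage = abs(old_tpr - old_fpr)
new_advantage = abs(new_tpr - new_fpr)
```

In this toy data the new TPR equals 1 minus the old FPR, the new FPR equals 1 minus the old TPR, and the advantage is unchanged. PPV is not preserved by this swap, which is why the choice of positive class matters for it.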
A. Unique TensorFlower
7796369d8b Support gradient norm computation with respect to a subset of variables.
PiperOrigin-RevId: 519245638
2023-03-24 14:57:54 -07:00
Galen Andrew
d5d60e2eac Adds compute_dp_sgd_privacy_statement for accurate privacy accounting report.
PiperOrigin-RevId: 518934979
2023-03-23 12:37:12 -07:00
Walid Krichene
52806ba952 In dp_optimizer_keras_sparse, update iterations to reflect the number of logical batches, rather than physical batches.
In the current behavior, when using gradient accumulation, the `iterations` variable is incremented at every physical batch, while variables are only updated at every logical batch (where logical batch = accumulation_steps many physical batches). This causes certain optimizers that explicitly depend on `iterations` (such as Adam) to behave very differently under gradient accumulation.

With this change, `iterations` is only incremented after each logical batch.

PiperOrigin-RevId: 517197044
2023-03-16 12:35:57 -07:00
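
The difference between the two counting schemes can be illustrated with a small simulation. This is a sketch under stated assumptions, not the optimizer's implementation: the function `count_iterations` and its parameters are hypothetical names, and the "variable update" is reduced to a counter.

```python
def count_iterations(num_physical_batches, accumulation_steps, per_logical_batch):
    """Simulates how an `iterations` counter advances under gradient accumulation.

    Returns (iterations, variable_updates) after processing all physical batches.
    """
    iterations = 0
    variable_updates = 0
    for step in range(1, num_physical_batches + 1):
        end_of_logical_batch = step % accumulation_steps == 0
        if end_of_logical_batch:
            # Variables change only once per logical batch in both schemes.
            variable_updates += 1
        if end_of_logical_batch or not per_logical_batch:
            iterations += 1
    return iterations, variable_updates


# 12 physical batches with accumulation_steps = 4, i.e. 3 logical batches.
old_behavior = count_iterations(12, 4, per_logical_batch=False)
new_behavior = count_iterations(12, 4, per_logical_batch=True)

print(old_behavior)  # (12, 3)
print(new_behavior)  # (3, 3)
```

Under the old scheme the counter runs ahead of the number of actual variable updates by a factor of `accumulation_steps`, so anything derived from the step count (for example a bias-correction term or a learning-rate schedule) sees a different step than the one the variables are actually on.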
A. Unique TensorFlower
7ae50c5ca5 Generalize model_forward_pass() to allow input models with multiple outputs.
PiperOrigin-RevId: 517145254
2023-03-16 09:36:16 -07:00
A. Unique TensorFlower
043e8b5272 Report the true loss in DPModel instead of the norm-adjusted loss.
PiperOrigin-RevId: 517112812
2023-03-16 07:15:13 -07:00
A. Unique TensorFlower
8f4ab1a8bb Allow custom per example loss functions for computing per microbatch gradient norm.
PiperOrigin-RevId: 516897864
2023-03-15 12:28:39 -07:00
Zheng Xu
d7d497bb69 Update script for pip package.
PiperOrigin-RevId: 515696284
2023-03-10 11:44:03 -08:00
Galen Andrew
c2bd4c3c6f Bump version number.
PiperOrigin-RevId: 515456888
2023-03-09 15:22:34 -08:00
Galen Andrew
701a585e1a Revert to dp-accounting 0.3.0 API.
PiperOrigin-RevId: 515432485
2023-03-09 13:56:34 -08:00
Galen Andrew
61dfbcc1f5 Adds functions for more accurate privacy accounting.
Adds a function for computing example-level DP epsilon that takes microbatching into account and does not assume Poisson subsampling. Adds a function for computing user-level DP in terms of group privacy.

PiperOrigin-RevId: 515114010
2023-03-08 12:44:39 -08:00
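
For readers unfamiliar with group privacy, the textbook bound can be sketched as follows. The commit message does not state which bound the library uses, so this is only the standard result, and the function name `group_privacy` is hypothetical: if a mechanism is (ε, δ)-DP for datasets differing in one example, then for datasets differing in k examples it is (kε, δ(1 + e^ε + ... + e^((k-1)ε)))-DP.

```python
import math


def group_privacy(epsilon, delta, group_size):
    """Textbook group privacy bound for an (epsilon, delta)-DP mechanism.

    Hypothetical helper for illustration; the library may use a different
    (tighter) bound.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    group_epsilon = group_size * epsilon
    # delta * (1 + e^eps + ... + e^((k - 1) * eps))
    group_delta = delta * sum(math.exp(i * epsilon) for i in range(group_size))
    return group_epsilon, group_delta


# Example: each user contributes up to 3 examples to training.
user_epsilon, user_delta = group_privacy(epsilon=1.0, delta=1e-6, group_size=3)
print(user_epsilon, user_delta)
```

Because epsilon scales linearly and delta grows geometrically with the group size, user-level guarantees obtained this way degrade quickly as the number of examples per user increases.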
A. Unique TensorFlower
4e1fc252e4 Add a kwargs argument to the registry API + small changes to docstrings.
This is a forward-looking change that is needed to support more complicated
layers, such as `tf.keras.layers.MultiHeadAttention`, which can take `kwargs`
as part of their `.call()` method and can generate arbitrary outputs.

PiperOrigin-RevId: 514775503
2023-03-07 10:35:04 -08:00
Steve Chien
21ee1a607a Fix unneeded dependency.
PiperOrigin-RevId: 514523996
2023-03-06 14:20:33 -08:00
Zheng Xu
0a0f377f3f Adaptive clipping in DP-FTRL with restart.
PiperOrigin-RevId: 513934548
2023-03-06 07:16:57 -08:00
A. Unique TensorFlower
8bfafdd74d Efficient DPSGD with support for microbatched losses.
PiperOrigin-RevId: 513886957
2023-03-06 07:01:03 -08:00
Walid Krichene
cbf34f2b04 Update type annotations of gradient clipping library.
PiperOrigin-RevId: 513640655
2023-03-02 14:29:17 -08:00
A. Unique TensorFlower
7436930c64 Improve documentation and logging of fast gradient clipping modules and callers.
PiperOrigin-RevId: 513283486
2023-03-01 10:56:01 -08:00
Andres Munoz Medina
d7cd3f8af1 Add an announcement on the public README about the new fast implementation of DP-SGD.
PiperOrigin-RevId: 512930920
2023-02-28 07:45:47 -08:00
Shuang Song
a3e8a45559 Passes number of microbatches to DP model.
PiperOrigin-RevId: 512678620
2023-02-27 11:11:59 -08:00
Shuang Song
4a418e8862 Adds __init__.py for fast_gradient_clipping.
PiperOrigin-RevId: 512236191
2023-02-24 21:32:07 -08:00
A. Unique TensorFlower
dda7fa8b39 Add a tf.GradientTape argument to the layer registry functions
PiperOrigin-RevId: 512160655
2023-02-24 14:14:36 -08:00
Shuang Song
4dd8d0ffde Catches when data is not sufficient for StratifiedKFold split.
PiperOrigin-RevId: 510197242
2023-02-16 11:24:12 -08:00
Shuang Song
0c691d0b4d Returns None for getting max results when results are empty.
PiperOrigin-RevId: 510054673
2023-02-15 23:37:43 -08:00
A. Unique TensorFlower
13534e5159 Add better tests for clip_grads.py
PiperOrigin-RevId: 509529435
2023-02-14 08:01:56 -08:00
A. Unique TensorFlower
430f103354 Generalize the registry function for the embedding layer for other models.
PiperOrigin-RevId: 509528743
2023-02-14 07:59:10 -08:00
A. Unique TensorFlower
410814ec39 Generalize the internal API to allow for more general models + layers.
PiperOrigin-RevId: 509518753
2023-02-14 07:10:40 -08:00
Shuang Song
6ee988885a Fix a bug in get_flattened_attack_metrics where types, slices, and metrics do not
correspond to values because of PPV.

PiperOrigin-RevId: 509274994
2023-02-13 10:53:29 -08:00
A. Unique TensorFlower
9ed34da715 Integrate the fast gradient clipping algorithm with the DP Keras Model class.
PiperOrigin-RevId: 504931452
2023-01-26 13:45:56 -08:00
A. Unique TensorFlower
bc84ed7bfb Add fast gradient clipping tests.
PiperOrigin-RevId: 504923799
2023-01-26 13:16:19 -08:00
A. Unique TensorFlower
a3b14ae20a First implementation of the fast gradient clipping algorithm.
PiperOrigin-RevId: 504668189
2023-01-25 14:51:09 -08:00
Steve Chien
ee3d349a8d Fix copybara removal of tkinter library.
PiperOrigin-RevId: 504656239
2023-01-25 14:06:27 -08:00
Yilei Yang
622282e034 Update dependency on tkinter.
PiperOrigin-RevId: 503401013
2023-01-20 03:24:46 -08:00
Thomas Steinke
10c086c46a Implementation of differentially private second order methods ("Newton's method") for research project.
PiperOrigin-RevId: 500821630
2023-01-09 15:22:37 -08:00
Peter Hawkins
3d038a490a [NumPy] Remove references to deprecated NumPy type aliases.
This change replaces references to a number of deprecated NumPy type aliases (np.bool, np.int, np.float, np.complex, np.object, np.str) with their recommended replacement (bool, int, float, complex, object, str).

NumPy 1.24 drops the deprecated aliases, so we must remove uses before updating NumPy.

PiperOrigin-RevId: 497194550
2022-12-22 10:32:59 -08:00
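
The kind of replacement this commit describes looks roughly like the sketch below. The array names are made up for illustration, and the snippet assumes NumPy is installed.

```python
import numpy as np

# Deprecated alias form (no longer available once the aliases are dropped):
#   values = np.zeros(3, dtype=np.float)
# Recommended form: pass the Python builtin type instead.
values = np.zeros(3, dtype=float)          # builtin float maps to float64
flags = np.ones(3, dtype=bool)             # builtin bool maps to NumPy's boolean dtype
labels = np.array(["a", 1], dtype=object)  # builtin object gives an object array

print(values.dtype, flags.dtype, labels.dtype)
```

The builtin types have always been accepted as dtypes, so this change does not alter the resulting arrays; it only removes the dependency on the alias names.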