Update narrative content.
PiperOrigin-RevId: 394558889
Parent: fc7504efca
Commit: b7249e6ab2
3 changed files with 39 additions and 29 deletions
@@ -13,16 +13,21 @@ landing_page:
- classname: devsite-landing-row-50
description: >
<p>
Preventing ML models from exposing potentially sensitive information is a critical part of
using AI responsibly. To that end, <i>differentially private stochastic gradient descent
(DP-SGD)</i> is a modification to the standard stochastic gradient descent (SGD) algorithm
in machine learning. </p>
<p>Models trained with DP-SGD have provable differential privacy (DP)
guarantees, mitigating the risk of exposing sensitive training data. Intuitively, a model
trained with differential privacy should not be affected by any single training example in
its data set. DP-SGD techniques can also be used in federated learning to provide user-level
differential privacy. You can learn more about differentially private deep learning in <a
href="https://arxiv.org/pdf/1607.00133.pdf">the original paper</a>.
An important aspect of responsible AI usage is ensuring that ML models are prevented from
exposing potentially sensitive information, such as demographic information or other
attributes in the training dataset that could be used to identify people.
One way to achieve this is by using differentially private stochastic gradient descent
(DP-SGD), which is a modification to the standard stochastic gradient descent (SGD)
algorithm in machine learning.
</p>
<p>
Models trained with DP-SGD have measurable differential privacy (DP) improvements, which
helps mitigate the risk of exposing sensitive training data. Since the purpose of DP is
to help prevent individual data points from being identified, a model trained with DP
should not be affected by any single training example in its training data set. DP-SGD
techniques can also be used in federated learning to provide user-level differential privacy.
You can learn more about differentially private deep learning in
<a href="https://arxiv.org/pdf/1607.00133.pdf">the original paper</a>.
</p>

- code_block: |
@@ -58,14 +63,19 @@ landing_page:
items:
- classname: devsite-landing-row-100
description: >
<p>Tensorflow Privacy (TF Privacy) is an open source library developed by teams in Google
Research. The library includes implementations of commonly used TensorFlow Optimizers for
training ML models with DP. The goal is to enable ML practitioners using standard Tensorflow
APIs to train privacy-preserving models by changing only a few lines of code.</p>
<p> The differentially private Optimizers can be used in conjunction with high-level APIs
<p>
Tensorflow Privacy (TF Privacy) is an open source library developed by teams in
Google Research. The library includes implementations of commonly used TensorFlow
Optimizers for training ML models with DP. The goal is to enable ML practitioners
using standard Tensorflow APIs to train privacy-preserving models by changing only a
few lines of code.
</p>
<p>
The differentially private optimizers can be used in conjunction with high-level APIs
that use the Optimizer class, especially Keras. Additionally, you can find differentially
private implementations of some Keras models. All of the Optimizers and models can be found
in the <a href="./privacy/api">API Documentation</a>.</p>
in the <a href="./privacy/api_docs/python/tf_privacy">API Documentation</a>.</p>
</p>

- classname: devsite-landing-row-cards
items:
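The description above promises privacy-preserving training by changing only a few lines of a standard Keras workflow. A minimal sketch of that idea, assuming the `tensorflow_privacy` package exposes the `DPKerasSGDOptimizer` wrapper used in the TF Privacy tutorials; the model, data, and hyperparameter values below are placeholders:

```python
# Hedged sketch: swap the optimizer and use a per-example (vectorized) loss.
# Assumes tensorflow_privacy exposes DPKerasSGDOptimizer; values are illustrative.
import numpy as np
import tensorflow as tf
import tensorflow_privacy

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(2),
])

optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=1.0,        # clip each microbatch gradient to this L2 norm
    noise_multiplier=1.1,    # ratio of noise std-dev to the clipping norm
    num_microbatches=32,     # must evenly divide the batch size
    learning_rate=0.1)

# Reduction.NONE keeps one loss value per example, which DP-SGD needs.
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])

x = np.random.normal(size=(1024, 20)).astype("float32")  # toy data
y = np.random.randint(0, 2, size=(1024,))
model.fit(x, y, epochs=1, batch_size=32)
```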
@@ -1,11 +1,12 @@
# Get Started

This document assumes you are already familiar with differential privacy, and
have determined that you would like to implement TF Privacy to achieve
differential privacy guarantees in your model(s). If you’re not familiar with
differential privacy, please review
have determined that you would like to use TF Privacy to implement differential
privacy guarantees in your model(s). If you’re not familiar with differential
privacy, please review
[the overview page](https://tensorflow.org/responsible_ai/privacy/guide). After
installing TF Privacy get started by following these steps:
installing TF Privacy, get started by following these steps:

## 1. Choose a differentially private version of an existing Optimizer
@@ -36,9 +37,9 @@ microbatches.
Train your model using the DP Optimizer (step 1) and vectorized loss (step 2).
There are two options for doing this:

- Pass the optimizer and loss as arguments to `Model.compile` before calling
* Pass the optimizer and loss as arguments to `Model.compile` before calling
`Model.fit`.
- When writing a custom training loop, use `Optimizer.minimize()` on the
* When writing a custom training loop, use `Optimizer.minimize()` on the
vectorized loss.
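A rough sketch of the second option, a custom training loop that calls `Optimizer.minimize()` on the vectorized loss; it assumes the `DPKerasSGDOptimizer` wrapper from the TF Privacy tutorials, and the model, data, and hyperparameter values are placeholders:

```python
import numpy as np
import tensorflow as tf
import tensorflow_privacy

model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(20,))])

optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=1.0, noise_multiplier=1.1,
    num_microbatches=32, learning_rate=0.1)

# Vectorized loss: Reduction.NONE returns one loss value per example.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

x = np.random.normal(size=(256, 20)).astype("float32")  # toy data
y = np.random.randint(0, 2, size=(256,))
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

for features, labels in dataset:
    # Given the per-example loss as a callable, the DP optimizer clips and
    # noises gradients per microbatch before applying the update.
    optimizer.minimize(
        lambda: loss_fn(labels, model(features, training=True)),
        var_list=model.trainable_variables)
```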

Once this is done, it’s recommended that you tune your hyperparameters. For a
@@ -65,7 +66,7 @@ The three new DP-SGD hyperparameters have the following effects and tradeoffs:
utility because it lowers the standard deviation of the noise. However, it
will slow down training in terms of time.
2. The clipping norm $C$: Since the standard deviation of the noise scales with
C, it is probably best to set $C$ to be some quantile (e.g. median, 75th
$C$, it is probably best to set $C$ to be some quantile (e.g. median, 75th
percentile, 90th percentile) of the gradient norms. Having too large a value
of $C$ adds unnecessarily large amounts of noise.
3. The noise multiplier $σ$: Of the three hyperparameters, the amount of
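Point 2 suggests picking $C$ from a quantile of the gradient norms. One way that quantile could be estimated on a small sample is sketched below with a toy model and random data; this is purely illustrative and is not itself differentially private:

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins; in practice use your own model and a sample of real training data.
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
x = np.random.normal(size=(64, 4)).astype("float32")
y = np.random.randint(0, 2, size=(64,))

norms = []
for i in range(len(x)):  # per-example gradient norms, one example at a time
    with tf.GradientTape() as tape:
        loss = loss_fn(y[i:i + 1], model(x[i:i + 1], training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    flat = tf.concat([tf.reshape(g, [-1]) for g in grads], axis=0)
    norms.append(float(tf.norm(flat)))

# e.g. take the 75th percentile of observed norms as a starting value for C
print("suggested l2_norm_clip:", np.percentile(norms, 75))
```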
@@ -2,12 +2,12 @@

Differential privacy is a framework for measuring the privacy guarantees
provided by an algorithm and can be expressed using the values ε (epsilon) and δ
(delta). Of the two, ε is the more important and more sensitive to the choice of
(delta). Of the two, ε is more important and more sensitive to the choice of
hyperparameters. Roughly speaking, they mean the following:

* ε gives a ceiling on how much the probability of a particular output can
increase by including (or removing) a single training example. You usually
want it to be a small constant (less than 10, or, for more stringent privacy
want it to be a small constant (less than 10, or for more stringent privacy
guarantees, less than 1). However, this is only an upper bound, and a large
value of epsilon may still mean good practical privacy.
* δ bounds the probability of an arbitrary change in model behavior. You can
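For reference, the ε and δ in these bullets come from the standard (ε, δ)-differential-privacy guarantee (not spelled out on this page): for any two datasets $D$ and $D'$ that differ in a single training example, and any set $S$ of possible outputs of the training mechanism $M$,

$$\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta$$

A small ε forces the two probabilities to stay close, while δ allows a small slack for rare failure events.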
@@ -30,17 +30,16 @@ dataset size and number of epochs. See the
[classification privacy tutorial](../tutorials/classification_privacy.ipynb) to
see the approach.

For more detail, you can see
For more detail, see
[the original DP-SGD paper](https://arxiv.org/pdf/1607.00133.pdf).

You can use `compute_dp_sgd_privacy`, to find out the epsilon given a fixed
delta value for your model [../tutorials/classification_privacy.ipynb]:
You can use `compute_dp_sgd_privacy` to find out the epsilon given a fixed delta
value for your model [../tutorials/classification_privacy.ipynb]:

* `q` : the sampling ratio - the probability of an individual training point
being included in a mini batch (`batch_size/number_of_examples`).
* `noise_multiplier` : A float that governs the amount of noise added during
training. Generally, more noise results in better privacy and lower utility.
This generally
* `steps` : The number of global steps taken.
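One way this computation is commonly invoked is sketched below; the module path and signature (`n`, `batch_size`, `noise_multiplier`, `epochs`, `delta`, from which `q` and `steps` are derived) are assumed from the TF Privacy tutorials, and the numbers are placeholders:

```python
# Hedged example: compute epsilon for a fixed delta after (or before) training.
# Assumes this module path and signature from the TF Privacy tutorials; values are toy.
from tensorflow_privacy.privacy.analysis import compute_dp_sgd_privacy

eps, opt_order = compute_dp_sgd_privacy.compute_dp_sgd_privacy(
    n=60000,               # training set size; q = batch_size / n
    batch_size=256,        # used to derive the sampling ratio q
    noise_multiplier=1.1,  # same value passed to the DP optimizer
    epochs=15,             # used to derive steps = epochs * n / batch_size
    delta=1e-5)            # rule of thumb: delta smaller than 1 / n

print(f"epsilon = {eps:.2f} at delta = 1e-5 (RDP order {opt_order})")
```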

A detailed writeup of the theory behind the computation of epsilon and delta is