Update narrative content.

PiperOrigin-RevId: 394558889
A. Unique TensorFlower 2021-09-02 15:37:43 -07:00
parent fc7504efca
commit b7249e6ab2
3 changed files with 39 additions and 29 deletions


@@ -13,16 +13,21 @@ landing_page:
- classname: devsite-landing-row-50
description: >
<p>
Preventing ML models from exposing potentially sensitive information is a critical part of
using AI responsibly. To that end, <i>differentially private stochastic gradient descent
(DP-SGD)</i> is a modification to the standard stochastic gradient descent (SGD) algorithm
in machine learning. </p>
<p>Models trained with DP-SGD have provable differential privacy (DP)
guarantees, mitigating the risk of exposing sensitive training data. Intuitively, a model
trained with differential privacy should not be affected by any single training example in
its data set. DP-SGD techniques can also be used in federated learning to provide user-level
differential privacy. You can learn more about differentially private deep learning in <a
href="https://arxiv.org/pdf/1607.00133.pdf">the original paper</a>.
An important aspect of responsible AI usage is ensuring that ML models do not expose
potentially sensitive information, such as demographic information or other attributes
in the training dataset that could be used to identify people.
One way to achieve this is by using differentially private stochastic gradient descent
(DP-SGD), which is a modification to the standard stochastic gradient descent (SGD)
algorithm in machine learning.
</p>
<p>
Models trained with DP-SGD have measurable differential privacy (DP) improvements, which
help mitigate the risk of exposing sensitive training data. Since the purpose of DP is
to help prevent individual data points from being identified, a model trained with DP
should not be affected by any single training example in its training data set. DP-SGD
techniques can also be used in federated learning to provide user-level differential privacy.
You can learn more about differentially private deep learning in
<a href="https://arxiv.org/pdf/1607.00133.pdf">the original paper</a>.
</p>
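Concretely, the DP-SGD step described in the paper clips each per-example gradient $g_i$ to a norm bound $C$ and adds Gaussian noise with multiplier $\sigma$ before averaging over a batch of size $B$ (this notation follows the paper and is assumed here, not defined on this page):

$$
\bar{g}_i = \frac{g_i}{\max\!\left(1,\ \lVert g_i \rVert_2 / C\right)},
\qquad
\tilde{g} = \frac{1}{B}\left(\sum_{i=1}^{B} \bar{g}_i + \mathcal{N}\!\left(0,\ \sigma^2 C^2 I\right)\right)
$$

The model parameters are then updated with $\tilde{g}$ in place of the ordinary averaged minibatch gradient.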
- code_block: |
@@ -58,14 +63,19 @@ landing_page:
items:
- classname: devsite-landing-row-100
description: >
<p>Tensorflow Privacy (TF Privacy) is an open source library developed by teams in Google
Research. The library includes implementations of commonly used TensorFlow Optimizers for
training ML models with DP. The goal is to enable ML practitioners using standard Tensorflow
APIs to train privacy-preserving models by changing only a few lines of code.</p>
<p> The differentially private Optimizers can be used in conjunction with high-level APIs
<p>
TensorFlow Privacy (TF Privacy) is an open source library developed by teams in
Google Research. The library includes implementations of commonly used TensorFlow
Optimizers for training ML models with DP. The goal is to enable ML practitioners
using standard TensorFlow APIs to train privacy-preserving models by changing only a
few lines of code.
</p>
<p>
The differentially private optimizers can be used in conjunction with high-level APIs
that use the Optimizer class, especially Keras. Additionally, you can find differentially
private implementations of some Keras models. All of the Optimizers and models can be found
in the <a href="./privacy/api">API Documentation</a>.</p>
in the <a href="./privacy/api_docs/python/tf_privacy">API Documentation</a>.</p>
</p>
- classname: devsite-landing-row-cards
items:
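As a rough sketch of the "few lines of code" change described above (the model, hyperparameter values, and data are placeholders, and `DPKerasSGDOptimizer` is assumed to be exported at the top level of the installed `tensorflow_privacy` release):

```python
import tensorflow as tf
import tensorflow_privacy

# Placeholder Keras model; any model built from standard Keras layers works.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10),
])

# The substantive change: a differentially private optimizer and a
# per-example (vectorized) loss.
optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=1.0,       # clipping norm C
    noise_multiplier=1.1,   # noise multiplier sigma
    num_microbatches=32,    # should evenly divide the batch size
    learning_rate=0.05)

loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=32, epochs=5)  # placeholder data
```

Apart from the optimizer and the unreduced loss, the rest of the Keras workflow is unchanged.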


@@ -1,11 +1,12 @@
# Get Started
This document assumes you are already familiar with differential privacy, and
have determined that you would like to implement TF Privacy to achieve
differential privacy guarantees in your model(s). If you're not familiar with
differential privacy, please review
have determined that you would like to use TF Privacy to implement differential
privacy guarantees in your model(s). If you're not familiar with differential
privacy, please review
[the overview page](https://tensorflow.org/responsible_ai/privacy/guide). After
installing TF Privacy get started by following these steps:
installing TF Privacy, get started by following these steps:
## 1. Choose a differentially private version of an existing Optimizer
@@ -36,9 +37,9 @@ microbatches.
Train your model using the DP Optimizer (step 1) and vectorized loss (step 2).
There are two options for doing this:
- Pass the optimizer and loss as arguments to `Model.compile` before calling
* Pass the optimizer and loss as arguments to `Model.compile` before calling
`Model.fit`.
- When writing a custom training loop, use `Optimizer.minimize()` on the
* When writing a custom training loop, use `Optimizer.minimize()` on the
vectorized loss.
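A minimal sketch of the second option, using a toy model and one synthetic batch (all names and values are placeholders, not taken from this guide):

```python
import numpy as np
import tensorflow as tf
import tensorflow_privacy

# Toy model; the Input layer builds the variables up front.
model = tf.keras.Sequential([tf.keras.Input(shape=(20,)),
                             tf.keras.layers.Dense(10)])

optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=1.0, noise_multiplier=1.1,
    num_microbatches=32, learning_rate=0.05)

# Vectorized loss: reduction is disabled so per-example losses are returned.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

# One synthetic batch standing in for a real input pipeline.
features = np.random.normal(size=(32, 20)).astype('float32')
labels = np.random.randint(0, 10, size=(32,))

# Custom training step: the loss is passed as a callable so the DP optimizer
# can clip and noise per-example gradients before applying the update.
optimizer.minimize(
    lambda: loss_fn(labels, model(features, training=True)),
    var_list=model.trainable_variables)
```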
Once this is done, it's recommended that you tune your hyperparameters. For a
@@ -65,7 +66,7 @@ The three new DP-SGD hyperparameters have the following effects and tradeoffs:
utility because it lowers the standard deviation of the noise. However, it
will slow down training in terms of time.
2. The clipping norm $C$: Since the standard deviation of the noise scales with
C, it is probably best to set $C$ to be some quantile (e.g. median, 75th
$C$, it is probably best to set $C$ to be some quantile (e.g. median, 75th
percentile, 90th percentile) of the gradient norms. Having too large a value
of $C$ adds unnecessarily large amounts of noise.
3. The noise multiplier $σ$: Of the three hyperparameters, the amount of


@@ -2,12 +2,12 @@
Differential privacy is a framework for measuring the privacy guarantees
provided by an algorithm and can be expressed using the values ε (epsilon) and δ
(delta). Of the two, ε is the more important and more sensitive to the choice of
(delta). Of the two, ε is more important and more sensitive to the choice of
hyperparameters. Roughly speaking, they mean the following:
* ε gives a ceiling on how much the probability of a particular output can
increase by including (or removing) a single training example. You usually
want it to be a small constant (less than 10, or, for more stringent privacy
want it to be a small constant (less than 10, or for more stringent privacy
guarantees, less than 1). However, this is only an upper bound, and a large
value of epsilon may still mean good practical privacy.
* δ bounds the probability of an arbitrary change in model behavior. You can
@@ -30,17 +30,16 @@ dataset size and number of epochs. See the
[classification privacy tutorial](../tutorials/classification_privacy.ipynb) to
see the approach.
For more detail, you can see
For more detail, see
[the original DP-SGD paper](https://arxiv.org/pdf/1607.00133.pdf).
You can use `compute_dp_sgd_privacy`, to find out the epsilon given a fixed
delta value for your model [../tutorials/classification_privacy.ipynb]:
You can use `compute_dp_sgd_privacy` to find out the epsilon given a fixed delta
value for your model (see the
[classification privacy tutorial](../tutorials/classification_privacy.ipynb)):
* `q` : the sampling ratio - the probability of an individual training point
being included in a mini batch (`batch_size/number_of_examples`).
* `noise_multiplier` : A float that governs the amount of noise added during
training. Generally, more noise results in better privacy and lower utility.
* `steps` : The number of global steps taken.
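For example (a sketch; the helper is assumed to be importable from `tensorflow_privacy.privacy.analysis`, and all numbers are placeholders), the function takes the dataset size, batch size, and number of epochs, from which `q` and the number of steps are derived:

```python
from tensorflow_privacy.privacy.analysis import compute_dp_sgd_privacy

# Placeholder values: 60,000 training examples, batch size 256, 15 epochs,
# so q = 256 / 60000 and steps = 15 * 60000 / 256.
eps, opt_order = compute_dp_sgd_privacy.compute_dp_sgd_privacy(
    n=60000,
    batch_size=256,
    noise_multiplier=1.1,
    epochs=15,
    delta=1e-5)
print(f'epsilon = {eps:.2f} at delta = 1e-5')
```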
A detailed writeup of the theory behind the computation of epsilon and delta is