Update narrative content.

PiperOrigin-RevId: 394558889
This commit is contained in:
A. Unique TensorFlower 2021-09-02 15:37:43 -07:00
parent fc7504efca
commit b7249e6ab2
3 changed files with 39 additions and 29 deletions


@@ -13,16 +13,21 @@ landing_page:
   - classname: devsite-landing-row-50
     description: >
       <p>
-      Preventing ML models from exposing potentially sensitive information is a critical part of
-      using AI responsibly. To that end, <i>differentially private stochastic gradient descent
-      (DP-SGD)</i> is a modification to the standard stochastic gradient descent (SGD) algorithm
-      in machine learning. </p>
-      <p>Models trained with DP-SGD have provable differential privacy (DP)
-      guarantees, mitigating the risk of exposing sensitive training data. Intuitively, a model
-      trained with differential privacy should not be affected by any single training example in
-      its data set. DP-SGD techniques can also be used in federated learning to provide user-level
-      differential privacy. You can learn more about differentially private deep learning in <a
-      href="https://arxiv.org/pdf/1607.00133.pdf">the original paper</a>.
+      An important aspect of responsible AI usage is ensuring that ML models are prevented from
+      exposing potentially sensitive information, such as demographic information or other
+      attributes in the training dataset that could be used to identify people.
+      One way to achieve this is by using differentially private stochastic gradient descent
+      (DP-SGD), which is a modification to the standard stochastic gradient descent (SGD)
+      algorithm in machine learning.
+      </p>
+      <p>
+      Models trained with DP-SGD have measurable differential privacy (DP) improvements, which
+      helps mitigate the risk of exposing sensitive training data. Since the purpose of DP is
+      to help prevent individual data points from being identified, a model trained with DP
+      should not be affected by any single training example in its training data set. DP-SGD
+      techniques can also be used in federated learning to provide user-level differential privacy.
+      You can learn more about differentially private deep learning in
+      <a href="https://arxiv.org/pdf/1607.00133.pdf">the original paper</a>.
       </p>
   - code_block: |
@@ -58,14 +63,19 @@ landing_page:
   items:
   - classname: devsite-landing-row-100
     description: >
-      <p>Tensorflow Privacy (TF Privacy) is an open source library developed by teams in Google
-      Research. The library includes implementations of commonly used TensorFlow Optimizers for
-      training ML models with DP. The goal is to enable ML practitioners using standard Tensorflow
-      APIs to train privacy-preserving models by changing only a few lines of code.</p>
-      <p> The differentially private Optimizers can be used in conjunction with high-level APIs
+      <p>
+      Tensorflow Privacy (TF Privacy) is an open source library developed by teams in
+      Google Research. The library includes implementations of commonly used TensorFlow
+      Optimizers for training ML models with DP. The goal is to enable ML practitioners
+      using standard Tensorflow APIs to train privacy-preserving models by changing only a
+      few lines of code.
+      </p>
+      <p>
+      The differentially private optimizers can be used in conjunction with high-level APIs
       that use the Optimizer class, especially Keras. Additionally, you can find differentially
       private implementations of some Keras models. All of the Optimizers and models can be found
-      in the <a href="./privacy/api">API Documentation</a>.</p>
+      in the <a href="./privacy/api_docs/python/tf_privacy">API Documentation</a>.
+      </p>
   - classname: devsite-landing-row-cards
     items:
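
The description above says that practitioners can train privacy-preserving models by changing only a few lines of code. A minimal sketch of that change, assuming the `DPKerasSGDOptimizer` class and hyperparameter names (`l2_norm_clip`, `noise_multiplier`, `num_microbatches`) from the `tensorflow_privacy` package; all values are placeholders:

```python
import tensorflow as tf
import tensorflow_privacy

# DP variant of the Keras SGD optimizer: clips each microbatch gradient to
# l2_norm_clip and adds Gaussian noise scaled by noise_multiplier.
optimizer = tensorflow_privacy.DPKerasSGDOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    num_microbatches=250,  # should evenly divide the batch size
    learning_rate=0.25)

# The loss must be vectorized (one value per example, no reduction) so the
# optimizer can work with per-example gradients.
loss = tf.keras.losses.CategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)
```

Any standard `tf.keras` model can then be compiled with this optimizer and loss in place of the non-private ones.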


@@ -1,11 +1,12 @@
 # Get Started

 This document assumes you are already familiar with differential privacy, and
-have determined that you would like to implement TF Privacy to achieve
-differential privacy guarantees in your model(s). If you're not familiar with
-differential privacy, please review
+have determined that you would like to use TF Privacy to implement differential
+privacy guarantees in your model(s). If you're not familiar with differential
+privacy, please review
 [the overview page](https://tensorflow.org/responsible_ai/privacy/guide). After
-installing TF Privacy get started by following these steps:
+installing TF Privacy, get started by following these steps:

 ## 1. Choose a differentially private version of an existing Optimizer
@@ -36,9 +37,9 @@ microbatches.
 Train your model using the DP Optimizer (step 1) and vectorized loss (step 2).
 There are two options for doing this:

-- Pass the optimizer and loss as arguments to `Model.compile` before calling
+* Pass the optimizer and loss as arguments to `Model.compile` before calling
   `Model.fit`.
-- When writing a custom training loop, use `Optimizer.minimize()` on the
+* When writing a custom training loop, use `Optimizer.minimize()` on the
   vectorized loss.

 Once this is done, it's recommended that you tune your hyperparameters. For a
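
To illustrate the first option in the list above (a sketch that reuses the hypothetical `model`, `optimizer`, and vectorized `loss` objects from the earlier snippet; the dataset names are placeholders):

```python
# Option 1: pass the DP optimizer and vectorized loss to Model.compile,
# then train with Model.fit as usual.
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
model.fit(train_data, train_labels,
          epochs=15,
          batch_size=250,  # keep divisible by num_microbatches
          validation_data=(test_data, test_labels))
```

For the second option, a custom training loop would compute the per-example loss tensor under a `tf.GradientTape` and hand that vectorized loss to `Optimizer.minimize()` rather than reducing it to a scalar first.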
@@ -65,7 +66,7 @@ The three new DP-SGD hyperparameters have the following effects and tradeoffs:
    utility because it lowers the standard deviation of the noise. However, it
    will slow down training in terms of time.
 2. The clipping norm $C$: Since the standard deviation of the noise scales with
-   C, it is probably best to set $C$ to be some quantile (e.g. median, 75th
+   $C$, it is probably best to set $C$ to be some quantile (e.g. median, 75th
    percentile, 90th percentile) of the gradient norms. Having too large a value
    of $C$ adds unnecessarily large amounts of noise.
 3. The noise multiplier $σ$: Of the three hyperparameters, the amount of
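
To make the quantile guidance for $C$ concrete, here is a small illustrative sketch; it assumes you have already recorded per-example gradient norms from a short preliminary (non-private) run, and the helper name is hypothetical rather than part of TF Privacy:

```python
import numpy as np

def clipping_norm_from_quantile(per_example_grad_norms, quantile=75.0):
  """Hypothetical helper: choose C as a quantile (median, 75th, 90th, ...)
  of per-example gradient norms observed during a preliminary run."""
  return float(np.percentile(per_example_grad_norms, quantile))

# Placeholder norms recorded from a preliminary run:
observed_norms = [0.8, 0.9, 1.1, 1.2, 2.0, 3.5]
l2_norm_clip = clipping_norm_from_quantile(observed_norms)  # 75th percentile -> 1.8
```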


@@ -2,12 +2,12 @@
 Differential privacy is a framework for measuring the privacy guarantees
 provided by an algorithm and can be expressed using the values ε (epsilon) and δ
-(delta). Of the two, ε is the more important and more sensitive to the choice of
+(delta). Of the two, ε is more important and more sensitive to the choice of
 hyperparameters. Roughly speaking, they mean the following:

 * ε gives a ceiling on how much the probability of a particular output can
   increase by including (or removing) a single training example. You usually
-  want it to be a small constant (less than 10, or, for more stringent privacy
+  want it to be a small constant (less than 10, or for more stringent privacy
   guarantees, less than 1). However, this is only an upper bound, and a large
   value of epsilon may still mean good practical privacy.
 * δ bounds the probability of an arbitrary change in model behavior. You can
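
For reference, the informal ε and δ descriptions in the hunk above correspond to the standard (ε, δ)-differential-privacy guarantee; a sketch of the usual statement, which the page itself does not spell out:

```latex
% For any two datasets D and D' that differ in a single training example,
% and any set S of possible outputs of the randomized training mechanism M:
\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```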
@@ -30,17 +30,16 @@ dataset size and number of epochs. See the
 [classification privacy tutorial](../tutorials/classification_privacy.ipynb) to
 see the approach.

-For more detail, you can see
+For more detail, see
 [the original DP-SGD paper](https://arxiv.org/pdf/1607.00133.pdf).

-You can use `compute_dp_sgd_privacy`, to find out the epsilon given a fixed
-delta value for your model [../tutorials/classification_privacy.ipynb]:
+You can use `compute_dp_sgd_privacy` to find out the epsilon given a fixed delta
+value for your model [../tutorials/classification_privacy.ipynb]:

 * `q` : the sampling ratio - the probability of an individual training point
   being included in a mini batch (`batch_size/number_of_examples`).
 * `noise_multiplier` : A float that governs the amount of noise added during
   training. Generally, more noise results in better privacy and lower utility.
-  This generally
 * `steps` : The number of global steps taken.

 A detailed writeup of the theory behind the computation of epsilon and delta is
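
As a hedged sketch of how this accounting is typically invoked, assuming the tutorial-style `compute_dp_sgd_privacy(n, batch_size, noise_multiplier, epochs, delta)` signature, which derives the sampling ratio `q` and the step count from the dataset size, batch size, and number of epochs; all numbers are placeholders:

```python
from tensorflow_privacy.privacy.analysis import compute_dp_sgd_privacy

# Placeholders: 60,000 training examples, batch size 250, 15 epochs, sigma = 1.1.
# q = 250 / 60000 and steps = 15 * 60000 / 250 are derived from these inputs.
compute_dp_sgd_privacy.compute_dp_sgd_privacy(
    n=60000,
    batch_size=250,
    noise_multiplier=1.1,
    epochs=15,
    delta=1e-5)  # reports the epsilon achieved at this fixed delta
```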