Change README example to use Google DP for accounting instead of deprecated privacy/analysis/rdp_accountant functions.
PiperOrigin-RevId: 449820802
parent f739f45299
commit 5509adb296
1 changed file with 43 additions and 29 deletions
@@ -328,12 +328,12 @@ memorized and the privacy of the individual who contributed this data point to
 our dataset is respected. We often refer to this probability as the privacy
 budget: smaller privacy budgets correspond to stronger privacy guarantees.
 
-Accounting required to compute the privacy budget spent to train our machine
-learning model is another feature provided by TF Privacy. Knowing what level of
-differential privacy was achieved allows us to put into perspective the drop in
-utility that is often observed when switching to differentially private
-optimization. It also allows us to compare two models objectively to determine
-which of the two is more privacy-preserving than the other.
+Google's DP library can be used to compute the privacy budget spent to train our
+machine learning model. Knowing what level of differential privacy was achieved
+allows us to put into perspective the drop in utility that is often observed
+when switching to differentially private optimization. It also allows us to
+compare two models objectively to determine which of the two is more
+privacy-preserving than the other.
 
 Before we derive a bound on the privacy guarantee achieved by our optimizer, we
 first need to identify all the parameters that are relevant to measuring the
@@ -378,37 +378,51 @@ We will express our differential privacy guarantee using two parameters:
 However, this is only an upper bound, and a large value of epsilon could
 still mean good practical privacy.
 
-The TF Privacy library provides two methods relevant to derive privacy
-guarantees achieved from the three parameters outlined in the last code snippet:
-`compute_rdp` and `get_privacy_spent`. These methods are found in its
-`analysis.rdp_accountant` module. Here is how to use them.
+To compute the privacy spent using the Google DP library, we need to define a
+`PrivacyAccountant` and a `DpEvent`. The `PrivacyAccountant` specifies what
+method of privacy accounting will be used. In our case that will be RDP, so we
+use the `RdpAccountant`. The `DpEvent` is a representation of the log of
+privacy-impacting actions that have occurred, in our case, the repeated sampling
+of records and estimation of their mean with Gaussian noise added.
 
-First, we need to define a list of orders, at which the Rényi divergence will be
-computed. While some finer points of how to use the RDP accountant are outside
-the scope of this document, it is useful to keep in mind the following. First,
-there is very little downside in expanding the list of orders for which RDP is
-computed. Second, the computed privacy budget is typically not very sensitive to
-the exact value of the order (being close enough will land you in the right
-neighborhood). Finally, if you are targeting a particular range of epsilons
-(say, 1—10) and your delta is fixed (say, `10^-5`), then your orders must cover
-the range between `1+ln(1/delta)/10≈2.15` and `1+ln(1/delta)/1≈12.5`. This last
-rule may appear circular (how do you know what privacy parameters you get
-without running the privacy accountant?!), one or two adjustments of the range
-of the orders would usually suffice.
+To initialize the `PrivacyAccountant`, we need to define a list of orders, at
+which the Rényi divergence will be computed. While some finer points of how to
+use the RDP accountant are outside the scope of this document, it is useful to
+keep in mind the following. First, there is very little downside in expanding
+the list of orders for which RDP is computed. Second, the computed privacy
+budget is typically not very sensitive to the exact value of the order (being
+close enough will land you in the right neighborhood). Finally, if you are
+targeting a particular range of epsilons (say, 1—10) and your delta is fixed
+(say, `10^-5`), then your orders must cover the range between
+`1+ln(1/delta)/10≈2.15` and `1+ln(1/delta)/1≈12.5`. This last rule may appear
+circular (how do you know what privacy parameters you get without running the
+privacy accountant?!), one or two adjustments of the range of the orders would
+usually suffice.
 
 ```python
 orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
-rdp = compute_rdp(q=sampling_probability,
-                  noise_multiplier=FLAGS.noise_multiplier,
-                  steps=steps,
-                  orders=orders)
+accountant = privacy_accountant.RdpAccountant(orders)
 ```
 
-Then, the method `get_privacy_spent` computes the best `epsilon` for a given
-`target_delta` value of delta by taking the minimum over all orders.
+Next we create a `DpEvent` and feed it to the accountant for processing using
+its `compose` method:
 
 ```python
-epsilon = get_privacy_spent(orders, rdp, target_delta=1e-5)[0]
+event = dp_event.SelfComposedDpEvent(
+    event=dp_event.PoissonSampledDpEvent(
+        sampling_probability=q,
+        event=dp_event.GaussianDpEvent(noise_multiplier)
+    ),
+    count=steps)
+accountant.compose(event)
+```
+
+Finally, we can query the accountant for the best `epsilon` at the given
+`target_delta` by calling the `get_epsilon` method which takes the minimum over
+all orders.
+
+```python
+epsilon = accountant.get_epsilon(target_delta)
 ```
 
 Running the code snippets above with the hyperparameter values used during
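The two rules of thumb in the changed README text (which orders to cover, and taking the minimum epsilon over all orders) can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helper names `order_range` and `gaussian_epsilon` are hypothetical, Poisson subsampling is ignored (so the result is a looser bound than the accountant would report for `q < 1`), and the conversion used is the classic `eps = rdp + ln(1/delta) / (order - 1)`, whereas the real accountant may use a tighter one.

```python
import math


def order_range(delta, eps_low, eps_high):
    """Smallest and largest Renyi order suggested by 1 + ln(1/delta)/epsilon.

    Hypothetical helper: it only evaluates the README's rule of thumb for a
    target epsilon range [eps_low, eps_high] at a fixed delta.
    """
    log_inv_delta = math.log(1.0 / delta)
    return 1.0 + log_inv_delta / eps_high, 1.0 + log_inv_delta / eps_low


def gaussian_epsilon(noise_multiplier, steps, orders, delta):
    """Epsilon for `steps` compositions of an unsampled Gaussian mechanism.

    Assumptions (not the library's method): no subsampling, sensitivity 1,
    and the classic RDP-to-DP conversion.
    """
    log_inv_delta = math.log(1.0 / delta)
    candidates = []
    for order in orders:
        # RDP of the Gaussian mechanism at this order; composition adds up.
        rdp = steps * order / (2.0 * noise_multiplier ** 2)
        candidates.append(rdp + log_inv_delta / (order - 1.0))
    # Report the best (smallest) epsilon over all orders.
    return min(candidates)


# Same list of orders as in the README snippet.
orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))

low, high = order_range(delta=1e-5, eps_low=1.0, eps_high=10.0)
print(f"orders should cover roughly [{low:.2f}, {high:.2f}]")
print(gaussian_epsilon(noise_multiplier=4.0, steps=1, orders=orders, delta=1e-5))
```

Because the sketch drops the sampling step, it should not be used to report a privacy budget for DP-SGD; it only shows why a wider list of orders cannot make the reported epsilon worse, since the result is a minimum over the list.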