Change README example to use Google DP for accounting instead of deprecated privacy/analysis/rdp_accountant functions.

PiperOrigin-RevId: 449820802
2022-05-19 13:29:15 -07:00
commit 5509adb296
parent f739f45299
1 changed file with 43 additions and 29 deletions
--- a/tutorials/walkthrough/README.md
+++ b/tutorials/walkthrough/README.md
@@ -328,12 +328,12 @@ memorized and the privacy of the individual who contributed this data point to
 our dataset is respected. We often refer to this probability as the privacy
 budget: smaller privacy budgets correspond to stronger privacy guarantees.
-Accounting required to compute the privacy budget spent to train our machine
-learning model is another feature provided by TF Privacy. Knowing what level of
-differential privacy was achieved allows us to put into perspective the drop in
-utility that is often observed when switching to differentially private
-optimization. It also allows us to compare two models objectively to determine
-which of the two is more privacy-preserving than the other.
+Google's DP library can be used to compute the privacy budget spent to train our
+machine learning model. Knowing what level of differential privacy was achieved
+allows us to put into perspective the drop in utility that is often observed
+when switching to differentially private optimization. It also allows us to
+compare two models objectively to determine which of the two is more
+privacy-preserving than the other.
 Before we derive a bound on the privacy guarantee achieved by our optimizer, we
 first need to identify all the parameters that are relevant to measuring the
@@ -378,37 +378,51 @@ We will express our differential privacy guarantee using two parameters:
    However, this is only an upper bound, and a large value of epsilon could
    still mean good practical privacy.
-The TF Privacy library provides two methods relevant to derive privacy
-guarantees achieved from the three parameters outlined in the last code snippet:
-`compute_rdp` and `get_privacy_spent`. These methods are found in its
-`analysis.rdp_accountant` module. Here is how to use them.
+To compute the privacy spent using the Google DP library, we need to define a
+`PrivacyAccountant` and a `DpEvent`. The `PrivacyAccountant` specifies what
+method of privacy accounting will be used. In our case that will be RDP, so we
+use the `RdpAccountant`. The `DpEvent` is a representation of the log of
+privacy-impacting actions that have occurred, in our case, the repeated sampling
+of records and estimation of their mean with Gaussian noise added.
-First, we need to define a list of orders, at which the Rényi divergence will be
-computed. While some finer points of how to use the RDP accountant are outside
-the scope of this document, it is useful to keep in mind the following. First,
-there is very little downside in expanding the list of orders for which RDP is
-computed. Second, the computed privacy budget is typically not very sensitive to
-the exact value of the order (being close enough will land you in the right
-neighborhood). Finally, if you are targeting a particular range of epsilons
-(say, 1—10) and your delta is fixed (say, `10^-5`), then your orders must cover
-the range between `1+ln(1/delta)/10≈2.15` and `1+ln(1/delta)/1≈12.5`. This last
-rule may appear circular (how do you know what privacy parameters you get
-without running the privacy accountant?!), one or two adjustments of the range
-of the orders would usually suffice.
+To initialize the `PrivacyAccountant`, we need to define a list of orders, at
+which the Rényi divergence will be computed. While some finer points of how to
+use the RDP accountant are outside the scope of this document, it is useful to
+keep in mind the following. First, there is very little downside in expanding
+the list of orders for which RDP is computed. Second, the computed privacy
+budget is typically not very sensitive to the exact value of the order (being
+close enough will land you in the right neighborhood). Finally, if you are
+targeting a particular range of epsilons (say, 1—10) and your delta is fixed
+(say, `10^-5`), then your orders must cover the range between
+`1+ln(1/delta)/10≈2.15` and `1+ln(1/delta)/1≈12.5`. This last rule may appear
+circular (how do you know what privacy parameters you get without running the
+privacy accountant?!), one or two adjustments of the range of the orders would
+usually suffice.
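Reviewer note: the order-range rule in the paragraph above can be checked with the standard library alone. This is a minimal sketch and not part of the commit: `required_order_range` is a hypothetical helper name, and the epsilon range of 1 to 10 and the delta of `10^-5` are the example values quoted in the text.

```python
import math


def required_order_range(eps_low, eps_high, delta):
    """Return (lowest, highest) Renyi order given by 1 + ln(1/delta) / epsilon."""
    log_inv_delta = math.log(1.0 / delta)
    # A larger target epsilon needs a smaller order, and vice versa.
    lowest = 1.0 + log_inv_delta / eps_high
    highest = 1.0 + log_inv_delta / eps_low
    return lowest, highest


# Example values from the text: epsilon between 1 and 10, delta = 1e-5.
low, high = required_order_range(eps_low=1.0, eps_high=10.0, delta=1e-5)

# The list of orders used in the README snippet.
orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))

# True when the list spans the whole suggested range.
covered = min(orders) <= low and max(orders) >= high
print(f"suggested range: [{low:.2f}, {high:.2f}], covered: {covered}")
```

If `covered` comes out false for a different target range, extending the list at the missing end is the kind of adjustment the paragraph describes.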
 ```python
 orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))
-rdp = compute_rdp(q=sampling_probability,
-                  noise_multiplier=FLAGS.noise_multiplier,
-                  steps=steps,
-                  orders=orders)
+accountant = privacy_accountant.RdpAccountant(orders)
 ```
-Then, the method `get_privacy_spent` computes the best `epsilon` for a given
-`target_delta` value of delta by taking the minimum over all orders.
+Next we create a `DpEvent` and feed it to the accountant for processing using
+its `compose` method:
 ```python
-epsilon = get_privacy_spent(orders, rdp, target_delta=1e-5)[0]
+event = dp_event.SelfComposedDpEvent(
+    event=dp_event.PoissonSampledDpEvent(
+        sampling_probability=q,
+        event=dp_event.GaussianDpEvent(noise_multiplier)
+    ),
+    count=steps)
+accountant.compose(event)
+```
+Finally, we can query the accountant for the best `epsilon` at the given
+`target_delta` by calling the `get_epsilon` method which takes the minimum over
+all orders.
+```python
+epsilon = accountant.get_epsilon(target_delta)
 ```
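Reviewer note: to illustrate what taking the minimum over all orders means, here is a standalone sketch that does not use the Google DP library. It assumes the plain Gaussian mechanism with no Poisson sampling and the classic RDP-to-DP conversion `rdp + ln(1/delta) / (order - 1)`. The helpers `gaussian_rdp` and `epsilon_from_rdp` are hypothetical names, and the noise multiplier and step count are made-up values. The library's `RdpAccountant` accounts for sampling and may use a tighter conversion, so the `epsilon` it reports will differ.

```python
import math


def gaussian_rdp(noise_multiplier, steps, orders):
    """RDP of `steps` compositions of an unsampled Gaussian mechanism.

    For sensitivity 1, one step costs order / (2 * sigma^2) at each order,
    and RDP composes by simple addition.
    """
    per_step = [a / (2.0 * noise_multiplier ** 2) for a in orders]
    return [steps * r for r in per_step]


def epsilon_from_rdp(orders, rdp, target_delta):
    """Classic conversion: the smallest rdp + ln(1/delta) / (order - 1)."""
    log_inv_delta = math.log(1.0 / target_delta)
    candidates = [r + log_inv_delta / (a - 1.0) for a, r in zip(orders, rdp)]
    best = min(range(len(orders)), key=candidates.__getitem__)
    return candidates[best], orders[best]


# Same list of orders as the README snippet; every order is above 1.
orders = [1 + x / 10. for x in range(1, 100)] + list(range(12, 64))

# Hypothetical values chosen only for illustration.
rdp = gaussian_rdp(noise_multiplier=4.0, steps=10, orders=orders)
epsilon, best_order = epsilon_from_rdp(orders, rdp, target_delta=1e-5)
print(f"epsilon = {epsilon:.3f} at order {best_order:.1f}")
```

Each order yields a valid epsilon for the same delta, which is why the smallest candidate can be reported and why adding more orders can only keep the result the same or improve it.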
 Running the code snippets above with the hyperparameter values used during