Add more documentation for gradient_accumulation_steps in keras optimizer.

PiperOrigin-RevId: 469310667
Shuang Song 2022-08-22 16:16:12 -07:00 committed by A. Unique TensorFlower
parent 9e25eee68b
commit 9f4feade7d

@@ -105,11 +105,23 @@ def make_keras_optimizer_class(cls):
opt.minimize(loss, var_list=[var])
```
- Note that when using this feature effective batch size is
- `gradient_accumulation_steps * one_step_batch_size` where
- `one_step_batch_size` size of the batch which is passed to single step
- of the optimizer. Thus user may have to adjust learning rate, weight decay
- and possibly other training hyperparameters accordingly.
+ Note that when using this feature:
+ 1. The effective batch size is
+    `gradient_accumulation_steps * one_step_batch_size`, where
+    `one_step_batch_size` is the size of the batch passed to a single step of
+    the optimizer. Thus the user may have to adjust the learning rate, weight
+    decay and possibly other training hyperparameters accordingly.
+ 2. The effective noise (the noise to be used in the privacy computation) is
+    `noise_multiplier * sqrt(gradient_accumulation_steps)`, because the
+    optimizer adds noise scaled by `self._noise_multiplier` at every step.
+    Thus the user may have to adjust the `noise_multiplier` or the privacy
+    computation accordingly.
+ Additionally, the user may need to adjust the batch size in the data
+ generator, or the number of calls to the data generator, depending on the
+ training framework used. For example, when using Keras `model.fit(...)` with
+ a user-defined data generator, one may need to make the data generator
+ return `one_step_batch_size` examples per call and scale `steps_per_epoch`
+ by `gradient_accumulation_steps`. This is because the data generator is
+ called `steps_per_epoch` times per epoch, and each call now returns only
+ `one_step_batch_size` (instead of `effective_batch_size`) examples.
""".format(
base_class='tf.keras.optimizers.' + cls.__name__,
short_base_class=cls.__name__,
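
To make point 1 of the new docstring concrete, here is a minimal sketch of the effective batch size bookkeeping and a possible learning-rate adjustment. The concrete numbers and the linear rescaling heuristic are illustrative assumptions, not part of the library.

```python
# Hypothetical values, chosen only to illustrate the arithmetic.
gradient_accumulation_steps = 4
one_step_batch_size = 32

# The batch size that matters for hyperparameter tuning is the effective one.
effective_batch_size = gradient_accumulation_steps * one_step_batch_size  # 128

# If the learning rate was tuned for one_step_batch_size, it may need to be
# rescaled for the larger effective batch; linear scaling is a common
# heuristic, not a rule.
base_learning_rate = 0.01
adjusted_learning_rate = base_learning_rate * gradient_accumulation_steps
```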
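Point 2 amounts to the following arithmetic when calibrating the privacy guarantee; the variable values are illustrative, and the actual accountant call is omitted.

```python
import math

noise_multiplier = 1.1           # value passed to the DP Keras optimizer
gradient_accumulation_steps = 4

# Noise scaled by noise_multiplier is added at every accumulation step, so the
# noise on the accumulated gradient grows with the square root of the number
# of accumulation steps.
effective_noise_multiplier = noise_multiplier * math.sqrt(
    gradient_accumulation_steps)

# Per the docstring, this effective value (together with the effective batch
# size) is what should enter the privacy computation, not the raw
# noise_multiplier.
print(f"effective noise multiplier: {effective_noise_multiplier:.3f}")
```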
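The `model.fit(...)` adjustment described in the docstring's last paragraph could look roughly like the sketch below. The toy data, model, and the `DPKerasSGDOptimizer` hyperparameter values are assumptions for illustration; the intended takeaway is only the relationship between `one_step_batch_size`, `steps_per_epoch`, and `gradient_accumulation_steps`.

```python
import numpy as np
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers import dp_optimizer_keras

num_examples = 1024
gradient_accumulation_steps = 4
effective_batch_size = 128
one_step_batch_size = effective_batch_size // gradient_accumulation_steps  # 32

# Toy data, for illustration only.
x = np.random.normal(size=(num_examples, 10)).astype(np.float32)
y = np.random.randint(0, 2, size=(num_examples, 1)).astype(np.float32)


def data_generator():
  # Each yield returns only one_step_batch_size examples, i.e. the batch that
  # a single optimizer step consumes.
  while True:
    idx = np.random.choice(num_examples, one_step_batch_size, replace=False)
    yield x[idx], y[idx]


optimizer = dp_optimizer_keras.DPKerasSGDOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    gradient_accumulation_steps=gradient_accumulation_steps,
    learning_rate=0.05)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation='sigmoid')])
# The DP Keras optimizers expect a per-example (unreduced) loss.
model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.BinaryCrossentropy(
        reduction=tf.keras.losses.Reduction.NONE))

# One epoch should still cover num_examples // effective_batch_size effective
# batches, so steps_per_epoch is scaled up by gradient_accumulation_steps.
steps_per_epoch = (num_examples // effective_batch_size
                   ) * gradient_accumulation_steps

model.fit(data_generator(), steps_per_epoch=steps_per_epoch, epochs=1)
```

With this setup, every `gradient_accumulation_steps` consecutive generator batches together form one effective batch of `effective_batch_size` examples, which is the quantity that hyperparameter tuning and the privacy computation should refer to.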