tensorflow_privacy/tutorials/Classification_Privacy.ipynb

{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Classification_Privacy.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/anirudh161/privacy/blob/add-dpsgd-keras-tutorial/tutorials/Classification_Privacy.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XAVN6c8prKOL",
"colab_type": "text"
},
"source": [
"##### Copyright 2019 The TensorFlow Authors.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "SassPC7WQAUO",
"colab_type": "code",
"cellView": "both",
"colab": {}
},
"source": [
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "KwDK47gfLsYf",
"colab_type": "text"
},
"source": [
"# Implement Differential Privacy with TensorFlow Privacy"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "MfBg1C5NB3X0"
},
"source": [
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/not_a_real_link\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/privacy/blob/master/tutorials/Classification_Privacy.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/docs/blob/master/tools/templates/notebook.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/docs/tools/templates/notebook.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "00fQV7e0Unz3",
"colab_type": "text"
},
"source": [
"## Overview"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TUphKzYu01O9",
"colab_type": "text"
},
"source": [
"[Differential privacy](https://en.wikipedia.org/wiki/Differential_privacy) (DP) is a framework that allows for measuring the privacy guarantees provided by a Machine Learning (ML) algorithm with respect to its input data. Recent advances allow the training of ML models with DP, greatly mitigating the risk of exposing sensitive training data in ML. Intuitively, a model trained with DP should not be affected by any single training example (or small set of training examples) in its dataset. \n",
"\n",
"The basic idea of this approach, called differentially private stochastic gradient descent (DP-SGD), is to modify the gradients\n",
"used in stochastic gradient descent (SGD), which lies at the core of almost all deep learning algorithms. Models trained with DP-SGD provide provable differential privacy guarantees for their input data. There are two modifications made to the vanilla SGD algorithm:\n",
"\n",
"1. The sensitivity of each gradient is bounded by *clipping* the gradient computed for each training point. This limits how much each training point can possibly impact model parameters.\n",
"2. *Random noise* is sampled and added to the clipped gradients to make it statistically impossible to know whether or not a particular data point was included in the training dataset by comparing the updates SGD applies when it operates with or without this particular data point in the training dataset.\n",
"\n",
"This tutorial uses [tf.keras](https://www.tensorflow.org/guide/keras) to train a convolutional neural network (CNN) to recognize handwritten digits with the DP-SGD optimizer provided by the TensorFlow Privacy library. TensorFlow Privacy provides code that wraps an existing TensorFlow optimizer to create a variant that implements DP-SGD."
]
},
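{
"cell_type": "markdown",
"metadata": {},
"source": [
"To build intuition for these two modifications, the following cell is a minimal NumPy-only sketch of a single DP-SGD-style update: each per-example gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled by the clipping threshold is added, and the result is averaged over the batch. This sketch is purely illustrative; the actual DP-SGD optimizer used later in this tutorial is provided by TensorFlow Privacy and additionally handles microbatching and integration with the training loop."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"def dp_sgd_update_sketch(per_example_grads, l2_norm_clip, noise_multiplier):\n",
"  # Illustrative sketch only; not the TensorFlow Privacy implementation.\n",
"  clipped = []\n",
"  for g in per_example_grads:\n",
"    # 1. Clip each per-example gradient so its L2 norm is at most l2_norm_clip.\n",
"    norm = np.linalg.norm(g)\n",
"    clipped.append(g * min(1.0, l2_norm_clip / (norm + 1e-12)))\n",
"  # 2. Add Gaussian noise whose scale is proportional to the clipping threshold.\n",
"  summed = np.sum(clipped, axis=0)\n",
"  noise = np.random.normal(scale=noise_multiplier * l2_norm_clip, size=summed.shape)\n",
"  return (summed + noise) / len(per_example_grads)\n",
"\n",
"example_grads = [np.random.randn(4) for _ in range(8)]\n",
"print(dp_sgd_update_sketch(example_grads, l2_norm_clip=1.5, noise_multiplier=1.3))"
],
"execution_count": 0,
"outputs": []
},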
{
"cell_type": "markdown",
"metadata": {
"id": "ijJYKVc05DYX",
"colab_type": "text"
},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ef56gCUqrdVn",
"colab_type": "code",
"colab": {}
},
"source": [
"from __future__ import absolute_import\n",
"from __future__ import division\n",
"from __future__ import print_function\n",
"\n",
"import tensorflow as tf\n",
"\n",
"import numpy as np"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "r_fVhfUyeI3d",
"colab_type": "text"
},
"source": [
"Install TensorFlow Privacy."
]
},
{
"cell_type": "code",
"metadata": {
"id": "RseeuA7veIHU",
"colab_type": "code",
"colab": {}
},
"source": [
"!pip install tensorflow_privacy\n",
"\n",
"from tensorflow_privacy.privacy.analysis import compute_dp_sgd_privacy\n",
"from tensorflow_privacy.privacy.optimizers.dp_optimizer import DPGradientDescentGaussianOptimizer"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "mU1p8N7M5Mmn",
"colab_type": "text"
},
"source": [
"## Load and pre-process the dataset\n",
"\n",
"Load the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset and prepare the data for training."
]
},
{
"cell_type": "code",
"metadata": {
"id": "_1ML23FlueTr",
"colab_type": "code",
"colab": {}
},
"source": [
"train, test = tf.keras.datasets.mnist.load_data()\n",
"train_data, train_labels = train\n",
"test_data, test_labels = test\n",
"\n",
"train_data = np.array(train_data, dtype=np.float32) / 255\n",
"test_data = np.array(test_data, dtype=np.float32) / 255\n",
"\n",
"train_data = train_data.reshape(train_data.shape[0], 28, 28, 1)\n",
"test_data = test_data.reshape(test_data.shape[0], 28, 28, 1)\n",
"\n",
"train_labels = np.array(train_labels, dtype=np.int32)\n",
"test_labels = np.array(test_labels, dtype=np.int32)\n",
"\n",
"train_labels = tf.keras.utils.to_categorical(train_labels, num_classes=10)\n",
"test_labels = tf.keras.utils.to_categorical(test_labels, num_classes=10)\n",
"\n",
"assert train_data.min() == 0.\n",
"assert train_data.max() == 1.\n",
"assert test_data.min() == 0.\n",
"assert test_data.max() == 1."
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "xVDcswOCtlr3",
"colab_type": "text"
},
"source": [
"## Define and tune learning model hyperparameters\n",
"Set learning model hyperparamter values. \n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "E14tL1vUuTRV",
"colab_type": "code",
"colab": {}
},
"source": [
"epochs = 15\n",
"batch_size = 250"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "qXNp_25y7JP2",
"colab_type": "text"
},
"source": [
"DP-SGD has three privacy-specific hyperparameters and one existing hyperamater that you must tune:\n",
"\n",
"1. `l2_norm_clip` (float) - The maximum Euclidean (L2) norm of each individual gradient that is computed on an individual training example from a minibatch. This hyperparameter is used to bound the optimizer's sensitivity to individual training points. \n",
"2. `noise_multiplier` (float) - The amount of noise sampled and added to gradients during training. Generally, more noise results in better privacy (often, but not necessarily, at the expense of lower utility).\n",
"3. `microbatches` (int) - The input data for each step (i.e., batch) of your original training algorithm is split into this many microbatches. Generally, increasing this will improve your utility but increase your overall training time. The total number of examples consumed in one global step remains the same. Your input batch size should be an integer multiple of the number of microbatches.\n",
"4. `learning_rate` (float) - This hyperparameter already exists in vanilla SGD. The higher the learning rate, the more each update matters. If the updates are noisy (such as when the additive noise is large compared to the clipping threshold), the learning rate must be kept low for the training procedure to converge. \n",
"\n",
"Use the hyperparameter values below to obtain a reasonably accurate model (95% test accuracy):"
]
},
{
"cell_type": "code",
"metadata": {
"id": "pVw_r2Mq7ntd",
"colab_type": "code",
"colab": {}
},
"source": [
"l2_norm_clip = 1.5\n",
"noise_multiplier = 1.3\n",
"num_microbatches = 250\n",
"learning_rate = 0.25\n",
"\n",
"if batch_size % num_microbatches != 0:\n",
" raise ValueError('Batch size should be an integer multiple of the number of microbatches')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wXAmHcNOmHc5",
"colab_type": "text"
},
"source": [
"## Build the learning model\n",
"\n",
"Define a convolutional neural network as the learning model. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "oCOo8aOLmFta",
"colab_type": "code",
"colab": {}
},
"source": [
"model = tf.keras.Sequential([\n",
" tf.keras.layers.Conv2D(16, 8,\n",
" strides=2,\n",
" padding='same',\n",
" activation='relu',\n",
" input_shape=(28, 28, 1)),\n",
" tf.keras.layers.MaxPool2D(2, 1),\n",
" tf.keras.layers.Conv2D(32, 4,\n",
" strides=2,\n",
" padding='valid',\n",
" activation='relu'),\n",
" tf.keras.layers.MaxPool2D(2, 1),\n",
" tf.keras.layers.Flatten(),\n",
" tf.keras.layers.Dense(32, activation='relu'),\n",
" tf.keras.layers.Dense(10, activation='softmax')\n",
"])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "FT4lByFg-I_r",
"colab_type": "text"
},
"source": [
"Define the optimizer and loss function for the learning model. Compute the loss as a vector of losses per-example rather than as the mean over a minibatch to support gradient manipulation over each training point. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "bqBvjCf5-ZXy",
"colab_type": "code",
"colab": {}
},
"source": [
"optimizer = DPGradientDescentGaussianOptimizer(\n",
" l2_norm_clip=l2_norm_clip,\n",
" noise_multiplier=noise_multiplier,\n",
" num_microbatches=num_microbatches,\n",
" learning_rate=learning_rate)\n",
"\n",
"loss = tf.keras.losses.CategoricalCrossentropy(\n",
" from_logits=True, reduction=tf.losses.Reduction.NONE)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "LI_3nXzEGmrP",
"colab_type": "text"
},
"source": [
"## Compile and train the learning model\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "z4iV03VqG1Bo",
"colab_type": "code",
"colab": {}
},
"source": [
"model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])\n",
"\n",
"model.fit(train_data, train_labels,\n",
" epochs=epochs,\n",
" validation_data=(test_data, test_labels),\n",
" batch_size=batch_size)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "TL7_lX5sHCTI",
"colab_type": "text"
},
"source": [
"## Measure the differential privacy guarantee\n",
"\n",
"Perform a privacy analysis to measure the DP guarantee achieved by an ML algorithm. Knowing the level of DP achieved enables the objective comparison of two models to determine which of the two is more privacy-preserving. At a high level, a privacy analysis measures how including or excluding any particular point in the training dataset is likely to change the probability that the ML algorithm learns any particular set of parameters. \n",
"\n",
"This probability is sometimes called the **privacy budget**. A lower privacy budget ensures a stronger privacy guarantee. Intuitively, this is because if a single training point does not affect the outcome of learning, the information contained in the training point cannot be memorized by the ML algorithm, and the privacy of the individual who contributed this training point to the dataset is preserved.\n",
"\n",
"In this tutorial, the privacy analysis is performed in the framework of Rényi Differential Privacy (RDP), which is a generalization of pure DP based on [this paper](https://arxiv.org/abs/1702.07476) that is particularly well suited for DP-SGD.\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wUEk25pgmnm-",
"colab_type": "text"
},
"source": [
"Two metrics are used to express the DP guarantee of an ML algorithm:\n",
"\n",
"1. Delta ($\\delta$) - Bounds the probability of the privacy guarantee not holding. A rule of thumb is to set it to be less than the inverse of the size of the training dataset. In this tutorial, it is set to **10^-5** as the MNIST dataset has 60,000 training points.\n",
"2. Epsilon ($\\epsilon$) - Measures the strength of the privacy guarantee by bounding how much the probability of a particular model output can vary by including (or excluding) a single training point. A smaller value for $\\epsilon$ implies a better privacy guarantee. However, the $\\epsilon$ value is only an upper bound and a large value could still mean good practical privacy.\n",
"\n",
"Tensorflow Privacy provides a tool, `compute_dp_sgd_privacy.py`, to compute the value of $\\epsilon$ given a fixed value of $\\delta$ and the following hyperparameters from the training process:\n",
"\n",
"1. The total number of points in the training data, `n`.\n",
"2. The `batch_size`.\n",
"3. The `noise_multiplier`.\n",
"4. The number of `epochs` of training.\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ws8-nVuVDgtJ",
"colab_type": "code",
"colab": {}
},
"source": [
"compute_dp_sgd_privacy.compute_dp_sgd_privacy(n=60000, batch_size=250, noise_multiplier=1.3, epochs=15, delta=1e-5)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "c-KyttEWFRDc",
"colab_type": "text"
},
"source": [
"The tool reports that for the hyperparameters chosen above, the trained model has an $\\epsilon$ value of 1.18."
]
},
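{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see how the noise multiplier affects the privacy budget, you can re-run the analysis for several values while keeping the other hyperparameters fixed. The loop below is one possible sweep (the specific values are illustrative); larger noise multipliers yield smaller values of $\\epsilon$, i.e., stronger privacy guarantees, typically at the cost of some utility."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# Recompute epsilon for several noise multipliers, keeping the other\n",
"# training hyperparameters from this tutorial fixed.\n",
"for nm in [0.7, 1.0, 1.3, 2.0]:\n",
"  print('noise_multiplier =', nm)\n",
"  compute_dp_sgd_privacy.compute_dp_sgd_privacy(\n",
"      n=60000, batch_size=250, noise_multiplier=nm, epochs=15, delta=1e-5)"
],
"execution_count": 0,
"outputs": []
},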
{
"cell_type": "markdown",
"metadata": {
"id": "SA_9HMGBWFM3",
"colab_type": "text"
},
"source": [
"## Summary\n",
"In this tutorial, you learned about differential privacy (DP) and how you can implement DP principles in existing ML algorithms to provide privacy guarantees for training data. In particular, you learned how to:\n",
"* Wrap existing optimizers (e.g., SGD, Adam) into their differentially private counterparts using TensorFlow Privacy\n",
"* Tune hyperparameters introduced by differentially private machine learning\n",
"* Measure the privacy guarantee provided using analysis tools included in TensorFlow Privacy"
]
}
]
}