Open sourcing membership inference attack.

PiperOrigin-RevId: 317958055
2020-06-23 16:11:40 -07:00 · 2020-06-23 16:11:40 -07:00 · 88dd8771bf
commit 88dd8771bf
parent 1fb9b80d90
9 changed files with 2000 additions and 0 deletions
--- a/tensorflow_privacy/privacy/membership_inference_attack/README.md
+++ b/tensorflow_privacy/privacy/membership_inference_attack/README.md
@ -0,0 +1,238 @@
 # Membership inference attack functionality
 The goal is to provide empirical tests of "how much information a machine
 learning model has remembered about its training data". To this end, only the
 outputs of the model are used (e.g., losses, logits, predictions). From those
 alone, the attacks try to infer whether the corresponding inputs were part of
 the training set.
 > NOTE: Only the loss values are needed for some examples used during training
 > and some examples that have not been used during training (e.g., some examples
 > from the test set). No access to actual input data is needed. In case of
 > classification models, one can additionally (or instead of losses) provide
 > logits or output probabilities for stronger attacks.
 The vulnerability of a model is measured via the area under the ROC-curve
 (`auc`) or via max{|fpr - tpr|} (`advantage`) of the attack classifier. These
 measures are very closely related.
 ## Highest level -- get attack summary
 ### Basic usage
 On the highest level, there is the `run_all_attacks_and_create_summary`
 function, which chooses sane default options to run a host of (fairly simple)
 attacks behind the scenes (depending on which data is fed in), computes the most
 important measures and returns a summary of the results as a string of english
 language (as well as optionally a python dictionary containing all results with
 descriptive keys).
 > NOTE: The train and test sets are balanced internally, i.e., an equal number
 > of in-training and out-of-training examples is chosen for the attacks
 > (whichever has fewer examples). These are subsampled uniformly at random
 > without replacement from the larger of the two.
 The simplest possible usage is
 ```python
 from tensorflow_privacy.privacy.membership_inference_attack import membership_inference_attack as mia
 # Evaluate your model on training and test examples to get
 # loss_train  shape: (n_train, )
 # loss_test  shape: (n_test, )
 summary, results = mia.run_all_attacks_and_create_summary(loss_train, loss_test, return_dict=True)
 print(results)
 # -> {'auc': 0.7044,
 #     'best_attacker_auc': 'all_thresh_loss_auc',
 #     'advantage': 0.3116,
 #     'best_attacker_auc': 'all_thresh_loss_advantage'}
 ```
 > NOTE: The keyword argument `return_dict` specified whether in addition to the
 > `summary` the function also returns a python dictionary with the results.
 If the model is a classifier, the logits or output probabilities (i.e., the
 softmax of logits) can also be provided to perform stronger attacks.
 > NOTE: The `logits_train` and `logits_test` arguments can also be filled with
 > output probabilities per class ("posteriors").
 ```python
 # logits_train  shape: (n_train, n_classes)
 # logits_test  shape: (n_test, n_classes)
 summary, results = mia.run_all_attacks_and_create_summary(loss_train, loss_test, logits_train,
                                      logits_test, return_dict=True)
 print(results)
 # -> {'auc': 0.5382,
 #     'best_attacker_auc': 'all_lr_logits_loss_test_auc',
 #     'advantage': 0.0572,
 #     'best_attacker_auc': 'all_mlp_logits_loss_test_advantage'}
 ```
 The `summary` will be a string in natural language describing the results in
 more detail, e.g.,
 ```
 ========== AUC ==========
 The best attack (all_lr_logits_loss_test_auc) achieved an auc of 0.5382.
 ========== ADVANTAGE ==========
 The best attack (all_mlp_logits_loss_test_advantage) achieved an advantage of 0.0572.
 ```
 Similarly, we can run attacks on the logits alone, without access to losses:
 ```python
 summary, results = mia.run_all_attacks_and_create_summary(logits_train=logits_train,
                                      logits_test=logits_test,
                                      return_dict=True)
 print(results)
 # -> {'auc': 0.9278,
 #     'best_attacker_auc': 'all_rf_logits_test_auc',
 #     'advantage': 0.6991,
 #     'best_attacker_auc': 'all_rf_logits_test_advantage'}
 ```
 ### Advanced usage
 Finally, if we also have access to the true labels of the training and test
 inputs, we can run the attacks for each class separately. If labels *and* logits
 are provided, attacks only for misclassified (typically uncertain) examples are
 also performed.
 ```python
 summary, results = mia.run_all_attacks_and_create_summary(loss_train, loss_test, logits_train,
                                      logits_test, labels_train, labels_test,
                                      return_dict=True)
 ```
 Here, we now also get as output the class with the maximal vulnerability
 according to our metrics (`max_vuln_class_auc`, `max_vuln_class_advantage`)
 together with the corresponding values (`class_<CLASS>_auc`,
 `class_<CLASS>_advantage`). The same values exist in the `results` dictionary
 for `min` instead of `max`, i.e., the least vulnerable classes. Moreover, the
 gap between the maximum and minimum values (`max_class_gap_auc`,
 `max_class_gap_advantage`) is also provided. Similarly, the vulnerability
 metrics when the attacks are restricted to the misclassified examples
 (`misclassified_auc`, `misclassified_advantage`) are also shown. Finally, the
 results also contain the number of examples in each of these groups, i.e.,
 within each of the reported classes as well as the number of misclassified
 examples. The final `results` dictionary is of the form
 ```
 {'auc': 0.9181,
 'best_attacker_auc': 'all_rf_logits_loss_test_auc',
 'advantage': 0.6915,
 'best_attacker_advantage': 'all_rf_logits_loss_test_advantage',
 'max_class_gap_auc': 0.254,
 'class_5_auc': 0.9512,
 'class_3_auc': 0.6972,
 'max_vuln_class_auc': 5,
 'min_vuln_class_auc': 3,
 'max_class_gap_advantage': 0.5073,
 'class_0_advantage': 0.8086,
 'class_3_advantage': 0.3013,
 'max_vuln_class_advantage': 0,
 'min_vuln_class_advantage': 3,
 'misclassified_n_examples': 4513.0,
 'class_0_n_examples': 899.0,
 'class_1_n_examples': 900.0,
 'class_2_n_examples': 931.0,
 'class_3_n_examples': 893.0,
 'class_4_n_examples': 960.0,
 'class_5_n_examples': 884.0}
 ```
 ### Setting the precision of the reported results
 Finally, `run_all_attacks_and_create_summary` takes one extra keyword argument
 `decimals`, expecting a positive integer. This sets the precision of all result
 values as the number of decimals to report. It defaults to 4.
 ## Run all attacks and get all outputs
 With the `run_all_attacks` function, one can run all implemented attacks on all
 possible subsets of the data (all examples, split by class, split by confidence
 deciles, misclassified only). This function returns a relatively large
 dictionary with all attack results. This is the most detailed information one
 could get about these types of membership inference attacks (besides plots for
 each attack, see next section.) This is useful if you know exactly what you're
 looking for.
 > NOTE: The `run_all_attacks` function takes as an additional argument which
 > trained attackers to run. In the `run_all_attacks_and_create_summary`, only
 > logistic regression (`lr`) is trained as a binary classifier to distinguish
 > in-training form out-of-training examples. In addition, with the
 > `attack_classifiers` argument, one can add multi-layered perceptrons (`mlp`),
 > random forests (`rf`), and k-nearest-neighbors (`knn`) or any subset thereof
 > for the attack models. Note that these classifiers may not converge.
 ```python
 mia.run_all_attacks(loss_train, loss_test, logits_train, logits_test,
                labels_train, labels_test,
                attack_classifiers=('lr', 'mlp', 'rf', 'knn'))
 ```
 Again, `run_all_attacks` can be called on all combinations of losses, logits,
 probabilities, and labels as long as at least either losses or logits
 (probabilities) are provided.
 ## Fine grained control over individual attacks and plots
 The `run_attack` function exposes the underlying workhorse of the
 `run_all_attacks` and `run_all_attacks_and_create_summary` functionality. It
 allows for fine grained control of which attacks to run individually.
 As another key feature, this function also exposes options to store receiver
 operator curve plots for the different attacks as well as histograms of losses
 or the maximum logits/probabilities. Finally, we can also store all results
 (including the values to reproduce the plots) to colossus.
 All options are explained in detail in the doc string of the `run_attack`
 function.
 For example, to run a simple threshold attack on the losses only and store plots
 and result data to colossus, run
 ```python
 data_path = '/Users/user/Desktop/test/'  # set to None to not store data
 figure_path = '/Users/user/Desktop/test/' # set to None to not store figures
 mia.attack(loss_train=loss_train,
           loss_test=loss_test,
           metric='auc',
           output_directory=data_path,
           figure_directory=figure_path)
 ```
 Among other things, the `run_attack` functionality allows to control:
 *   which metrics to output (`metric` argument, using `auc` or `advantage` or
    both)
 *   which classifiers (logistic regression, multi-layered perceptrons, random
    forests) to train as attackers beyond the simple threshold attacks
    (`attack_classifiers`)
 *   to only attack a specific (set of) classes (`by_class`)
 *   to only attack specific percentiles of the data (`by_percentile`).
    Percentiles here are computed by looking at the largest logit or probability
    for each example, i.e., how confident the model is in its prediction.
 *   to only attack the misclassified examples (`only_misclassified`)
 *   not to balance examples between the in-training and out-of-training examples
    using `balance`. By default an equal number of examples from train and test
    are selected for the attacks (whichever is smaller).
 *   the test set size for trained attacks (`test_size`). When a classifier is
    trained to distinguish between train and test examples, a train-test split
    for that classifier itself is required.
 *   for the train-test split as well as for the class balancing randomness is
    used with a seed specified by `random_state`.
 ## Contact
 Reach out to tf-privacy@google.com and let us know how you’re using this module.
 We’re keen on hearing your stories, feedback, and suggestions!
 ## Copyright
 Copyright 2020 - Google LLC
--- a/tensorflow_privacy/privacy/membership_inference_attack/init.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/init.py
@ -0,0 +1,13 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
--- a/tensorflow_privacy/privacy/membership_inference_attack/membership_inference_attack.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/membership_inference_attack.py
@ -0,0 +1,716 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 """Code that runs membership inference attacks based on the model outputs."""
 import collections
 import io
 import os
 import re
 from typing import Text, Dict, Iterable, Tuple, Union, Any
 from absl import logging
 import numpy as np
 from scipy import special
 from tensorflow_privacy.privacy.membership_inference_attack import plotting
 from tensorflow_privacy.privacy.membership_inference_attack import trained_attack_models
 from tensorflow_privacy.privacy.membership_inference_attack import utils
 from os import mkdir
 ArrayDict = Dict[Text, np.ndarray]
 FloatDict = Dict[Text, float]
 AnyDict = Dict[Text, Any]
 Dataset = Tuple[Tuple[np.ndarray, np.ndarray], Tuple[np.ndarray, np.ndarray]]
 MetricNames = Union[Text, Iterable[Text]]
 def _get_vulnerabilities(result: ArrayDict, metrics: MetricNames) -> FloatDict:
  """Gets the vulnerabilities according to the chosen metrics for all attacks."""
  vulns = {}
  if isinstance(metrics, str):
    metrics = [metrics]
  for k in result:
    for metric in metrics:
      if k.endswith(metric.lower()) or k.endswith('n_examples'):
        vulns[k] = float(result[k])
  return vulns
 def _get_maximum_vulnerability(
    attack_result: FloatDict,
    metrics: MetricNames,
    filterby: Text = '') -> Dict[Text, Dict[Text, Union[Text, float]]]:
  """Returns the worst vulnerability according to the chosen metrics of all attacks."""
  vulns = {}
  if isinstance(metrics, str):
    metrics = [metrics]
  for metric in metrics:
    best_attack_value = -np.inf
    for k in attack_result:
      if (k.startswith(filterby.lower()) and k.endswith(metric.lower()) and
          'train' not in k):
        if float(attack_result[k]) > best_attack_value:
          best_attack_value = attack_result[k]
          best_attacker = k
    if best_attack_value > -np.inf:
      newkey = filterby + '-' + metric if filterby else metric
      vulns[newkey] = {'value': best_attack_value, 'attacker': best_attacker}
  return vulns
 def _get_maximum_class_gap_or_none(result: FloatDict,
                                   metrics: MetricNames) -> FloatDict:
  """Returns the biggest and smallest vulnerability and the gap across classes."""
  gaps = {}
  if isinstance(metrics, str):
    metrics = [metrics]
  for metric in metrics:
    hi = -np.inf
    lo = np.inf
    hi_idx, lo_idx = -1, -1
    for k in result:
      if (k.startswith('class') and k.endswith(metric.lower()) and
          'train' not in k):
        if float(result[k]) > hi:
          hi = float(result[k])
          hi_idx = int(re.findall(r'class_(\d+)_', k)[0])
        if float(result[k]) < lo:
          lo = float(result[k])
          lo_idx = int(re.findall(r'class_(\d+)_', k)[0])
    if lo - hi < np.inf:
      gaps['max_class_gap_' + metric] = hi - lo
      gaps[f'class_{hi_idx}_' + metric] = hi
      gaps[f'class_{lo_idx}_' + metric] = lo
      gaps['max_vuln_class_' + metric] = hi_idx
      gaps['min_vuln_class_' + metric] = lo_idx
  return gaps
 # ------------------------------------------------------------------------------
 #  Attacks
 # ------------------------------------------------------------------------------
 def _run_threshold_loss_attack(features: ArrayDict,
                               figure_file_prefix: Text = '',
                               figure_directory: Text = None) -> ArrayDict:
  """Runs the threshold attack on the loss."""
  logging.info('Run threshold attack on loss...')
  is_train = features['is_train']
  attack_prefix = 'thresh_loss'
  tmp_results = utils.compute_performance_metrics(is_train, -features['loss'])
  if figure_directory is not None:
    figpath = os.path.join(figure_directory,
                           figure_file_prefix + attack_prefix + '.png')
    plotting.save_plot(
        plotting.plot_curve_with_area(
            tmp_results['fpr'], tmp_results['tpr'], xlabel='FPR', ylabel='TPR'),
        figpath)
    figpath = os.path.join(figure_directory,
                           figure_file_prefix + attack_prefix + '_hist.png')
    plotting.save_plot(
        plotting.plot_histograms(
            features['loss'][is_train == 1],
            features['loss'][is_train == 0],
            xlabel='loss'), figpath)
  return utils.prepend_to_keys(tmp_results, attack_prefix + '_')
 def _run_threshold_attack_maxlogit(features: ArrayDict,
                                   figure_file_prefix: Text = '',
                                   figure_directory: Text = None) -> ArrayDict:
  """Runs the threshold attack on the maximum logit."""
  is_train = features['is_train']
  preds = np.max(features['logits'], axis=-1)
  tmp_results = utils.compute_performance_metrics(is_train, preds)
  attack_prefix = 'thresh_maxlogit'
  if figure_directory is not None:
    figpath = os.path.join(figure_directory,
                           figure_file_prefix + attack_prefix + '.png')
    plotting.save_plot(
        plotting.plot_curve_with_area(
            tmp_results['fpr'], tmp_results['tpr'], xlabel='FPR', ylabel='TPR'),
        figpath)
    figpath = os.path.join(figure_directory,
                           figure_file_prefix + attack_prefix + '_hist.png')
    plotting.save_plot(
        plotting.plot_histograms(
            preds[is_train == 1], preds[is_train == 0], xlabel='loss'), figpath)
  return utils.prepend_to_keys(tmp_results, attack_prefix + '_')
 def _run_trained_attack(attack_classifier: Text,
                        data: Dataset,
                        attack_prefix: Text,
                        figure_file_prefix: Text = '',
                        figure_directory: Text = None) -> ArrayDict:
  """Train a classifier for attack and evaluate it."""
  # Train the attack classifier
  (x_train, y_train), (x_test, y_test) = data
  clf_model = trained_attack_models.choose_model(attack_classifier)
  clf_model.fit(x_train, y_train)
  # Calculate training set metrics
  pred_train = clf_model.predict_proba(x_train)[:, clf_model.classes_ == 1]
  results = utils.prepend_to_keys(
      utils.compute_performance_metrics(y_train, pred_train),
      attack_prefix + 'train_')
  # Calculate test set metrics
  pred_test = clf_model.predict_proba(x_test)[:, clf_model.classes_ == 1]
  results.update(
      utils.prepend_to_keys(
          utils.compute_performance_metrics(y_test, pred_test),
          attack_prefix + 'test_'))
  if figure_directory is not None:
    figpath = os.path.join(figure_directory,
                           figure_file_prefix + attack_prefix[:-1] + '.png')
    plotting.save_plot(
        plotting.plot_curve_with_area(
            results[attack_prefix + 'test_fpr'],
            results[attack_prefix + 'test_tpr'],
            xlabel='FPR',
            ylabel='TPR'), figpath)
  return results
 def _run_attacks_and_plot(features: ArrayDict,
                          attacks: Iterable[Text],
                          attack_classifiers: Iterable[Text],
                          balance: bool,
                          test_size: float,
                          random_state: int,
                          figure_file_prefix: Text = '',
                          figure_directory: Text = None) -> ArrayDict:
  """Runs the specified attacks on the provided data."""
  if balance:
    try:
      features = utils.subsample_to_balance(features, random_state)
    except RuntimeError:
      logging.info('Not enough remaining data for attack: Empty results.')
      return {}
  result = {}
  # -------------------- Simple threshold attacks
  if 'thresh_loss' in attacks:
    result.update(
        _run_threshold_loss_attack(features, figure_file_prefix,
                                   figure_directory))
  if 'thresh_maxlogit' in attacks:
    result.update(
        _run_threshold_attack_maxlogit(features, figure_file_prefix,
                                       figure_directory))
  # -------------------- Run learned attacks
  # TODO(b/157632603): Add a prefix (for example 'trained_') for attacks which
  # use classifiers to distinguish from threshould attacks.
  if 'logits' in attacks:
    data = utils.get_train_test_split(
        features, add_loss=False, test_size=test_size)
    for clf in attack_classifiers:
      logging.info('Train %s on %d logits', clf, data[0][0].shape[1])
      attack_prefix = f'{clf}_logits_'
      result.update(
          _run_trained_attack(clf, data, attack_prefix, figure_file_prefix,
                              figure_directory))
    if 'logits_loss' in attacks:
      data = utils.get_train_test_split(
          features, add_loss=True, test_size=test_size)
      for clf in attack_classifiers:
        logging.info('Train %s on %d logits + loss', clf, data[0][0].shape[1])
        attack_prefix = f'{clf}_logits_loss_'
        result.update(
            _run_trained_attack(clf, data, attack_prefix, figure_file_prefix,
                                figure_directory))
  return result
 def run_attack(loss_train: np.ndarray = None,
               loss_test: np.ndarray = None,
               logits_train: np.ndarray = None,
               logits_test: np.ndarray = None,
               labels_train: np.ndarray = None,
               labels_test: np.ndarray = None,
               attack_classifiers: Iterable[Text] = None,
               only_misclassified: bool = False,
               by_class: Union[bool, Iterable[int], int] = False,
               by_percentile: Union[bool, Iterable[int], int] = False,
               figure_directory: Text = None,
               output_directory: Text = None,
               metric: MetricNames = 'auc',
               balance: bool = True,
               test_size: float = 0.2,
               random_state: int = 0) -> FloatDict:
  """Run membership inference attack(s).
  Based only on specific outputs of a machine learning model on some examples
  used for training (train) and some examples not used for training (test), run
  membership inference attacks that try to discriminate training from test
  inputs based only on the model outputs.
  While all inputs are optional, at least one train/test pair is required to run
  any attacks (either losses or logits/probabilities).
  Note that one can equally provide output probabilities instead of logits in
  the logits_train / logits_test arguments.
  We measure the vulnerability of the model via the area under the ROC-curve
  (auc) or via max |fpr - tpr| (advantage) of the attack classifier. These
  measures are very closely related and may look almost indistinguishable.
  This function provides relatively fine grained control and outputs detailed
  results. For a higher-level wrapper with sane internal default settings and
  distilled output results, see `run_all_attacks`.
  Via the `figure_directory` argument and the `output_directory` argument more
  detailed information as well as roc-curve plots can optionally be stored to
  disk.
  If `loss_train` and `loss_test` are provided we run:
    - simple threshold attack on the loss
  If `logits_train` and `logits_test` are provided we run:
    - simple threshold attack on the top logit
    - if `attack_classifiers` is not None and no losses are provided: train the
       specified classifiers on the top 10 logits (or all logits if there are
       less than 10)
    - if `attack_classifiers` is not None and losses are provided: train the
       specified classifiers on the top 10 logits (or all logits if there are
       less than 10) and the loss
  Args:
    loss_train: A 1D array containing the individual scalar losses for examples
      used during training.
    loss_test: A 1D array containing the individual scalar losses for examples
      not used during training.
    logits_train: A 2D array (n_train, n_classes) of the individual logits or
      output probabilities of examples used during training.
    logits_test: A 2D array (n_test, n_classes) of the individual logits or
      output probabilities of examples not used during training.
    labels_train: The true labels of the training examples. Labels are only
      needed when `by_class` is specified (i.e., not False).
    labels_test: The true labels of the test examples. Labels are only needed
      when `by_class` is specified (i.e., not False).
    attack_classifiers: Attack classifiers to train beyond simple thresholding
      that require training a simple binary ML classifier. This argument is
      ignored if logits are not provided. Classifiers can be 'lr' for logistic
      regression, 'mlp' for multi-layered perceptron, 'rf' for random forests,
      or 'knn' for k-nearest-neighbors. If 'None', don't train classifiers
      beyond simple thresholding.
    only_misclassified: Run and evaluate attacks only on misclassified examples.
      Must specify `labels_train`, `labels_test`, `logits_train` and
      `logits_test` to use this. If this is True, `by_class` and `by_percentile`
      are ignored.
    by_class: This argument determines whether attacks are run on the entire
      data, or on examples grouped by their class label. If `True`, all attacks
      are run separately for each class. If `by_class` is a single integer, run
      attacks for this class only. If `by_class` is an iterable of integers, run
      all attacks for each of the specified class labels separately. Only used
      if `labels_train` and `labels_test` are specified. If `by_class` is
      specified (not False), `by_percentile` is ignored. Ignored if
      `only_misclassified` is True.
    by_percentile: This argument determines whether attacks are run on the
      entire data, or separately for examples where the most likely class
      prediction is within a given percentile of all maximum predicitons. If
      `True`, all attacks are run separately for the examples with max
      probabilities within the ten deciles. If `by_precentile` is a single int
      between 0 and 100, run attacks only for examples with confidence within
      this percentile. If `by_percentile` is an iterable of ints between 0 and
      100, run all attacks for each of the specified percentiles separately.
      Ignored if `by_class` is specified. Ignored if `logits_train` and
      `logits_test` are not specified. Ignored if `only_misclassified` is True.
    figure_directory: Where to store ROC-curve plots and histograms. If `None`,
      don't create plots.
    output_directory: Where to store detailed result data for all run attacks.
      If `None`, don't store detailed result data.
    metric: Available vulnerability metrics are 'auc' or 'advantage' for the
      area under the ROC curve or the advantage (max |tpr - fpr|). Specify
      either one of them or both.
    balance: Whether to use the same number of train and test samples (by
      randomly subsampling whichever happens to be larger).
    test_size: The fraction of the input data to use for the evaluation of
      trained ML attacks. This argument is ignored, if either attack_classifiers
      is None, or no logits are provided.
    random_state: Random seed for reproducibility. Only used if attack models
      are trained.
  Returns:
    results: Dictionary with the chosen vulnerability metric(s) for all ran
      attacks.
  """
  attacks = []
  features = {}
  # ---------- Check available data ----------
  if ((loss_train is None or loss_test is None) and
      (logits_train is None or logits_test is None)):
    raise ValueError(
        'Need at least train and test for loss or train and test for logits.')
  # ---------- If losses are provided ----------
  if loss_train is not None and loss_test is not None:
    if loss_train.ndim != 1 or loss_test.ndim != 1:
      raise ValueError('Losses must be 1D arrays.')
    features['is_train'] = np.concatenate(
        (np.ones(len(loss_train)), np.zeros(len(loss_test))),
        axis=0).astype(int)
    features['loss'] = np.concatenate((loss_train.ravel(), loss_test.ravel()),
                                      axis=0)
    attacks.append('thresh_loss')
  # ---------- If logits are provided ----------
  if logits_train is not None and logits_test is not None:
    assert logits_train.ndim == 2 and  logits_test.ndim == 2, \
        'Logits must be 2D arrays.'
    assert logits_train.shape[1] == logits_test.shape[1], \
        'Train and test logits must agree along axis 1 (number of classes).'
    if 'is_train' in features:
      assert (loss_train.shape[0] == logits_train.shape[0] and
              loss_test.shape[0] == logits_test.shape[0]), \
          'Number of examples must match between loss and logits.'
    else:
      features['is_train'] = np.concatenate(
          (np.ones(logits_train.shape[0]), np.zeros(logits_test.shape[0])),
          axis=0).astype(int)
    attacks.append('thresh_maxlogit')
    features['logits'] = np.concatenate((logits_train, logits_test), axis=0)
    if attack_classifiers:
      attacks.append('logits')
      if 'loss' in features:
        attacks.append('logits_loss')
  # ---------- If labels are provided ----------
  if labels_train is not None and labels_test is not None:
    if labels_train.ndim != 1 or labels_test.ndim != 1:
      raise ValueError('Losses must be 1D arrays.')
    if 'loss' in features:
      assert (loss_train.shape[0] == labels_train.shape[0] and
              loss_test.shape[0] == labels_test.shape[0]), \
          'Number of examples must match between loss and labels.'
    else:
      assert (logits_train.shape[0] == labels_train.shape[0] and
              logits_test.shape[0] == labels_test.shape[0]), \
          'Number of examples must match between logits and labels.'
    features['label'] = np.concatenate((labels_train, labels_test), axis=0)
  # ---------- Data subsampling or filtering ----------
  filtertype = None
  filtervals = [None]
  if only_misclassified:
    if (labels_train is None or labels_test is None or logits_train is None or
        logits_test is None):
      raise ValueError('Must specify labels_train, labels_test, logits_train, '
                       'and logits_test for the only_misclassified option.')
    filtertype = 'misclassified'
  elif by_class:
    if labels_train is None or labels_test is None:
      raise ValueError('Must specify labels_train and labels_test when using '
                       'the by_class option.')
    if isinstance(by_class, bool):
      filtervals = list(set(labels_train) | set(labels_test))
    elif isinstance(by_class, int):
      filtervals = [by_class]
    elif isinstance(by_class, collections.Iterable):
      filtervals = list(by_class)
    filtertype = 'class'
  elif by_percentile:
    if logits_train is None or logits_test is None:
      raise ValueError('Must specify logits_train and logits_test when using '
                       'the by_percentile option.')
    if isinstance(by_percentile, bool):
      filtervals = list(range(10, 101, 10))
    elif isinstance(by_percentile, int):
      filtervals = [by_percentile]
    elif isinstance(by_percentile, collections.Iterable):
      filtervals = [int(percentile) for percentile in by_percentile]
    filtertype = 'percentile'
  # ---------- Need to create figure directory? ----------
  if figure_directory is not None:
    mkdir(figure_directory)
  # ---------- Actually run attacks and plot if required ----------
  logging.info('Selecting %s with values %s', filtertype, filtervals)
  num = None
  result = {}
  for filterval in filtervals:
    if filtertype is None:
      tmp_features = features
    elif filtertype == 'misclassified':
      idx = features['label'] != np.argmax(features['logits'], axis=-1)
      tmp_features = utils.select_indices(features, idx)
      num = np.sum(idx)
    elif filtertype == 'class':
      idx = features['label'] == filterval
      tmp_features = utils.select_indices(features, idx)
      num = np.sum(idx)
    elif filtertype == 'percentile':
      certainty = np.max(special.softmax(features['logits'], axis=-1), axis=-1)
      idx = certainty <= np.percentile(certainty, filterval)
      tmp_features = utils.select_indices(features, idx)
    prefix = f'{filtertype}_' if filtertype is not None else ''
    prefix += f'{filterval}_' if filterval is not None else ''
    tmp_result = _run_attacks_and_plot(tmp_features, attacks,
                                       attack_classifiers, balance, test_size,
                                       random_state, prefix, figure_directory)
    if num is not None:
      tmp_result['n_examples'] = float(num)
    if tmp_result:
      result.update(utils.prepend_to_keys(tmp_result, prefix))
  # ---------- Store data ----------
  if output_directory is not None:
    mkdir(output_directory)
    resultpath = os.path.join(output_directory, 'attack_results.npz')
    logging.info('Store aggregate results at %s.', resultpath)
    with open(resultpath, 'wb') as fp:
      io_buffer = io.BytesIO()
      np.savez(io_buffer, **result)
      fp.write(io_buffer.getvalue())
  return _get_vulnerabilities(result, metric)
 def run_all_attacks(loss_train: np.ndarray = None,
                    loss_test: np.ndarray = None,
                    logits_train: np.ndarray = None,
                    logits_test: np.ndarray = None,
                    labels_train: np.ndarray = None,
                    labels_test: np.ndarray = None,
                    attack_classifiers: Iterable[Text] = ('lr', 'mlp', 'rf',
                                                          'knn'),
                    decimals: Union[int, None] = 4) -> FloatDict:
  """Runs all possible membership inference attacks.
  Check 'run_attack' for detailed information of how attacks are performed
  and evaluated.
  This function internally chooses sane default settings for all attacks and
  returns all possible output combinations.
  For fine grained control and partial attacks, please see `run_attack`.
  Args:
    loss_train: A 1D array containing the individual scalar losses for examples
      used during training.
    loss_test: A 1D array containing the individual scalar losses for examples
      not used during training.
    logits_train: A 2D array (n_train, n_classes) of the individual logits or
      output probabilities of examples used during training.
    logits_test: A 2D array (n_test, n_classes) of the individual logits or
      output probabilities of examples not used during training.
    labels_train: The true labels of the training examples. Labels are only
      needed when `by_class` is specified (i.e., not False).
    labels_test: The true labels of the test examples. Labels are only needed
      when `by_class` is specified (i.e., not False).
    attack_classifiers: Which binary classifiers to train (in addition to simple
      threshold attacks). This can include 'lr' (logistic regression), 'mlp'
      (multi-layered perceptron), 'rf' (random forests), 'knn' (k-nearest
      neighbors), which will be trained with cross validation to determine good
      hyperparameters.
    decimals: Round all float results to this number of decimals. If decimals is
      None, don't round.
  Returns:
    result: dictionary with all attack results
  """
  metrics = ['auc', 'advantage']
  # Entire data
  result = run_attack(
      loss_train,
      loss_test,
      logits_train,
      logits_test,
      attack_classifiers=attack_classifiers,
      metric=metrics)
  result = utils.prepend_to_keys(result, 'all_')
  # Misclassified examples
  if (labels_train is not None and labels_test is not None and
      logits_train is not None and logits_test is not None):
    result.update(
        run_attack(
            loss_train,
            loss_test,
            logits_train,
            logits_test,
            labels_train,
            labels_test,
            attack_classifiers=attack_classifiers,
            only_misclassified=True,
            metric=metrics))
  # Split per class
  if labels_train is not None and labels_test is not None:
    result.update(
        run_attack(
            loss_train,
            loss_test,
            logits_train,
            logits_test,
            labels_train,
            labels_test,
            by_class=True,
            attack_classifiers=attack_classifiers,
            metric=metrics))
  # Different deciles
  if logits_train is not None and logits_test is not None:
    result.update(
        run_attack(
            loss_train,
            loss_test,
            logits_train,
            logits_test,
            by_percentile=True,
            attack_classifiers=attack_classifiers,
            metric=metrics))
  if decimals is not None:
    result = {k: round(v, decimals) for k, v in result.items()}
  return result
 def run_all_attacks_and_create_summary(
    loss_train: np.ndarray = None,
    loss_test: np.ndarray = None,
    logits_train: np.ndarray = None,
    logits_test: np.ndarray = None,
    labels_train: np.ndarray = None,
    labels_test: np.ndarray = None,
    return_dict: bool = True,
    decimals: Union[int, None] = 4) -> Union[Text, Tuple[Text, AnyDict]]:
  """Runs all possible membership inference attack(s) and distill results.
  Check 'run_attack' for detailed information of how attacks are performed
  and evaluated.
  This function internally chooses sane default settings for all attacks and
  returns all possible output combinations.
  For fine grained control and partial attacks, please see `run_attack`.
  Args:
    loss_train: A 1D array containing the individual scalar losses for examples
      used during training.
    loss_test: A 1D array containing the individual scalar losses for examples
      not used during training.
    logits_train: A 2D array (n_train, n_classes) of the individual logits or
      output probabilities of examples used during training.
    logits_test: A 2D array (n_test, n_classes) of the individual logits or
      output probabilities of examples not used during training.
    labels_train: The true labels of the training examples. Labels are only
      needed when `by_class` is specified (i.e., not False).
    labels_test: The true labels of the test examples. Labels are only needed
      when `by_class` is specified (i.e., not False).
    return_dict: Whether to also return a dictionary with the results summarized
      in the summary string.
    decimals: Round all float results to this number of decimals. If decimals is
      None, don't round.
  Returns:
    summarystring: A string with natural language summary of the attacks. In the
      summary string printed numbers will be rounded to `decimal` decimals if
      provided, otherwise will round to 3 diits by default for readability.
    result: a dictionary with all the distilled attack information summarized
      in the summarystring
  """
  summary = []
  metrics = ['auc', 'advantage']
  attack_classifiers = ['lr', 'rf', 'mlp', 'knn']
  results = run_all_attacks(
      loss_train,
      loss_test,
      logits_train,
      logits_test,
      labels_train,
      labels_test,
      attack_classifiers=attack_classifiers,
      decimals=None)
  output = _get_maximum_vulnerability(results, metrics, filterby='all')
  if decimals is not None:
    strdec = decimals
  else:
    strdec = 4
  for metric in metrics:
    summary.append(f'========== {metric.upper()} ==========')
    best_value = output['all-' + metric]['value']
    best_attacker = output['all-' + metric]['attacker']
    summary.append(f'The best attack ({best_attacker}) achieved an {metric} of '
                   f'{best_value:.{strdec}f}.')
    summary.append('')
  classgap = _get_maximum_class_gap_or_none(results, metrics)
  if classgap:
    output.update(classgap)
    for metric in metrics:
      summary.append(f'========== {metric.upper()} per class ==========')
      hi_idx = output[f'max_vuln_class_{metric}']
      lo_idx = output[f'min_vuln_class_{metric}']
      hi = output[f'class_{hi_idx}_{metric}']
      lo = output[f'class_{lo_idx}_{metric}']
      gap = output[f'max_class_gap_{metric}']
      summary.append(f'The most vulnerable class {hi_idx} has {metric} of '
                     f'{hi:.{strdec}f}.')
      summary.append(f'The least vulnerable class {lo_idx} has {metric} of '
                     f'{lo:.{strdec}f}.')
      summary.append(f'=> The maximum gap between class vulnerabilities is '
                     f'{gap:.{strdec}f}.')
      summary.append('')
  misclassified = _get_maximum_vulnerability(
      results, metrics, filterby='misclassified')
  if misclassified:
    for metric in metrics:
      best_attacker = misclassified['misclassified-' + metric]['attacker']
      summary.append(f'========== {metric.upper()} for misclassified '
                     '==========')
      summary.append('Among misclassified examples, the best attack '
                     f'({best_attacker}) achieved an {metric} of '
                     f'{best_value:.{strdec}f}.')
      summary.append('')
    output.update(misclassified)
  n_examples = {k: v for k, v in results.items() if k.endswith('n_examples')}
  if n_examples:
    output.update(n_examples)
  # Flatten remaining dicts in output
  fresh_output = {}
  for k, v in output.items():
    if isinstance(v, dict):
      if k.startswith('all'):
        fresh_output[k[4:]] = v['value']
        fresh_output['best_attacker_' + k[4:]] = v['attacker']
    else:
      fresh_output[k] = v
  output = fresh_output
  if decimals is not None:
    for k, v in output.items():
      if isinstance(v, float):
        output[k] = round(v, decimals)
  summary = '\n'.join(summary)
  if return_dict:
    return summary, output
  else:
    return summary
--- a/tensorflow_privacy/privacy/membership_inference_attack/membership_inference_attack_test.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/membership_inference_attack_test.py
@ -0,0 +1,307 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 """Tests for tensorflow_privacy.privacy.membership_inference_attack.utils."""
 from absl.testing import absltest
 import numpy as np
 from tensorflow_privacy.privacy.membership_inference_attack import membership_inference_attack as mia
 def get_result_dict():
  """Get an example result dictionary."""
  return {
      'test_n_examples': np.ones(1),
      'test_examples': np.zeros(1),
      'test_auc': np.ones(1),
      'test_advantage': np.ones(1),
      'all_0-metric': np.array([1]),
      'all_1-metric': np.array([2]),
      'test_2-metric': np.array([3]),
      'test_score': np.array([4]),
  }
 def get_test_inputs():
  """Get example inputs for attacks."""
  n_train = n_test = 500
  rng = np.random.RandomState(4)
  loss_train = rng.randn(n_train) - 0.4
  loss_test = rng.randn(n_test) + 0.4
  logits_train = rng.randn(n_train, 5) + 0.2
  logits_test = rng.randn(n_test, 5) - 0.2
  labels_train = np.array([i % 5 for i in range(n_train)])
  labels_test = np.array([(3 * i) % 5 for i in range(n_test)])
  return (loss_train, loss_test, logits_train, logits_test,
          labels_train, labels_test)
 class GetVulnerabilityTest(absltest.TestCase):
  def test_get_vulnerabilities(self):
    """Test extraction of vulnerability scores."""
    testdict = get_result_dict()
    for key in ['auc', 'advantage']:
      res = mia._get_vulnerabilities(testdict, key)
      self.assertLen(res, 2)
      self.assertEqual(res[f'test_{key}'], 1)
      self.assertEqual(res['test_n_examples'], 1)
    res = mia._get_vulnerabilities(testdict, ['auc', 'advantage'])
    self.assertLen(res, 3)
    self.assertEqual(res['test_auc'], 1)
    self.assertEqual(res['test_advantage'], 1)
    self.assertEqual(res['test_n_examples'], 1)
 class GetMaximumVulnerabilityTest(absltest.TestCase):
  def test_get_maximum_vulnerability(self):
    """Test extraction of maximum vulnerability score."""
    testdict = get_result_dict()
    for i in range(3):
      key = f'{i}-metric'
      res = mia._get_maximum_vulnerability(testdict, key)
      self.assertLen(res, 1)
      self.assertEqual(res[key]['value'], i + 1)
      if i < 2:
        self.assertEqual(res[key]['attacker'], f'all_{i}-metric')
      else:
        self.assertEqual(res[key]['attacker'], 'test_2-metric')
    res = mia._get_maximum_vulnerability(testdict, 'metric')
    self.assertLen(res, 1)
    self.assertEqual(res['metric']['value'], 3)
    res = mia._get_maximum_vulnerability(testdict, ['metric'],
                                         filterby='all')
    self.assertLen(res, 1)
    self.assertEqual(res['all-metric']['value'], 2)
    res = mia._get_maximum_vulnerability(testdict, ['metric', 'score'])
    self.assertLen(res, 2)
    self.assertEqual(res['metric']['value'], 3)
    self.assertEqual(res['score']['value'], 4)
    self.assertEqual(res['score']['attacker'], 'test_score')
 class ThresholdAttackLossTest(absltest.TestCase):
  def test_threshold_attack_loss(self):
    """Test simple threshold attack on loss."""
    features = {
        'loss': np.zeros(10),
        'is_train': np.concatenate((np.zeros(5), np.ones(5))),
    }
    res = mia._run_threshold_loss_attack(features)
    for k in res:
      self.assertStartsWith(k, 'thresh_loss')
    self.assertEqual(res['thresh_loss_auc'], 0.5)
    self.assertEqual(res['thresh_loss_advantage'], 0.0)
    rng = np.random.RandomState(4)
    n_train = 1000
    n_test = 500
    loss_train = rng.randn(n_train) - 0.4
    loss_test = rng.randn(n_test) + 0.4
    features = {
        'loss': np.concatenate((loss_train, loss_test)),
        'is_train': np.concatenate((np.ones(n_train), np.zeros(n_test))),
    }
    res = mia._run_threshold_loss_attack(features)
    self.assertBetween(res['thresh_loss_auc'], 0.7, 0.75)
    self.assertBetween(res['thresh_loss_advantage'], 0.3, 0.35)
 class ThresholdAttackMaxlogitTest(absltest.TestCase):
  def test_threshold_attack_maxlogits(self):
    """Test simple threshold attack on maximum logit."""
    features = {
        'logits': np.eye(10, 14),
        'is_train': np.concatenate((np.zeros(5), np.ones(5))),
    }
    res = mia._run_threshold_attack_maxlogit(features)
    for k in res:
      self.assertStartsWith(k, 'thresh_maxlogit')
    self.assertEqual(res['thresh_maxlogit_auc'], 0.5)
    self.assertEqual(res['thresh_maxlogit_advantage'], 0.0)
    rng = np.random.RandomState(4)
    n_train = 1000
    n_test = 500
    logits_train = rng.randn(n_train, 12) + 0.2
    logits_test = rng.randn(n_test, 12) - 0.2
    features = {
        'logits': np.concatenate((logits_train, logits_test), axis=0),
        'is_train': np.concatenate((np.ones(n_train), np.zeros(n_test))),
    }
    res = mia._run_threshold_attack_maxlogit(features)
    self.assertBetween(res['thresh_maxlogit_auc'], 0.7, 0.75)
    self.assertBetween(res['thresh_maxlogit_advantage'], 0.3, 0.35)
 class TrainedAttackTrivialTest(absltest.TestCase):
  def test_trained_attack(self):
    """Test trained attacks."""
    # Trivially easy problem
    x_train, x_test = np.ones((500, 3)), np.ones((20, 3))
    x_train[:200] *= -1
    x_test[:8] *= -1
    y_train, y_test = np.ones(500).astype(int), np.ones(20).astype(int)
    y_train[:200] = 0
    y_test[:8] = 0
    data = (x_train, y_train), (x_test, y_test)
    for clf in ['lr', 'rf', 'mlp', 'knn']:
      res = mia._run_trained_attack(clf, data, attack_prefix='a-')
      self.assertEqual(res['a-train_auc'], 1)
      self.assertEqual(res['a-test_auc'], 1)
      self.assertEqual(res['a-train_advantage'], 1)
      self.assertEqual(res['a-test_advantage'], 1)
 class TrainedAttackRandomFeaturesTest(absltest.TestCase):
  def test_trained_attack(self):
    """Test trained attacks."""
    # Random labels and features
    rng = np.random.RandomState(4)
    x_train, x_test = rng.randn(500, 3), rng.randn(500, 3)
    y_train = rng.binomial(1, 0.5, size=(500,))
    y_test = rng.binomial(1, 0.5, size=(500,))
    data = (x_train, y_train), (x_test, y_test)
    for clf in ['lr', 'rf', 'mlp', 'knn']:
      res = mia._run_trained_attack(clf, data, attack_prefix='a-')
      self.assertBetween(res['a-train_auc'], 0.5, 1.)
      self.assertBetween(res['a-test_auc'], 0.4, 0.6)
      self.assertBetween(res['a-train_advantage'], 0., 1.0)
      self.assertBetween(res['a-test_advantage'], 0., 0.2)
 class AttackLossesTest(absltest.TestCase):
  def test_attack(self):
    """Test individual attack function."""
    # losses only, both metrics
    loss_train, loss_test, _, _, _, _ = get_test_inputs()
    res = mia.run_attack(loss_train, loss_test, metric=('auc', 'advantage'))
    self.assertBetween(res['thresh_loss_auc'], 0.7, 0.75)
    self.assertBetween(res['thresh_loss_advantage'], 0.3, 0.35)
 class AttackLossesLogitsTest(absltest.TestCase):
  def test_attack(self):
    """Test individual attack function."""
    # losses and logits, two classifiers, single metric
    loss_train, loss_test, logits_train, logits_test, _, _ = get_test_inputs()
    res = mia.run_attack(
        loss_train,
        loss_test,
        logits_train,
        logits_test,
        attack_classifiers=('rf', 'knn'),
        metric='auc')
    self.assertBetween(res['rf_logits_test_auc'], 0.7, 0.9)
    self.assertBetween(res['knn_logits_test_auc'], 0.7, 0.9)
    self.assertBetween(res['rf_logits_loss_test_auc'], 0.7, 0.9)
    self.assertBetween(res['knn_logits_loss_test_auc'], 0.7, 0.9)
 class AttackLossesLabelsByClassTest(absltest.TestCase):
  def test_attack(self):
    # losses and labels, single metric, split by class
    loss_train, loss_test, _, _, labels_train, labels_test = get_test_inputs()
    n_train = loss_train.shape[0]
    n_test = loss_test.shape[0]
    res = mia.run_attack(
        loss_train,
        loss_test,
        labels_train=labels_train,
        labels_test=labels_test,
        by_class=True,
        metric='auc')
    self.assertLen(res, 10)
    for k in res:
      self.assertStartsWith(k, 'class_')
      if k.endswith('n_examples'):
        self.assertEqual(int(res[k]), (n_train + n_test) // 5)
      else:
        self.assertBetween(res[k], 0.65, 0.75)
 class AttackLossesLabelsSingleClassTest(absltest.TestCase):
  def test_attack(self):
    # losses and labels, both metrics, single class
    loss_train, loss_test, _, _, labels_train, labels_test = get_test_inputs()
    n_train = loss_train.shape[0]
    n_test = loss_test.shape[0]
    res = mia.run_attack(
        loss_train,
        loss_test,
        labels_train=labels_train,
        labels_test=labels_test,
        by_class=2,
        metric=('auc', 'advantage'))
    self.assertLen(res, 3)
    for k in res:
      self.assertStartsWith(k, 'class_2')
      if k.endswith('n_examples'):
        self.assertEqual(int(res[k]), (n_train + n_test) // 5)
      elif k.endswith('advantage'):
        self.assertBetween(res[k], 0.3, 0.5)
      elif k.endswith('auc'):
        self.assertBetween(res[k], 0.7, 0.75)
 class AttackLogitsLabelsMisclassifiedTest(absltest.TestCase):
  def test_attack(self):
    # logits and labels, single metric, single classifier, misclassified only
    (_, _, logits_train, logits_test,
     labels_train, labels_test) = get_test_inputs()
    res = mia.run_attack(
        logits_train=logits_train,
        logits_test=logits_test,
        labels_train=labels_train,
        labels_test=labels_test,
        only_misclassified=True,
        attack_classifiers=('lr',),
        metric='advantage')
    self.assertBetween(res['misclassified_lr_logits_test_advantage'], 0.3, 0.8)
    self.assertEqual(res['misclassified_n_examples'], 802)
 class AttackLogitsByPrecentileTest(absltest.TestCase):
  def test_attack(self):
    # only logits, single metric, no classifiers, split by deciles
    _, _, logits_train, logits_test, _, _ = get_test_inputs()
    res = mia.run_attack(
        logits_train=logits_train,
        logits_test=logits_test,
        by_percentile=True,
        metric='auc')
    for k in res:
      self.assertStartsWith(k, 'percentile')
      self.assertBetween(res[k], 0.60, 0.75)
 if __name__ == '__main__':
  absltest.main()
--- a/tensorflow_privacy/privacy/membership_inference_attack/plotting.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/plotting.py
@ -0,0 +1,80 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 """Plotting functionality for membership inference attack analysis.
 Functions to plot ROC curves and histograms as well as functionality to store
 figures to colossus.
 """
 from typing import Text, Iterable
 import matplotlib.pyplot as plt
 import numpy as np
 from sklearn import metrics
 def save_plot(figure: plt.Figure, path: Text, outformat='png'):
  """Store a figure to disk."""
  if path is not None:
    with open(path, 'wb') as f:
      figure.savefig(f, bbox_inches='tight', format=outformat)
    plt.close(figure)
 def plot_curve_with_area(x: Iterable[float],
                         y: Iterable[float],
                         xlabel: Text = 'x',
                         ylabel: Text = 'y') -> plt.Figure:
  """Plot the curve defined by inputs and the area under the curve.
  All entries of x and y are required to lie between 0 and 1.
  For example, x could be recall and y precision, or x is fpr and y is tpr.
  Args:
    x: Values on x-axis (1d)
    y: Values on y-axis (must be same length as x)
    xlabel: Label for x axis
    ylabel: Label for y axis
  Returns:
    The matplotlib figure handle
  """
  fig = plt.figure()
  plt.plot([0, 1], [0, 1], 'k', lw=1.0)
  plt.plot(x, y, lw=2, label=f'AUC: {metrics.auc(x, y):.3f}')
  plt.xlabel(xlabel)
  plt.ylabel(ylabel)
  plt.legend()
  return fig
 def plot_histograms(train: Iterable[float],
                    test: Iterable[float],
                    xlabel: Text = 'x',
                    thresh: float = None) -> plt.Figure:
  """Plot histograms of training versus test metrics."""
  xmin = min(np.min(train), np.min(test))
  xmax = max(np.max(train), np.max(test))
  bins = np.linspace(xmin, xmax, 100)
  fig = plt.figure()
  plt.hist(test, bins=bins, density=True, alpha=0.5, label='test', log='y')
  plt.hist(train, bins=bins, density=True, alpha=0.5, label='train', log='y')
  if thresh is not None:
    plt.axvline(thresh, c='r', label=f'threshold = {thresh:.3f}')
  plt.xlabel(xlabel)
  plt.ylabel('normalized counts (density)')
  plt.legend()
  return fig
--- a/tensorflow_privacy/privacy/membership_inference_attack/run_attack.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/run_attack.py
@ -0,0 +1,217 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 r"""This module contains code to run attacks on previous model outputs.
 Provided a path to a dataset of model outputs (logits, output probabilities,
 losses, labels, predictions, membership indicators (is train or not)), we train
 supervised binary classifiers using variable sets of features to distinguish
 training from testing examples. We also provide threshold attacks, i.e., simply
 thresholing losses or the maximum probability/logit to obtain binary
 predictions.
 The input data is assumed to be a tf.example proto stored with RecordIO (.rio).
 For example, outputs in an accepted format are typically produced by the
 `extract` script in the `extract` directory.
 We run various attacks on each of the full datasets, split by class, split by
 percentile of the most certain prediction and only on misclassified examples and
 record the area under the receiver operator curve as well as the attack
 advantage (i.e., max |tpr - fpr|) as vulnerability metrics. For all metrics
 recorded, see the doc string of `membership_inference_attack.all_attacks`.
 In addition, we record the overall training and test accuracy and loss of the
 original image classifier. All these results are collected in a single
 dictionary with descriptive keys. If there exist multiple model checkpoints (at
 different training epochs), the results for each checkpoint are concatenated,
 such that the dictionary keys stay the same, but the values contain arrays (the
 size being the number of checkpoints). This overall result dicitonary is then
 stored as a binary (and compressed) numpy file: .npz.
 This file is stored either in the provided output path. If that is the empty
 string, it is stored on the same level as the inputdir with the chosen name.
 Using `attack_results.npz` by default.
 """
 # Example usage:
 python run_attack.py --dataset=cifar10 --inputdir="attack_data"
 The results are then stored at ./attack_data
 import io
 import os
 import re
 from typing import Text, Dict
 from absl import app
 from absl import flags
 from absl import logging
 import numpy as np
 import tensorflow.google as tf
 import tensorflow_datasets as tfds
 from tensorflow_privacy.privacy.membership_inference_attack import membership_inference_attack as mia
 from tensorflow_privacy.privacy.membership_inference_attack import utils
 from glob import glob
 Result = Dict[Text, np.ndarray]
 FLAGS = flags.FLAGS
 flags.DEFINE_float('test_size', 0.2,
                   'Fraction of attack data used for the test set.')
 flags.DEFINE_string('dataset', 'cifar10', 'The dataset to use.')
 flags.DEFINE_string(
    'output', '', 'The path where to store the results. '
    'If empty string, store on same level as `inputdir` using '
    'the name specified in the result_name flag.')
 flags.DEFINE_string('result_name', 'attack_results.npz',
                    'The name of the output npz file with the attack results.')
 flags.DEFINE_string(
    'inputdir',
    'attack_data',
    'The input directory containing the attack datasets.')
 flags.DEFINE_integer('seed', 43, 'Random seed to ensure same data splits.')
 # ------------------------------------------------------------------------------
 #  Load and select features for attacks
 # ------------------------------------------------------------------------------
 def load_all_features(data_path: Text) -> Result:
  """Extract the selected features from a given dataset."""
  if FLAGS.dataset == 'cifar100':
    num_classes = 100
  elif FLAGS.dataset in ['cifar10', 'mnist']:
    num_classes = 10
  else:
    raise ValueError(f'Unknown dataset {FLAGS.dataset}')
  features = {
      'logits': tf.FixedLenFeature((num_classes,), tf.float32),
      'prob': tf.FixedLenFeature((num_classes,), tf.float32),
      'loss': tf.FixedLenFeature([], tf.float32),
      'is_train': tf.FixedLenFeature([], tf.int64),
      'label': tf.FixedLenFeature([], tf.int64),
      'prediction': tf.FixedLenFeature([], tf.int64),
  }
  dataset = tf.data.RecordIODataset(data_path)
  results = {k: [] for k in features}
  ds = dataset.map(lambda x: tf.parse_single_example(x, features))
  for example in tfds.as_numpy(ds):
    for k in results:
      results[k].append(example[k])
  return utils.to_numpy(results)
 # ------------------------------------------------------------------------------
 #  Run attacks
 # ------------------------------------------------------------------------------
 def run_all_attacks(data_path: Text):
  """Train all possible attacks on the data at the given path."""
  logging.info('Load all features from %s...', data_path)
  features = load_all_features(data_path)
  for k, v in features.items():
    logging.info('%s: %s', k, v.shape)
  logging.info('Compute original train/test accuracy and loss...')
  train_idx = features['is_train'] == 1
  test_idx = np.logical_not(train_idx)
  correct = features['label'] == features['prediction']
  result = {
      'original_train_loss': np.mean(features['loss'][train_idx]),
      'original_test_loss': np.mean(features['loss'][test_idx]),
      'original_train_acc': np.mean(correct[train_idx]),
      'original_test_acc': np.mean(correct[test_idx]),
  }
  result.update(
      mia.run_all_attacks(
          loss_train=features['loss'][train_idx],
          loss_test=features['loss'][test_idx],
          logits_train=features['logits'][train_idx],
          logits_test=features['logits'][test_idx],
          labels_train=features['label'][train_idx],
          labels_test=features['label'][test_idx],
          attack_classifiers=('lr', 'mlp', 'rf', 'knn'),
          decimals=None))
  result = utils.ensure_1d(result)
  logging.info('Finished training and evaluating attacks.')
  return result
 def attacking():
  """Load data and model and extract relevant outputs."""
  # ---------- Set result path ----------
  if FLAGS.output:
    resultpath = FLAGS.output
  else:
    resultdir = FLAGS.inputdir
    if resultdir[-1] == '/':
      resultdir = resultdir[:-1]
    resultdir = '/'.join(resultdir.split('/')[:-1])
    resultpath = os.path.join(resultdir, FLAGS.result_name)
  # ---------- Glob attack training sets ----------
  logging.info('Glob attack data paths...')
  data_paths = sorted(glob(os.path.join(FLAGS.inputdir, '*')))
  logging.info('Found %d data paths', len(data_paths))
  # ---------- Iterate over attack dataset and train attacks ----------
  epochs = []
  results = []
  for i, datapath in enumerate(data_paths):
    logging.info('=' * 80)
    logging.info('Attack model %d / %d', i + 1, len(data_paths))
    logging.info('=' * 80)
    basename = os.path.basename(datapath)
    found_ints = re.findall(r'(\d+)', basename)
    if len(found_ints) == 1:
      epoch = int(found_ints[0])
      logging.info('Found integer %d in pathname, interpret as epoch', epoch)
    else:
      epoch = np.nan
    tmp_res = run_all_attacks(datapath)
    if tmp_res is not None:
      results.append(tmp_res)
      epochs.append(epoch)
  # ---------- Aggregate and save results ----------
  logging.info('Aggregate and combine all results over epochs...')
  results = utils.merge_dictionaries(results)
  results['epochs'] = np.array(epochs)
  logging.info('Store aggregate results at %s.', resultpath)
  with open(resultpath, 'wb') as fp:
    io_buffer = io.BytesIO()
    np.savez(io_buffer, **results)
    fp.write(io_buffer.getvalue())
  logging.info('Finished attacks.')
 def main(argv):
  del argv  # Unused
  attacking()
 if __name__ == '__main__':
  app.run(main)
--- a/tensorflow_privacy/privacy/membership_inference_attack/trained_attack_models.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/trained_attack_models.py
@ -0,0 +1,106 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 r"""A collection of sklearn models for binary classification.
 This module contains some sklearn pipelines for finding models for binary
 classification from a variable number of numerical input features.
 These models are used to train binary classifiers for membership inference.
 """
 from typing import Text
 import numpy as np
 from sklearn import ensemble
 from sklearn import linear_model
 from sklearn import model_selection
 from sklearn import neighbors
 from sklearn import neural_network
 def choose_model(attack_classifier: Text):
  """Choose a model based on a string classifier."""
  if attack_classifier == 'lr':
    return logistic_regression()
  elif attack_classifier == 'mlp':
    return mlp()
  elif attack_classifier == 'rf':
    return random_forest()
  elif attack_classifier == 'knn':
    return knn()
  else:
    raise ValueError(f'Unknown attack classifier {attack_classifier}.')
 def logistic_regression(verbose: int = 0, n_jobs: int = 1):
  """Setup a logistic regression pipeline with cross-validation."""
  lr = linear_model.LogisticRegression(solver='lbfgs')
  param_grid = {
      'C': np.logspace(-4, 2, 10),
  }
  pipe = model_selection.GridSearchCV(
      lr, param_grid=param_grid, cv=3, n_jobs=n_jobs, iid=False,
      verbose=verbose)
  return pipe
 def random_forest(verbose: int = 0, n_jobs: int = 1):
  """Setup a random forest pipeline with cross-validation."""
  rf = ensemble.RandomForestClassifier()
  n_estimators = [100]
  max_features = ['auto', 'sqrt']
  max_depth = [5, 10, 20]
  max_depth.append(None)
  min_samples_split = [2, 5, 10]
  min_samples_leaf = [1, 2, 4]
  random_grid = {'n_estimators': n_estimators,
                 'max_features': max_features,
                 'max_depth': max_depth,
                 'min_samples_split': min_samples_split,
                 'min_samples_leaf': min_samples_leaf}
  pipe = model_selection.RandomizedSearchCV(
      rf, param_distributions=random_grid, n_iter=7, cv=3, n_jobs=n_jobs,
      iid=False, verbose=verbose)
  return pipe
 def mlp(verbose: int = 0, n_jobs: int = 1):
  """Setup a MLP pipeline with cross-validation."""
  mlpmodel = neural_network.MLPClassifier()
  param_grid = {
      'hidden_layer_sizes': [(64,), (32, 32)],
      'solver': ['adam'],
      'alpha': [0.0001, 0.001, 0.01],
  }
  pipe = model_selection.GridSearchCV(
      mlpmodel, param_grid=param_grid, cv=3, n_jobs=n_jobs, iid=False,
      verbose=verbose)
  return pipe
 def knn(verbose: int = 0, n_jobs: int = 1):
  """Setup a k-nearest neighbors pipeline with cross-validation."""
  knnmodel = neighbors.KNeighborsClassifier()
  param_grid = {
      'n_neighbors': [3, 5, 7],
  }
  pipe = model_selection.GridSearchCV(
      knnmodel, param_grid=param_grid, cv=3, n_jobs=n_jobs, iid=False,
      verbose=verbose)
  return pipe
--- a/tensorflow_privacy/privacy/membership_inference_attack/utils.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/utils.py
@ -0,0 +1,218 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 """Utility functions for membership inference attacks."""
 from typing import Text, Dict, Union, List, Any, Tuple
 import numpy as np
 from sklearn import metrics
 ArrayDict = Dict[Text, np.ndarray]
 Dataset = Tuple[Tuple[np.ndarray, np.ndarray], Tuple[np.ndarray, np.ndarray]]
 # ------------------------------------------------------------------------------
 #  Utilities for managing result dictionaries
 # ------------------------------------------------------------------------------
 def to_numpy(in_dict: Dict[Text, Any]) -> ArrayDict:
  """Convert values of dict to numpy arrays.
  Warning: This may fail if the values cannot be converted to numpy arrays.
  Args:
    in_dict: A dictionary mapping Text keys to values where the values must be
      something that can be converted to a numpy array.
  Returns:
    a dictionary with the same keys as input with all values converted to numpy
        arrays
  """
  return {k: np.array(v) for k, v in in_dict.items()}
 def ensure_1d(in_dict: Dict[Text, Union[int, float, np.ndarray]]) -> ArrayDict:
  """Ensure all values of a dictionary are at least 1D numpy arrays.
  Args:
    in_dict: The input dictionary mapping Text keys to numpy arrays or numbers.
  Returns:
    dictionary with same keys as in_dict and values converted to numpy arrays
        with at least one dimension (i.e., pack scalars into arrays)
  """
  return {k: np.atleast_1d(v) for k, v in in_dict.items()}
 def prepend_to_keys(in_dict: Dict[Text, Any], prefix: Text) -> Dict[Text, Any]:
  """Prepend a prefix to all keys of a dictionary.
  Args:
    in_dict: The input dictionary mapping Text keys to numpy arrays.
    prefix: Text which to prepend to each key in in_dict
  Returns:
    dictionary with same values as in_dict and all keys having prefix prepended
        to them
  """
  return {prefix + k: v for k, v in in_dict.items()}
 # ------------------------------------------------------------------------------
 #  Subsampling and data selection functionality
 # ------------------------------------------------------------------------------
 def select_indices(in_dict: ArrayDict, indices: np.ndarray) -> ArrayDict:
  """Subsample all values in the dictionary by the provided indices.
  Args:
    in_dict: The input dictionary mapping Text keys to numpy array values.
    indices: A numpy which can be used to index other arrays, specifying the
      indices to subsample from in_dict values.
  Returns:
    dictionary with same keys as in_dict and subsampled values
  """
  return {k: v[indices] for k, v in in_dict.items()}
 def merge_dictionaries(res: List[ArrayDict]) -> ArrayDict:
  """Convert iterable of dicts to dict of iterables."""
  output = {k: np.empty(0) for k in res[0]}
  for k in output:
    output[k] = np.concatenate([r[k] for r in res if k in r], axis=0)
  return output
 def get_features(features: ArrayDict,
                 feature_name: Text,
                 top_k: int,
                 add_loss: bool = False) -> np.ndarray:
  """Combine the specified features into one array.
  Args:
    features: A dictionary containing all possible features.
    feature_name: Which feature to use (logits or prob).
    top_k: The number of the top features (of feature_name) to select.
    add_loss: Whether to also add the loss as a feature.
  Returns:
    combined numpy array with the selected features (n_examples, n_features)
  """
  if top_k < 1:
    raise ValueError('Must select at least one feature.')
  feats = np.sort(features[feature_name], axis=-1)[:, :top_k]
  if add_loss:
    feats = np.concatenate((feats, features['loss'][:, np.newaxis]), axis=-1)
  return feats
 def subsample_to_balance(features: ArrayDict, random_state: int) -> ArrayDict:
  """Subsample if necessary to balance labels."""
  train_idx = features['is_train'] == 1
  test_idx = np.logical_not(train_idx)
  n0 = np.sum(test_idx)
  n1 = np.sum(train_idx)
  if n0 < 20 or n1 < 20:
    raise RuntimeError('Need at least 20 examples from training and test set.')
  np.random.seed(random_state)
  if n0 > n1:
    use_idx = np.random.choice(np.where(test_idx)[0], n1, replace=False)
    use_idx = np.concatenate((use_idx, np.where(train_idx)[0]))
    features = {k: v[use_idx] for k, v in features.items()}
  elif n0 < n1:
    use_idx = np.random.choice(np.where(train_idx)[0], n0, replace=False)
    use_idx = np.concatenate((use_idx, np.where(test_idx)[0]))
    features = {k: v[use_idx] for k, v in features.items()}
  return features
 def get_train_test_split(features: ArrayDict, add_loss: bool,
                         test_size: float) -> Dataset:
  """Get training and test data split."""
  y = features['is_train']
  n_total = len(y)
  n_test = int(test_size * n_total)
  perm = np.random.permutation(len(y))
  test_idx = perm[:n_test]
  train_idx = perm[n_test:]
  y_train = y[train_idx]
  y_test = y[test_idx]
  # We are using 10 top logits as a good default value if there are more than 10
  # classes. Typically, there is no significant amount of weight in more than
  # 10 logits.
  n_logits = min(features['logits'].shape[1], 10)
  x = get_features(features, 'logits', n_logits, add_loss)
  x_train, x_test = x[train_idx], x[test_idx]
  return (x_train, y_train), (x_test, y_test)
 # ------------------------------------------------------------------------------
 #  Computation of the attack metrics
 # ------------------------------------------------------------------------------
 def compute_performance_metrics(true_labels: np.ndarray,
                                predictions: np.ndarray,
                                threshold: float = None) -> ArrayDict:
  """Compute relevant classification performance metrics.
  The outout metrics are
  1.arrays of thresholds and corresponding true and false positives (fpr, tpr).
  2.auc area under fpr-tpr curve.
  3.advantage max difference between tpr and fpr.
  4.precision/recall/accuracy/f1_score if threshold arg is given.
  Args:
    true_labels: True labels.
    predictions: Predicted probabilities/scores.
    threshold: The threshold to use on `predictions` binary classification.
  Returns:
    A dictionary with relevant metrics which are fully described by their key.
  """
  results = {}
  if threshold is not None:
    results.update({
        'precision':
            metrics.precision_score(true_labels, predictions > threshold),
        'recall':
            metrics.recall_score(true_labels, predictions > threshold),
        'accuracy':
            metrics.accuracy_score(true_labels, predictions > threshold),
        'f1_score':
            metrics.f1_score(true_labels, predictions > threshold),
    })
  fpr, tpr, thresholds = metrics.roc_curve(true_labels, predictions)
  auc = metrics.auc(fpr, tpr)
  advantage = np.max(np.abs(tpr - fpr))
  results.update({
      'fpr': fpr,
      'tpr': tpr,
      'thresholds': thresholds,
      'auc': auc,
      'advantage': advantage,
  })
  return ensure_1d(results)
--- a/tensorflow_privacy/privacy/membership_inference_attack/utils_test.py
+++ b/tensorflow_privacy/privacy/membership_inference_attack/utils_test.py
@ -0,0 +1,105 @@
 # Copyright 2020, The TensorFlow Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 #      http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Lint as: python3
 """Tests for tensorflow_privacy.privacy.membership_inference_attack.utils."""
 from absl.testing import absltest
 import numpy as np
 from tensorflow_privacy.privacy.membership_inference_attack import utils
 class UtilsTest(absltest.TestCase):
  def __init__(self, methodname):
    """Initialize the test class."""
    super().__init__(methodname)
    rng = np.random.RandomState(33)
    logits = rng.uniform(low=0, high=1, size=(1000, 14))
    loss = rng.uniform(low=0, high=1, size=(1000,))
    is_train = rng.binomial(1, 0.7, size=(1000,))
    self.mydict = {'logits': logits, 'loss': loss, 'is_train': is_train}
  def test_compute_metrics(self):
    """Test computation of attack metrics."""
    true = np.array([0, 0, 0, 1, 1, 1])
    pred = np.array([0.6, 0.9, 0.4, 0.8, 0.7, 0.2])
    results = utils.compute_performance_metrics(true, pred, threshold=0.5)
    for k in ['precision', 'recall', 'accuracy', 'f1_score', 'fpr', 'tpr',
              'thresholds', 'auc', 'advantage']:
      self.assertIn(k, results)
    np.testing.assert_almost_equal(results['accuracy'], 1. / 2.)
    np.testing.assert_almost_equal(results['precision'], 2. / (2. + 2.))
    np.testing.assert_almost_equal(results['recall'], 2. / (2. + 1.))
  def test_prepend_to_keys(self):
    """Test prepending of text to keys of a dictionary."""
    mydict = utils.prepend_to_keys(self.mydict, 'test')
    for k in mydict:
      self.assertTrue(k.startswith('test'))
  def test_select_indices(self):
    """Test selecting indices from dictionary with array values."""
    mydict = {'a': np.arange(10), 'b': np.linspace(0, 1, 10)}
    idx = np.arange(5)
    mydictidx = utils.select_indices(mydict, idx)
    np.testing.assert_allclose(mydictidx['a'], np.arange(5))
    np.testing.assert_allclose(mydictidx['b'], np.linspace(0, 1, 10)[:5])
    idx = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0]) > 0.5
    mydictidx = utils.select_indices(mydict, idx)
    np.testing.assert_allclose(mydictidx['a'], np.arange(0, 10, 2))
    np.testing.assert_allclose(mydictidx['b'], np.linspace(0, 1, 10)[0:10:2])
  def test_get_features(self):
    """Test extraction of features."""
    for k in [1, 5, 10, 15]:
      for add_loss in [True, False]:
        feats = utils.get_features(
            self.mydict, 'logits', top_k=k, add_loss=add_loss)
        k_selected = min(k, 14)
        self.assertEqual(feats.shape, (1000, k_selected + int(add_loss)))
  def test_subsample_to_balance(self):
    """Test subsampling of two arrays."""
    feats = utils.subsample_to_balance(self.mydict, random_state=23)
    train = np.sum(self.mydict['is_train'])
    test = 1000 - train
    n_chosen = min(train, test)
    self.assertEqual(feats['logits'].shape, (2 * n_chosen, 14))
    self.assertEqual(feats['loss'].shape, (2 * n_chosen,))
    self.assertEqual(np.sum(feats['is_train']), n_chosen)
    self.assertEqual(np.sum(1 - feats['is_train']), n_chosen)
  def test_get_data(self):
    """Test train test split data generation."""
    for test_size in [0.2, 0.5, 0.8, 0.55555]:
      (x_train, y_train), (x_test, y_test) = utils.get_train_test_split(
          self.mydict, add_loss=True, test_size=test_size)
      n_test = int(test_size * 1000)
      n_train = 1000 - n_test
      self.assertEqual(x_train.shape, (n_train, 11))
      self.assertEqual(y_train.shape, (n_train,))
      self.assertEqual(x_test.shape, (n_test, 11))
      self.assertEqual(y_test.shape, (n_test,))
 if __name__ == '__main__':
  absltest.main()