115 commits

Author SHA1 Message Date
Galen Andrew
d5dcfec745 Remove set_denominator functions from DPQuery and make QueryWithLedger easier to use.
set_denominator was added so that the batch size doesn't need to be specified before constructing the optimizer, but it breaks the DPQuery abstraction. Now the optimizer uses a GaussianSumQuery instead of GaussianAverageQuery, and normalization by batch size is done inside the optimizer.

Also, instead of creating all DPQueries with a PrivacyLedger and then wrapping them with QueryWithLedger, it is now sufficient to create the queries with no ledger; QueryWithLedger will construct the ledger and pass it to all inner queries.

PiperOrigin-RevId: 251462353
2019-06-04 10:14:32 -07:00
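
The division of labor described in this commit (the query returns a noised sum, and the optimizer divides by the batch size) can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: `clip_by_l2_norm`, `gaussian_sum`, and `optimizer_step_gradient` are hypothetical names, and records are plain lists of floats rather than tensors.

```python
import math
import random


def clip_by_l2_norm(record, l2_norm_clip):
    """Scale `record` down so that its L2 norm is at most `l2_norm_clip`."""
    norm = math.sqrt(sum(x * x for x in record))
    if norm <= l2_norm_clip or norm == 0.0:
        return list(record)
    scale = l2_norm_clip / norm
    return [x * scale for x in record]


def gaussian_sum(records, l2_norm_clip, stddev, rng=random):
    """Hypothetical stand-in for a Gaussian sum query: clip, sum, add noise."""
    total = [0.0] * len(records[0])
    for record in records:
        clipped = clip_by_l2_norm(record, l2_norm_clip)
        total = [t + c for t, c in zip(total, clipped)]
    return [t + rng.gauss(0.0, stddev) for t in total]


def optimizer_step_gradient(microbatch_grads, l2_norm_clip, noise_multiplier):
    """Normalize by the number of records here, not inside the query."""
    stddev = l2_norm_clip * noise_multiplier
    noised_sum = gaussian_sum(microbatch_grads, l2_norm_clip, stddev)
    return [g / len(microbatch_grads) for g in noised_sum]
```

Because the query in this sketch never needs a denominator, it can be constructed before the batch size is known, which is the property the commit message is after.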
Galen Andrew
7636945566 Cast to ensure record of NoPrivacyAverageQuery is float for compatibility with sample_state.
PiperOrigin-RevId: 249909614
2019-05-24 15:28:39 -07:00
Steve Chien
15c07250a1 Add dtype=tf.int32 to TensorBuffer capacity and current size.
PiperOrigin-RevId: 249908717
2019-05-24 15:22:43 -07:00
Nicolas Papernot
a06bc6c99b fix imports for v1 and make the versioning more explicit through LooseVersion
PiperOrigin-RevId: 249732562
2019-05-23 15:57:08 -07:00
Ilya Mironov
0efb23afcb Changing initial capacity for the ledger to smaller values. (+ restoring compatibility with Python 2)
PiperOrigin-RevId: 249292683
2019-05-21 11:38:27 -07:00
Galen Andrew
7992006077 Add quantile_adaptive_clip_sum_query to privacy package.
PiperOrigin-RevId: 248617353
2019-05-16 15:56:16 -07:00
Galen Andrew
3908429796 Make DPQuery classes (almost) completely functional: the only state from the initializer that is used gets pushed into the initial_global_state.
PiperOrigin-RevId: 248424593
2019-05-15 16:06:37 -07:00
Steve Chien
17fefb3895 Remove tf.function annotation from quantile_adaptive_clip_sum_query.py that was causing import error.
PiperOrigin-RevId: 248236331
2019-05-14 16:40:55 -07:00
Galen Andrew
aaf029edad Add quantile_adaptive_clip_sum_query which dynamically adjusts the clipping norm so a specified fraction of records per sample are clipped.
PiperOrigin-RevId: 248201320
2019-05-14 13:35:29 -07:00
Galen Andrew
1d1a6e087a Extensions to DPQuery and subclasses.
1. Split DPQuery.accumulate_record function into preprocess_record and accumulate_preprocessed_record.
2. Add merge_sample_state function.
3. Add default implementations for some functions in DPQuery, and add base class SumAggregationDPQuery that implements some more. Only get_noised_result is still abstract.
4. Enforce that all states and parameters used as inputs and outputs to DPQuery functions are nested structures of tensors by replacing numbers with constants and Nones with empty tuples.

PiperOrigin-RevId: 247975791
2019-05-13 11:28:56 -07:00
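
A toy version of the interface split listed in this commit might look like the following. It is a sketch under simplifying assumptions: the method names come from the commit message, but the signatures are reduced (no params or global state), records are scalars, noise is omitted, and `ToySumQuery` is a hypothetical class that does not exist in the library.

```python
class ToySumQuery(object):
    """Toy sum-aggregation query; only the method names follow the commit."""

    def __init__(self, clip):
        self._clip = clip

    def initial_sample_state(self):
        return 0.0

    def preprocess_record(self, record):
        # Per-record work: clip a scalar record to [-clip, clip].
        return max(-self._clip, min(self._clip, record))

    def accumulate_preprocessed_record(self, sample_state, preprocessed):
        # Aggregation work: add the already-preprocessed record to the sum.
        return sample_state + preprocessed

    def accumulate_record(self, sample_state, record):
        # The former single step, expressed as the two halves above.
        return self.accumulate_preprocessed_record(
            sample_state, self.preprocess_record(record))

    def merge_sample_state(self, state_a, state_b):
        # Combine partial sums that were accumulated separately.
        return state_a + state_b

    def get_noised_result(self, sample_state):
        # A real query would add noise here; this toy returns the raw sum.
        return sample_state
```

Splitting the steps this way lets per-record preprocessing run wherever the records live, while `merge_sample_state` combines partial aggregates afterwards.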
Steve Chien
82852c0e71 Add comments explaining the relationship between ML terminology and DP terminology.
PiperOrigin-RevId: 246926753
2019-05-06 17:12:24 -07:00
Ilya Mironov
9cece21d92 Clean-up pass to eliminate warnings: replacing deprecated endpoints with recommended versions and annotating test sizes.
PiperOrigin-RevId: 246901723
2019-05-06 14:50:23 -07:00
Steve Chien
098940bf1b Very minor formatting change.
PiperOrigin-RevId: 246596295
2019-05-03 16:41:38 -07:00
Steve Chien
28639ba0a8 Allow tensor buffers to automatically resize as needed.
PiperOrigin-RevId: 246594454
2019-05-03 16:30:02 -07:00
Steve Chien
beb86c6e18 Update PrivacyLedger and DPOptimizer to make certain arguments optional.
PiperOrigin-RevId: 246235646
2019-05-01 18:07:32 -07:00
Nicolas Papernot
c09ec4c22b minor fixes to improve tf 1 and 2 compatibility
PiperOrigin-RevId: 246008822
2019-04-30 13:22:57 -07:00
Nicolas Papernot
febafd830d update API calls for TF2
PiperOrigin-RevId: 245817981
2019-04-29 14:00:40 -07:00
Ilya Mironov
a3e03f773e Adding a paragraph to the walk-through on how to choose RDP orders. Plus deleting empty lines in rdp_accountant.py. Fixing issue #47.
PiperOrigin-RevId: 244467825
2019-04-19 21:50:49 -07:00
Steve Chien
31219a5f3f Fix DP optimizers to handle gradients that are None.
PiperOrigin-RevId: 244429987
2019-04-19 15:07:55 -07:00
Galen Andrew
28df9cf233 Add missing deps to privacy __init__.py.
PiperOrigin-RevId: 244249099
2019-04-18 13:39:09 -07:00
Ilya Mironov
51e29667d9 Fixing issue #44 (imports in privacy/__init__.py). Added __init__.py to the dp_query directory for Python 2 compatibility.
PiperOrigin-RevId: 243329997
2019-04-12 14:10:48 -07:00
Ilya Mironov
3c4409d6d6 Restoring ability to run compute_dp_sgd_privacy.py as a standalone script.
At present, the script has no heavy dependencies except for the rdp_accountant, which is by itself pretty light-weight. However, importing rdp_accountant triggers __init__.py in third_party/py/tensorflow_privacy/privacy, which loads TF and all of tf.privacy. The CL adds a check to the __init__.py, which controls this behavior.

PiperOrigin-RevId: 243172355
2019-04-11 17:06:53 -07:00
Galen Andrew
e8113a0365 Add DummyOptimizer to top-level imports.
PiperOrigin-RevId: 242715034
2019-04-09 12:14:22 -07:00
Galen Andrew
9106a04e2c Use PrivacyLedger for privacy accounting.
Prior to this change, the PrivacyLedger kept a log of private queries, but the ledger was not actually used to compute the (epsilon, delta) guarantees. This CL adds a function to compute the RDP directly from the ledger.

Note: I verified that the tutorial builds and runs with the changes and, for the first few iterations, prints the same epsilon values as before the change.

PiperOrigin-RevId: 241063532
2019-03-29 15:31:32 -07:00
Nicolas Papernot
01e7cac7b5 Update compute_dp_sgd_privacy.py
2019-03-27 09:22:58 -07:00
Nicolas Papernot
8db2dd6bca Update compute_dp_sgd_privacy.py
2019-03-27 09:18:45 -07:00
Shadi Rahimian
0a0e5cb3c3 line 83 produces error
TypeError: can only concatenate list (not "range") to list
2019-03-26 15:28:09 +01:00
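
The error quoted in this commit is what Python 3 raises when a `range` object is concatenated to a list. The snippet below reproduces it and shows the usual fix of wrapping the range in `list(...)`. The variable names and values are illustrative assumptions, not the actual contents of line 83.

```python
orders = [1.25, 1.5, 1.75, 2.0]  # illustrative values only

message = None
try:
    broken = orders + range(5, 8)  # fails on Python 3: range is not a list
except TypeError as err:
    message = str(err)

# Materializing the range first works on both Python 2 and Python 3.
fixed = orders + list(range(5, 8))
```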
Galen Andrew
6231d0802d Cleanup directory structure, add top-level imports and add normalized_query.
Moved query classes from dir optimizers into new dir dp_query. Added NormalizedQuery class for queries that divide the output of another query by a constant like GaussianAverageQuery.

PiperOrigin-RevId: 240167115
2019-03-25 10:21:04 -07:00
Nicolas Papernot
0ebd134d99 Closes #33
PiperOrigin-RevId: 239129202
2019-03-18 22:42:01 -07:00
Galen Andrew
0aad84ab3f Move mpmath dependency to unittest target that uses it, and explicit import of mpmath functions to reduce size.
PiperOrigin-RevId: 239056360
2019-03-18 14:03:47 -07:00
Nicolas Papernot
a9840529c4 Closes #29
PiperOrigin-RevId: 239030654
2019-03-18 11:54:20 -07:00
Galen Andrew
9a53e1eb86 Adds AdaptiveClipAverageQuery which performs adaptive adjustment of the clipping norm to approximate a specified quantile of clipped updates per round.
PiperOrigin-RevId: 238698171
2019-03-15 13:19:22 -07:00
Galen Andrew
e566967ff6 Simplify GaussianQuery by removing _GlobalState.
The global state for DP query is intended for aspects of the query that change across samples under the query's own control. It was therefore unnecessary to wrap "l2_norm_clip" and "sum_stddev" in the namedtuple _GlobalState for the basic GaussianQuery classes.

PiperOrigin-RevId: 237528962
2019-03-08 15:17:48 -08:00
cclauss
d9780c043e from six.moves import xrange
__xrange()__ was removed in Python 3 in favor of a reworked version of __range()__.

[flake8](http://flake8.pycqa.org) testing of https://github.com/tensorflow/privacy on Python 3.7.1

$ __flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics__
```
./privacy/optimizers/gaussian_query_test.py:65:16: F821 undefined name 'xrange'
      for _ in xrange(1000):
               ^
./research/pate_2018/ICLR2018/rdp_bucketized.py:79:12: F821 undefined name 'xrange'
  for i in xrange(n):
           ^
./research/pate_2018/ICLR2018/rdp_bucketized.py:106:12: F821 undefined name 'xrange'
  for i in xrange(n):
           ^
./research/pate_2018/ICLR2018/rdp_bucketized.py:139:12: F821 undefined name 'xrange'
  for i in xrange(n):
           ^
4     F821 undefined name 'xrange'
4
```
__E901,E999,F821,F822,F823__ are the "_showstopper_" [flake8](http://flake8.pycqa.org) issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues which are merely "style violations" -- useful for readability but they do not affect runtime safety.
* F821: undefined name `name`
* F822: undefined name `name` in `__all__`
* F823: local variable name referenced before assignment
* E901: SyntaxError or IndentationError
* E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree
2019-03-08 10:35:01 +01:00
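
The commit resolves the undefined name by importing `xrange` from `six.moves`. For readers without `six` installed, the same compatibility idea can be sketched with a dependency-free fallback. This is a different technique from the one used in the commit and is shown only as an illustration.

```python
try:
    xrange  # Python 2: the builtin exists, so nothing needs to change.
except NameError:
    xrange = range  # Python 3: range is already lazy, so alias it.

total = 0
for _ in xrange(1000):
    total += 1
```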
Steve Chien
b892d650cf Tests for Eager mode.
PiperOrigin-RevId: 236382269
2019-03-01 14:48:29 -08:00
Nicolas Papernot
0c691085e1 missing reduce_mean
PiperOrigin-RevId: 235858614
2019-02-26 22:56:29 -08:00
Nicolas Papernot
c2d4b17881 Add support for the Eager mode
PiperOrigin-RevId: 235733975
2019-02-26 09:20:28 -08:00
Steve Chien
f37c9d1ea1 Use mean loss within each microbatch.
PiperOrigin-RevId: 233832864
2019-02-13 14:42:12 -08:00
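
As a rough illustration of what "mean loss within each microbatch" means, the sketch below splits per-example losses into equally sized microbatches and averages each one. `microbatch_mean_losses` is a hypothetical helper written for this example; it is not the optimizer's implementation.

```python
def microbatch_mean_losses(per_example_losses, num_microbatches):
    """Return one mean loss per microbatch of consecutive examples."""
    n = len(per_example_losses)
    if num_microbatches <= 0 or n % num_microbatches != 0:
        raise ValueError("batch size must be a multiple of num_microbatches")
    size = n // num_microbatches
    return [
        sum(per_example_losses[start:start + size]) / size
        for start in range(0, n, size)
    ]
```

For example, `microbatch_mean_losses([1.0, 3.0, 2.0, 6.0], 2)` averages the first two and the last two losses separately.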
Eugene Brevdo
72305bcb10 Update use of tf.CriticalSection.
PiperOrigin-RevId: 233168852
2019-02-08 20:08:35 -08:00
Steve Chien
d75f1b80ba Fix copybara to limit some transformations to the beginning of the line.
PiperOrigin-RevId: 233151293
2019-02-08 16:55:57 -08:00
A. Unique TensorFlower
4d0ab48c35 Add privacy ledger.
The privacy ledger keeps a record of all sampling and query events for analysis post hoc by the privacy accountant.

PiperOrigin-RevId: 233094012
2019-02-08 11:21:43 -08:00
A. Unique TensorFlower
36d9959c19 internal change
PiperOrigin-RevId: 233093203
2019-02-08 11:17:09 -08:00
A. Unique TensorFlower
098c5220b5 Remove test broken by upstream tf changes.
tf.nest.map_structure_up_to has changed so that map_structure_up_to(x, func, x, y) no longer raises an error when y is longer than x, for example x=[1,2], y=[1,2,3]. This broke one of our tests for nested query. Remove the test until (if and when) the old, more reasonable, behavior is restored.

PiperOrigin-RevId: 232057385
2019-02-01 16:22:05 -08:00
A. Unique TensorFlower
4f9cc8ef3e 1. Adding a CLI script for computing privacy loss for DP-SGD.
2. Fixing typos in the MNIST tutorial.

PiperOrigin-RevId: 230608908
2019-01-23 14:56:27 -08:00
Galen Andrew
c8cb3c6b70 General cleanup.
1. Rename PrivateQuery to DPQuery.
2. Move construction of DPQuery to outside of optimizer.
3. Remove PrivateAverageQuery and PrivateSumQuery, and rename DPQuery's 'get_query_result' method to 'get_noised_result'. Rename private_queries.py to dp_query.py.
4. Remove thrice-replicated run_query function from the test classes and replace with a single function in new test_utils.py.
5. Add functions gaussian_sum_query_from_noise_multplier and gaussian_average_query_from_noise_multplier.

PiperOrigin-RevId: 230595991
2019-01-23 14:41:44 -08:00
Steve Chien
7e2d796bde Minor fixes for Python 2/3 compatibility.
PiperOrigin-RevId: 230022543
2019-01-18 17:33:22 -08:00
Peter Kairouz
0b56f7c016 Add optional argument for weighted sum and weighted average queries.
PiperOrigin-RevId: 230021515
2019-01-18 17:25:58 -08:00
Alex Pine
6c5c39c4f2 Created the optional unroll_microbatches parameter for the DpOptimizerClass as a workaround for b/122613513.
PiperOrigin-RevId: 229955297
2019-01-18 11:51:44 -08:00
Peter Kairouz
5ee12803f3 Create NoPrivacySumQuery and NoPrivacyAverageQuery.
PiperOrigin-RevId: 229273971
2019-01-14 17:39:36 -08:00
Nicolas Papernot
398d1d052f Closes #4
PiperOrigin-RevId: 229212766
2019-01-14 10:55:15 -08:00
Steve Chien
c30f6d776e Add test to ensure DP optimizers work with tf.estimator Estimators.
PiperOrigin-RevId: 228920704
2019-01-11 15:50:56 -08:00
schien
251d6298c6 Fix Python 3 compatibility issues
PiperOrigin-RevId: 228232503
2019-01-07 14:09:31 -08:00
Steve Chien
1689cf3a77 Revert "Fix Python 3 compatibility issues"
2019-01-07 13:44:35 -08:00
Steve Chien
8c53cf8f75 Merge pull request #1 from cclauss/python3-fixes
Fix Python 3 compatibility issues
2019-01-07 13:43:22 -08:00
A. Unique TensorFlower
29fac758af - Fixing dependencies in setup.py and requirements.txt.
PiperOrigin-RevId: 227742524
2019-01-04 15:58:19 -08:00
A. Unique TensorFlower
01ab549902 Renaming stddev_to_sensitivity_ratio to noise_multiplier in rdp_accountant.
PiperOrigin-RevId: 227552068
2019-01-04 15:57:52 -08:00
A. Unique TensorFlower
da79d522c6 Project import generated by Copybara.
PiperOrigin-RevId: 226991741
2018-12-26 23:31:59 -08:00
cclauss
d72378f913 Fix Python 3 compatibility issues
2018-12-24 03:55:12 +01:00
A. Unique TensorFlower
b4188446e0 Project import generated by Copybara.
PiperOrigin-RevId: 226345615
2018-12-20 09:17:59 -08:00
Steve Chien
1595ed3cd1 Project import generated by Copybara.
PiperOrigin-RevId: 226056146
2018-12-18 15:44:04 -08:00
A. Unique TensorFlower
ceee90b1ac Add GaussianSumQuery and express GaussianAverageQuery in terms of it.
Also:
1. Add unit tests for both types of query.
2. Add function "get_query_result" to PrivateQuery. (The utility of having this function is made clear in the test class, where the function _run_query operates on either GaussianSum- or GaussianAverageQueries.)
PiperOrigin-RevId: 225609398
2018-12-18 15:41:38 -08:00
Steve Chien
0af76c7b3d Update to allow bazel on tensorflow_privacy to work out of the box.
PiperOrigin-RevId: 225605386
2018-12-18 15:41:26 -08:00
Steve Chien
b8418b0523 PiperOrigin-RevId: 224078477
2018-12-05 14:03:22 -08:00
Steve Chien
afb8189dba PiperOrigin-RevId: 224061027
2018-12-04 17:01:39 -08:00
A. Unique TensorFlower
e9169a724e Project import generated by Copybara.
FolderOrigin-RevId: /google/src/cloud/papernot/os_privacy
2018-12-02 13:07:04 -08:00