An Uncertainty Principle is a Price of Privacy-Preserving Microdata

10/25/2021
by John Abowd, et al.

Privacy-protected microdata are often the desired output of a differentially private algorithm, since microdata are familiar and convenient for downstream users. However, there is a statistical price for this kind of convenience. We show that an uncertainty principle governs the trade-off between accuracy for a population of interest ("sum query") and accuracy for its component sub-populations ("point queries"). Compared to differentially private query-answering systems that are not required to produce microdata, accuracy can degrade by a logarithmic factor. For example, in the case of pure differential privacy, without the microdata requirement, one can provide noisy answers to the sum query and all point queries while guaranteeing that each answer has squared error O(1/ϵ^2). With the microdata requirement, one must choose between allowing an additional log^2(d) factor (where d is the number of point queries) for some point queries or allowing an extra O(d^2) factor for the sum query. We present lower bounds for pure, approximate, and concentrated differential privacy. We propose mitigation strategies and create a collection of benchmark datasets that can be used for public study of this problem.
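To build intuition for the trade-off the abstract describes, here is a minimal illustrative sketch (not the paper's algorithm): when queries are answered directly, the sum query and each point query can each receive independent Laplace noise, but when the release must be consistent microdata, the sum answer is forced to equal the sum of the noisy point answers, so the d independent noise terms accumulate. All counts, the budget split, and the random seed below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1.0   # hypothetical privacy budget
d = 100     # number of point queries (disjoint sub-populations)
counts = rng.integers(0, 50, size=d).astype(float)  # hypothetical true counts

# Strategy A (no microdata requirement): answer the sum query directly,
# spending half the budget on it (Laplace scale = 1/(eps/2)).
sum_direct = counts.sum() + rng.laplace(scale=2 / eps)

# Strategy B (microdata-style consistency): release noisy point counts
# only, and derive the sum by adding them up. The d independent noise
# terms accumulate, so the sum's variance grows linearly in d.
noisy_points = counts + rng.laplace(scale=1 / eps, size=d)
sum_from_microdata = noisy_points.sum()

# Laplace(scale=b) has variance 2*b^2.
var_direct = 2 * (2 / eps) ** 2       # variance of the direct sum answer
var_from_points = d * 2 * (1 / eps) ** 2  # variance of the summed point answers
print(var_direct, var_from_points)
```

This toy comparison only shows why forcing consistency with point-level output inflates the sum query's error; the paper's actual lower bounds (the log^2(d) and O(d^2) factors) come from a formal analysis, not from this naive budget split.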
