Integral Privacy for Density Estimation with Approximation Guarantees
Density estimation is an old and central problem in statistics and machine learning. Only a few approaches cast this problem in a differential privacy framework and, to our knowledge, while all of them provide proofs of privacy, little is known about how well the private density they learn approximates the unknown true density. In this paper, we exploit the tools of boosting to show that, provided we have access to a weak learner in the original boosting sense, there is a way to learn a private density from classifiers that guarantees an approximation of the true density which degrades gracefully as the privacy budget ϵ decreases. Our results have three key formal features: (i) our approximation bound is, as we show, near optimal for the technique at hand; (ii) the privacy guarantee holds even when we remove the famed adjacency condition on inputs in differential privacy, leading to a stronger privacy guarantee that we refer to as integral privacy; and (iii) we provide, for the first time, approximation guarantees for the capture of fat regions of the density, a problem that has received considerable attention in the generative adversarial networks literature through the mode capture problem. Experimental results against a state-of-the-art implementation of private kernel density estimation show that our technique consistently obtains improved results, in particular matching its outputs with a privacy budget ϵ that is orders of magnitude smaller.
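The abstract describes learning a private density out of boosted classifiers but gives no algorithmic detail. The Python sketch below is a purely hypothetical illustration of that general flavor, not the paper's construction: it estimates a density by training classifiers to separate the data from a uniform reference sample, combines them with AdaBoost-style weights, and adds Laplace noise to each released weight as a stand-in for a privacy mechanism. The function name, the noise calibration (budget ϵ split evenly over rounds, sensitivity assumed to be 1), and the uniform reference sample are all our assumptions.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def private_boosted_density(data, eps=1.0, rounds=10, seed=0):
    """Return an (unnormalized) density score function learned by boosting."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # Reference sample drawn uniformly over the bounding box of the data;
    # the weak classifiers learn to separate real points from this background.
    lo, hi = data.min(axis=0), data.max(axis=0)
    ref = rng.uniform(lo, hi, size=(n, d))
    X = np.vstack([data, ref])
    y = np.hstack([np.ones(n), -np.ones(n)])      # +1 = real, -1 = reference
    w = np.full(2 * n, 1.0 / (2 * n))             # boosting weights

    stumps, alphas = [], []
    for _ in range(rounds):
        stump = DecisionTreeClassifier(max_depth=1)   # weak learner
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)), 1e-6, 1 - 1e-6)
        alpha = 0.5 * np.log((1 - err) / err)         # AdaBoost-style weight
        # Illustrative privacy step (assumed, not the paper's mechanism):
        # Laplace noise on the released weight, budget split over the rounds.
        alpha += rng.laplace(scale=rounds / eps)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)

    def score(points):
        # Boosted score; higher values indicate higher estimated density.
        F = sum(a * s.predict(points) for s, a in zip(stumps, alphas))
        return np.exp(F)
    return score

# Usage: estimate a 2-D Gaussian density under a small budget.
data = np.random.default_rng(1).normal(size=(500, 2))
density = private_boosted_density(data, eps=0.5)
print(density(np.array([[0.0, 0.0], [3.0, 3.0]])))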