CIPHER: Construction of dIfferentially Private microdata from low-dimensional Histograms via solving linear Equations with Tikhonov Regularization

12/12/2018
by   Evercita C. Eugenio, et al.

When government agencies, research institutes, and industries release individual-level data for research and public use, the data are often perturbed to provide some level of privacy protection. The recently developed differentially private data synthesis (DIPS) methods are built upon the concept of differential privacy and provide a strong mathematical privacy guarantee while aiming to maintain the statistical utility of the released sanitized data. We introduce a new DIPS algorithm, CIPHER, which generates differentially private individual-level data from a set of low-dimensional histograms by solving a set of linear equations with Tikhonov (l_2) regularization. CIPHER is conceptually very simple: it requires nothing more than decomposing joint probabilities via basic probability rules to construct the equation set and then solving the resulting linear equations. CIPHER can also automatically "correct" for the inconsistency that differentially private sanitization introduces among histograms sharing at least one common attribute. We compare CIPHER with MWEM (multiplicative weighting via exponential mechanism) and with full-dimensional histogram (FDH) sanitization through simulation and case studies. CIPHER and MWEM both aim to generate differentially private synthetic individual-level data from a set of low-dimensional histograms, and the results demonstrate that CIPHER made significant improvements over MWEM in statistical inferences. CIPHER also delivered performance similar to that of FDH sanitization for most of the examined privacy budget range.
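
The general idea of recovering a joint distribution from sanitized low-dimensional histograms by a Tikhonov-regularized linear solve can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the abstract does not give the exact equation set, so the sketch assumes the unknowns are the full joint cell probabilities and the equations are the marginalization constraints. The function names (`marginal_matrix`, `tikhonov_solve`, `sanitize_and_synthesize`), the Laplace scale of `1 / epsilon` per histogram (which assumes a sensitivity of 1), the even split of the privacy budget, the treatment of the sample size as public, and the clip-and-renormalize step are all assumptions made for illustration.

```python
import numpy as np
from itertools import product


def marginal_matrix(shape, keep):
    """0/1 matrix mapping flattened joint probabilities to the marginal over axes `keep`."""
    keep_shape = tuple(shape[k] for k in keep)
    cells = list(product(*[range(s) for s in shape]))  # C-order, matches ravel
    M = np.zeros((int(np.prod(keep_shape)), len(cells)))
    for j, cell in enumerate(cells):
        row = np.ravel_multi_index(tuple(cell[k] for k in keep), keep_shape)
        M[row, j] = 1.0
    return M


def tikhonov_solve(A, b, lam):
    """Minimize ||Ax - b||^2 + lam * ||x||^2 via the regularized normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)


def sanitize_and_synthesize(data, shape, marginals, epsilon, lam, rng):
    """Hypothetical pipeline: noisy marginals -> regularized solve -> sampled records."""
    n = data.shape[0]  # assumption: sample size treated as public
    eps_each = epsilon / len(marginals)  # assumption: budget split evenly
    blocks, targets = [], []
    for keep in marginals:
        keep_shape = tuple(shape[k] for k in keep)
        flat = np.ravel_multi_index(tuple(data[:, k] for k in keep), keep_shape)
        counts = np.bincount(flat, minlength=int(np.prod(keep_shape))).astype(float)
        # Laplace mechanism; scale 1/eps assumes each histogram has sensitivity 1.
        noisy = counts + rng.laplace(scale=1.0 / eps_each, size=counts.shape)
        blocks.append(marginal_matrix(shape, keep))
        targets.append(noisy / n)
    A = np.vstack(blocks)
    b = np.concatenate(targets)
    x = tikhonov_solve(A, b, lam)
    # Assumed post-processing: clip negatives and renormalize to a distribution.
    p = np.clip(x, 0.0, None)
    p = p / p.sum()
    cells = rng.choice(len(p), size=n, p=p)
    synthetic = np.column_stack(np.unravel_index(cells, shape))
    return p, synthetic


rng = np.random.default_rng(0)
shape = (2, 2, 2)  # three binary attributes
data = rng.integers(0, 2, size=(500, 3))
# Two 2-way histograms sharing attribute 1; their noisy versions need not agree on it.
p, synthetic = sanitize_and_synthesize(
    data, shape, marginals=[(0, 1), (1, 2)], epsilon=1.0, lam=1e-3, rng=rng
)
print(p.reshape(shape))
print(synthetic[:5])
```

Because both noisy histograms enter a single least-squares system, the solution has one implied marginal for the shared attribute, which is one way to reconcile the inconsistency the abstract mentions. The regularization term also keeps the system solvable when the marginal constraints alone do not determine the joint distribution.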
