Simulation-based, Finite-sample Inference for Privatized Data
Privacy protection methods, such as differentially private mechanisms, introduce noise into resulting statistics which often results in complex and intractable sampling distributions. In this paper, we propose to use the simulation-based "repro sample" approach to produce statistically valid confidence intervals and hypothesis tests based on privatized statistics. We show that this methodology is applicable to a wide variety of private inference problems, appropriately accounts for biases introduced by privacy mechanisms (such as by clamping), and improves over other state-of-the-art inference methods such as the parametric bootstrap in terms of the coverage and type I error of the private inference. We also develop significant improvements and extensions for the repro sample methodology for general models (not necessarily related to privacy), including 1) modifying the procedure to ensure guaranteed coverage and type I errors, even accounting for Monte Carlo error, and 2) proposing efficient numerical algorithms to implement the confidence intervals and p-values.
READ FULL TEXT