Mechanisms for Global Differential Privacy under Bayesian Data Synthesis

05/10/2022
by Jingchen Hu, et al.

This paper introduces a new method that takes any Bayesian model used to generate synthetic data and converts it into a differentially private (DP) mechanism. We propose altering the model synthesizer to use a censored likelihood whose values are bounded to the interval [exp(-ϵ / 2), exp(ϵ / 2)], where ϵ denotes the level of the DP guarantee. This censoring mechanism, which carries an ϵ-DP guarantee, distorts the joint parameter posterior distribution by flattening it or shifting it towards a weakly informative prior. To minimize the distortion induced by likelihood censoring, we embed a vector-weighted pseudo posterior mechanism within the censoring mechanism. The pseudo posterior is formulated by selectively downweighting each likelihood contribution in proportion to its disclosure risk. On its own, the pseudo posterior mechanism produces a weaker, asymptotic differential privacy (aDP) guarantee. Once it is embedded in the censoring mechanism, the DP guarantee becomes strict and no longer relies on asymptotics. We demonstrate that the pseudo posterior mechanism alone creates synthetic data with the highest utility at the price of the weaker aDP guarantee, while embedding it in the proposed censoring mechanism produces synthetic data with a stronger, non-asymptotic DP guarantee at the cost of slightly reduced utility. The perturbed histogram mechanism is included for comparison.
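
The arithmetic behind the censoring bound can be illustrated with a small sketch. It is not the authors' implementation: it assumes a univariate normal likelihood, takes the per-record weights as given (the paper derives them from disclosure risk, which is not reproduced here), and assumes the censoring is applied to each weighted log-likelihood contribution. The function names are hypothetical. Bounding a likelihood contribution to [exp(-ϵ / 2), exp(ϵ / 2)] is the same as clipping its logarithm to [-ϵ / 2, ϵ / 2].

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a Normal(mu, sigma^2) distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def censored_weighted_loglik(x, mu, sigma, weight, epsilon):
    """One record's weighted log-likelihood, clipped to [-epsilon/2, epsilon/2].

    Assumption: the weight (in [0, 1]) is applied first and the result is
    then censored; the weight itself is supplied by the caller.
    """
    bound = epsilon / 2.0
    weighted = weight * normal_logpdf(x, mu, sigma)
    return max(-bound, min(bound, weighted))


def censored_log_pseudo_likelihood(data, weights, mu, sigma, epsilon):
    """Sum of censored, weighted contributions over all records."""
    return sum(
        censored_weighted_loglik(x, mu, sigma, w, epsilon)
        for x, w in zip(data, weights)
    )


if __name__ == "__main__":
    data = [0.1, -0.4, 10.0]     # the last record is an outlier
    weights = [1.0, 1.0, 0.005]  # hypothetical: the outlier is downweighted
    print(censored_log_pseudo_likelihood(data, weights, mu=0.0, sigma=1.0, epsilon=1.0))
```

In this sketch an unweighted outlier would be clipped hard at -ϵ / 2, whereas a downweighted one can fall inside the interval and escape censoring. This mirrors the abstract's motivation for embedding the weighting inside the censoring mechanism to limit distortion.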

