Differentially Private Diffusion Models

10/18/2022
by Tim Dockhorn, et al.

While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained with differential privacy (DP) on sensitive data can sidestep this challenge by providing access to synthetic data instead. However, training DP generative models is highly challenging due to the noise injected into training to enforce DP. We propose to leverage diffusion models (DMs), an emerging class of deep generative models, and introduce Differentially Private Diffusion Models (DPDMs), which enforce privacy using differentially private stochastic gradient descent (DP-SGD). We motivate why DP-SGD is well suited for training DPDMs, and thoroughly investigate the DM parameterization and the sampling algorithm, which turn out to be crucial ingredients in DPDMs. Furthermore, we propose noise multiplicity, a simple yet powerful modification of the DM training objective tailored to the DP setting to boost performance. We validate our novel DPDMs on widely-used image generation benchmarks and achieve state-of-the-art (SOTA) performance by large margins. For example, on MNIST we improve the SOTA FID from 48.4 to 5.01 and downstream classification accuracy from 83.2% to 98.1% for the privacy setting DP-(ε=10, δ=10^-5). Moreover, on standard benchmarks, classifiers trained on DPDM-generated synthetic data perform on par with task-specific DP-SGD-trained classifiers, which has not been demonstrated before for DP generative models. Project page and code: https://nv-tlabs.github.io/DPDM.
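The abstract names two mechanisms: DP-SGD, which clips each per-example gradient and adds Gaussian noise to the summed update, and noise multiplicity, which averages the diffusion loss over several diffusion-noise draws for the same data point to reduce gradient variance at no extra privacy cost (all draws share one example, so the DP sensitivity is unchanged). A minimal NumPy sketch of these two ideas, assuming a toy stand-in gradient (`per_example_grad` and all parameter values are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def per_example_grad(x, params, noise_multiplicity=4):
    """Toy stand-in for a diffusion-loss gradient. Noise multiplicity:
    average over several diffusion-noise draws for the SAME example."""
    draws = [params - x + rng.normal(0.0, 0.1, size=params.shape)
             for _ in range(noise_multiplicity)]
    return np.mean(draws, axis=0)

def dp_sgd_step(params, batch, clip_norm=1.0, noise_mult=1.0, lr=0.1):
    """One DP-SGD step: per-example clipping, then Gaussian noise on the sum."""
    grads = np.stack([per_example_grad(x, params) for x in batch])
    # Clip each per-example gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(grads.reshape(len(batch), -1), axis=1)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale[:, None]
    # Add Gaussian noise calibrated to the clipping norm (the DP mechanism).
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm, size=params.shape)
    return params - lr * noisy_sum / len(batch)

params = np.zeros(3)
batch = rng.normal(size=(8, 3))
new_params = dp_sgd_step(params, batch)
```

Because clipping bounds each example's contribution, averaging over more noise draws per example tightens the gradient estimate without consuming additional privacy budget, which is why noise multiplicity is free in DP terms.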



Related research:

- Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence (11/01/2021)
- Differentially Private Data Generation Needs Better Features (05/25/2022)
- Understanding how Differentially Private Generative Models Spend their Privacy Budget (05/18/2023)
- Harnessing large-language models to generate private synthetic text (06/02/2023)
- Improving Adversarial Robustness by Contrastive Guided Diffusion Process (10/18/2022)
- PEARL: Data Synthesis via Private Embeddings and Adversarial Reconstruction Learning (06/08/2021)
- Network Generation with Differential Privacy (11/17/2021)
