Extracting Training Data from Diffusion Models

01/30/2023
by Nicholas Carlini, et al.

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models, ranging from photographs of individual people to trademarked company logos. We also train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy. Overall, our results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.
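
The abstract names a generate-and-filter pipeline but does not describe it. The following is a minimal sketch of one way such a pipeline could look, not the authors' implementation. It assumes that a memorized image tends to be regenerated almost identically across random seeds, so a sample with several near-duplicates among the generations is flagged as a candidate. The names `generate_and_filter` and `toy_sampler`, the RMS pixel distance, and the threshold values are all hypothetical choices made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence

# Assumption: an "image" is a flat sequence of pixel values in [0, 1].
Image = Sequence[float]


def rms_distance(a: Image, b: Image) -> float:
    """Root-mean-square pixel difference between two equally sized images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def generate_and_filter(
    sample: Callable[[int], Image],
    num_samples: int,
    threshold: float,
    min_neighbors: int,
) -> List[int]:
    """Generate num_samples images, then keep the indices of those that have
    at least min_neighbors other generations within `threshold` distance."""
    # Generate step: one image per seed.
    images = [sample(seed) for seed in range(num_samples)]

    # Filter step: count near-duplicates of each image among the others.
    flagged = []
    for i, img in enumerate(images):
        neighbors = sum(
            1
            for j, other in enumerate(images)
            if j != i and rms_distance(img, other) < threshold
        )
        if neighbors >= min_neighbors:
            flagged.append(i)
    return flagged


def toy_sampler(seed: int) -> Image:
    """Hypothetical stand-in for a diffusion model: every fourth seed emits a
    slightly perturbed copy of one fixed image, the rest emit random pixels."""
    rng = random.Random(seed)
    if seed % 4 == 0:
        return [0.5 + rng.uniform(-0.01, 0.01) for _ in range(64)]
    return [rng.random() for _ in range(64)]


if __name__ == "__main__":
    candidates = generate_and_filter(
        toy_sampler, num_samples=20, threshold=0.05, min_neighbors=3
    )
    print("candidate memorized samples:", candidates)
```

In this toy setting only the seeds that reproduce the fixed image are flagged. With a real model, the sampler would wrap the model's generation call, and flagged candidates would still need to be checked against the training set.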


research
02/27/2023

Differentially Private Diffusion Models Generate Useful Synthetic Images

The ability to generate privacy-preserving synthetic versions of sensiti...
research
06/03/2023

Training Data Attribution for Diffusion Models

Diffusion models have become increasingly popular for synthesizing high-...
research
03/04/2023

Diffusion Models Generate Images Like Painters: an Analytical Theory of Outline First, Details Later

How do diffusion generative models convert pure noise into meaningful im...
research
03/29/2023

HoloDiffusion: Training a 3D Diffusion Model using 2D Images

Diffusion models have emerged as the best approach for generative modeli...
research
08/22/2023

Hey That's Mine Imperceptible Watermarks are Preserved in Diffusion Generated Outputs

Generative models have seen an explosion in popularity with the release ...
research
07/08/2023

Measuring the Success of Diffusion Models at Imitating Human Artists

Modern diffusion models have set the state-of-the-art in AI image genera...
research
01/31/2023

Learning Data Representations with Joint Diffusion Models

We introduce a joint diffusion model that simultaneously learns meaningf...
