Measuring the Success of Diffusion Models at Imitating Human Artists

by Stephen Casper, et al.

Modern diffusion models have set the state of the art in AI image generation. Their success is due, in part, to training on Internet-scale data which often includes copyrighted work. This prompts questions about the extent to which these models learn from, imitate, or copy the work of human artists. This work suggests that tying copyright liability to the capabilities of the model may be useful given the evolving ecosystem of generative models. Specifically, much of the legal analysis of copyright and generative systems focuses on the use of protected data for training. As a result, the connections between data, training, and the system are often obscured. In our approach, we consider simple image classification techniques to measure a model's ability to imitate specific artists. Specifically, we use Contrastive Language-Image Pretrained (CLIP) encoders to classify images in a zero-shot fashion. Our process first prompts a model to imitate a specific artist. Then, we test whether CLIP can be used to reclassify the artist (or the artist's work) from the imitation. If these tests match the imitation back to the original artist, this suggests the model can imitate that artist's expression. Our approach is simple and quantitative. Furthermore, it uses standard techniques and does not require additional training. We demonstrate our approach with an audit of Stable Diffusion's capacity to imitate 70 professional digital artists with copyrighted work online. When Stable Diffusion is prompted to imitate an artist from this set, we find that the artist can be identified from the imitation with an average accuracy of 81.0%. Finally, we also show that a sample of the artist's work can be matched to these imitation images with a high degree of statistical reliability. Overall, these results suggest that Stable Diffusion is broadly successful at imitating individual human artists.
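The zero-shot reclassification step described in the abstract can be illustrated with a minimal sketch. It assumes image and artist-name embeddings have already been produced by a CLIP encoder; here they are replaced by small hand-written toy vectors, and the function names (`zero_shot_classify`, `identification_accuracy`) and artist labels are hypothetical, not taken from the paper. The sketch shows only the general idea: assign each imitation to the artist whose label embedding is most similar, then report how often that matches the artist named in the prompt.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding: Vector,
                       label_embeddings: Dict[str, Vector]) -> str:
    """Return the label whose embedding is most similar to the image's."""
    return max(
        label_embeddings,
        key=lambda name: cosine_similarity(image_embedding, label_embeddings[name]),
    )


def identification_accuracy(imitations: List[Tuple[str, Vector]],
                            label_embeddings: Dict[str, Vector]) -> float:
    """Fraction of imitations classified back to the prompted artist."""
    correct = sum(
        1
        for prompted_artist, embedding in imitations
        if zero_shot_classify(embedding, label_embeddings) == prompted_artist
    )
    return correct / len(imitations)


# Toy stand-ins for CLIP text embeddings of artist names (assumption).
label_embeddings = {
    "Artist A": [1.0, 0.0, 0.0],
    "Artist B": [0.0, 1.0, 0.0],
    "Artist C": [0.0, 0.0, 1.0],
}

# Toy stand-ins for CLIP image embeddings of generated imitations,
# paired with the artist named in the generation prompt (assumption).
imitations = [
    ("Artist A", [0.9, 0.1, 0.0]),
    ("Artist B", [0.2, 0.8, 0.1]),
    ("Artist C", [0.6, 0.1, 0.3]),  # misclassified as Artist A
    ("Artist A", [0.7, 0.2, 0.1]),
]

accuracy = identification_accuracy(imitations, label_embeddings)
print(f"identification accuracy: {accuracy:.2f}")
```

In a real audit the toy vectors would be replaced by embeddings from an actual CLIP model, and the label set would cover all candidate artists rather than three.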




