Directed Diffusion: Direct Control of Object Placement through Attention Guidance

02/25/2023
by Wan-Duo Kurt Ma, et al.

Text-guided diffusion models such as DALLE-2, IMAGEN, and Stable Diffusion are able to generate an effectively endless variety of images given only a short text prompt describing the desired image content. In many cases the images are very high quality as well. However, these models often struggle to compose scenes containing several key objects such as characters in specified positional relationships. Yet this capability to “direct” the placement of characters and objects both within and across images is crucial in storytelling, as recognized in the literature on film and animation theory. In this work we take a particularly straightforward approach to providing the needed direction, by injecting “activation” at desired positions in the cross-attention maps corresponding to the objects under control, while attenuating the remainder of the map. The resulting approach is a step toward generalizing the applicability of text-guided diffusion models beyond single images to collections of related images, as in storybooks. To the best of our knowledge, our Directed Diffusion method is the first diffusion technique that provides positional control over multiple objects, while making use of an existing pre-trained model and maintaining a coherent blend between the positioned objects and the background. Moreover, it requires only a few lines of code to implement.
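
The core idea in the abstract, strengthening a token's cross-attention inside a target region and attenuating it elsewhere, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `direct_attention`, the `[row][col][token]` layout, the rectangular `box`, and the `boost` and `damp` parameters are all assumptions made for illustration, and the abstract does not specify the shape of the injected activation or at which denoising steps it is applied.

```python
from typing import List, Tuple

# Hypothetical layout: attn[row][col][token] holds one cross-attention weight.
AttentionMap = List[List[List[float]]]


def direct_attention(
    attn: AttentionMap,
    token_idx: int,
    box: Tuple[int, int, int, int],
    boost: float = 1.0,
    damp: float = 0.1,
) -> AttentionMap:
    """Return a copy of attn with one token's map steered toward box.

    box is (top, left, bottom, right) in map coordinates, with bottom and
    right exclusive. boost and damp are illustrative values, not taken
    from the paper.
    """
    top, left, bottom, right = box
    # Copy every cell so the caller's attention map is left untouched.
    out = [[list(cell) for cell in row] for row in attn]
    for y, row in enumerate(out):
        for x, cell in enumerate(row):
            inside = top <= y < bottom and left <= x < right
            if inside:
                cell[token_idx] += boost  # inject activation in the region
            else:
                cell[token_idx] *= damp   # attenuate the rest of the map
    return out


# Usage: a uniform 4x4 map over 3 prompt tokens; place token 1 top-left.
uniform = [[[1.0, 1.0, 1.0] for _ in range(4)] for _ in range(4)]
steered = direct_attention(uniform, token_idx=1, box=(0, 0, 2, 2))
```

Only the selected token's channel changes; maps for the other prompt tokens are passed through unmodified. In a real pipeline a function like this would presumably be applied to the model's cross-attention layers during sampling, which this sketch leaves out.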


