Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

11/20/2022
by   Xichen Pan, et al.
0

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity. Recently, most works focus on synthesizing independent images; While for real-world applications, it is common and necessary to generate a series of coherent images for story-stelling. In this work, we mainly focus on story visualization and continuation tasks and propose AR-LDM, a latent diffusion model auto-regressively conditioned on history captions and generated images. Moreover, AR-LDM can generalize to new characters through adaptation. To our best knowledge, this is the first work successfully leveraging diffusion models for coherent visual story synthesizing. Quantitative results show that AR-LDM achieves SoTA FID scores on PororoSV, FlintstonesSV, and the newly introduced challenging dataset VIST containing natural images. Large-scale human evaluations show that AR-LDM has superior performance in terms of quality, relevance, and consistency.

READ FULL TEXT

page 12

page 13

page 14

page 17

page 18

page 20

page 22

page 24

research
02/08/2023

Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models

Recent advancements in large scale text-to-image models have opened new ...
research
09/18/2023

Causal-Story: Local Causal Attention Utilizing Parameter-Efficient Tuning For Visual Story Synthesis

The excellent text-to-image synthesis capability of diffusion models has...
research
05/26/2023

Improved Visual Story Generation with Adaptive Context Modeling

Diffusion models developed on top of powerful text-to-image generation m...
research
05/22/2023

Album Storytelling with Iterative Story-aware Captioning and Large Language Models

This work studies how to transform an album to vivid and coherent storie...
research
05/20/2021

Improving Generation and Evaluation of Visual Stories via Semantic Consistency

Story visualization is an under-explored task that falls at the intersec...
research
11/23/2022

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

There has been a recent explosion of impressive generative models that c...
research
04/30/2020

PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking

We propose the task of outline-conditioned story generation: given an ou...

Please sign up or login with your details

Forgot password? Click here to reset