Sequential Attention GAN for Interactive Image Editing via Dialogue

12/20/2018
by Yu Cheng, et al.

In this paper, we introduce a new task, interactive image editing via conversational language, in which users guide an agent to edit images through multi-turn dialogue in natural language. In each dialogue turn, the agent takes a source image and a natural language description from the user as input and generates a target image that follows the textual description. Two new datasets, Zap-Seq and DeepFashion-Seq, are created for this task and collected via crowdsourcing. For this task, we propose a new Sequential Attention Generative Adversarial Network (SeqAttnGAN) framework, which applies a neural state tracker to encode both the source image and the textual descriptions, and generates high-quality images in each dialogue turn. To achieve better region-specific text-to-image generation, we also introduce an attention mechanism into the model. Experiments on the two datasets, including quantitative evaluation and a user study, show that our model outperforms state-of-the-art approaches in both image quality and text-to-image consistency.
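The per-turn flow described in the abstract (a state tracker that encodes the current image and the user's text, plus attention that lets each image region attend to the words) can be illustrated with a toy NumPy sketch. This is not the authors' SeqAttnGAN implementation: the function names (`word_region_attention`, `update_state`, `run_dialogue`), the dot-product attention, the tanh recurrent update, the feature dimensions, and the stand-in "generator" step are all assumptions chosen only to show how the pieces might fit together across dialogue turns.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def word_region_attention(regions, words):
    """Dot-product attention (an assumption, not taken from the paper).

    regions: (R, D) image region features
    words:   (T, D) word features for the current turn
    Returns a per-region text context (R, D) and attention weights (R, T).
    """
    scores = regions @ words.T            # (R, T) region-word similarity
    weights = softmax(scores, axis=1)     # each region's weights sum to 1
    context = weights @ words             # (R, D) weighted sum of word features
    return context, weights


def update_state(state, image_vec, text_vec, W):
    # Hypothetical stand-in for the neural state tracker:
    # a single tanh recurrent update over [state, image, text].
    x = np.concatenate([state, image_vec, text_vec])
    return np.tanh(W @ x)


def run_dialogue(turns, num_regions=4, dim=8, seed=0):
    """Run a toy multi-turn editing loop over a list of (T, dim) word arrays."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(dim, 3 * dim))
    state = np.zeros(dim)
    # Random features stand in for an encoded source image.
    regions = rng.normal(size=(num_regions, dim))
    history = []
    for words in turns:
        text_vec = words.mean(axis=0)     # crude sentence summary
        image_vec = regions.mean(axis=0)  # crude image summary
        state = update_state(state, image_vec, text_vec, W)
        context, weights = word_region_attention(regions, words)
        # Toy "generator": the edited image features become the next source.
        regions = np.tanh(regions + context + state)
        history.append(weights)
    return regions, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    turns = [rng.normal(size=(5, 8)), rng.normal(size=(3, 8))]
    final_regions, attn = run_dialogue(turns)
    print(final_regions.shape, [w.shape for w in attn])
```

In a real model the random features would be replaced by learned image and text encoders, and the last step of each turn by a trained generator with an adversarial loss; the sketch only shows how the output of one turn can serve as the source image for the next.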


