A Prompt Log Analysis of Text-to-Image Generation Systems

by Yutong Xie, et al.

Recent developments in large language models (LLMs) and generative AI have unleashed the astonishing capabilities of text-to-image generation systems to synthesize high-quality images that are faithful to a given reference text, known as a "prompt". These systems have immediately received a great deal of attention from researchers, creators, and everyday users. Despite the many efforts to improve the generative models, there is limited work on understanding the information needs of the users of these systems at scale. We conduct the first comprehensive analysis of large-scale prompt logs collected from multiple text-to-image generation systems. Our work is analogous to analyzing the query logs of Web search engines, a line of work that has made critical contributions to the success of the Web search industry and research. Compared with Web search queries, text-to-image prompts are significantly longer, are often organized into special structures that consist of the subject, form, and intent of the generation tasks, and present unique categories of information needs. Users make more edits within creation sessions, which exhibit remarkable exploratory patterns. There is also a considerable gap between the user-input prompts and the captions of the images included in the open training data of the generative models. Our findings provide concrete implications for how to improve text-to-image generation systems for creation purposes.


