Layout Generation and Completion with Self-attention

by Kamal Gupta, et al.

We address the problem of layout generation for diverse domains such as images, documents, and mobile applications. A layout is a set of graphical elements, belonging to one or more categories, placed together in a meaningful way. Generating a new layout or extending an existing one requires understanding the relationships between these graphical elements. To do this, we propose a novel framework, LayoutTransformer, that uses self-attention to learn contextual relationships between layout elements and generate layouts in a given domain. The proposed model improves upon state-of-the-art approaches to layout generation in four ways. First, our model can either generate a new layout from an empty set or add more elements to a partial layout, starting from an initial set of elements. Second, because the approach is attention-based, we can visualize which previous elements the model attends to when predicting the next element, thereby providing an interpretable sequence of layout elements. Third, our model scales easily to support both a large number of element categories and a large number of elements per layout. Finally, the model also produces an embedding for the element categories, which can be used to explore the relationships between categories. We demonstrate experimentally that our model can produce meaningful layouts in diverse settings such as object bounding boxes in scenes (COCO bounding boxes), documents (PubLayNet), and mobile applications (RICO dataset).
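The interpretability claim above rests on a general property of causal self-attention: each element can only attend to the elements that precede it (and itself), so the attention weights can be read as a record of which earlier elements influenced a prediction. The sketch below illustrates that property only. It is not the authors' implementation; the function name `causal_attention_weights` and the toy two-dimensional embeddings are hypothetical, and a real model would use learned query and key projections.

```python
import math

def causal_attention_weights(queries, keys):
    """Return a matrix whose row i holds the softmax attention of
    element i over elements 0..i; later elements get weight 0."""
    dim = len(keys[0])
    weights = []
    for i, q in enumerate(queries):
        # Scaled dot-product scores against visible (non-future) elements only.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Pad with zeros for the masked (future) positions.
        row += [0.0] * (len(keys) - i - 1)
        weights.append(row)
    return weights

# Hypothetical toy embeddings for three layout elements (assumption,
# not taken from the paper).
elements = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = causal_attention_weights(elements, elements)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row can be plotted as a heat map over the preceding elements, which is the kind of visualization the abstract refers to.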




