AttentionRNN: A Structured Spatial Attention Mechanism

05/22/2019
by   Siddhesh Khandelwal, et al.

Visual attention mechanisms have become an important component of many modern deep neural architectures. They provide an efficient and effective way to use visual information selectively, which has been shown to be especially valuable in multi-modal learning tasks. However, all prior attention frameworks lack the ability to explicitly model structural dependencies among attention variables, making it difficult to predict consistent attention masks. In this paper we develop a novel structured spatial attention mechanism that is end-to-end trainable and can be integrated with any feed-forward convolutional neural network. The proposed AttentionRNN layer explicitly enforces structure over the spatial attention variables by sequentially predicting attention values in the spatial mask in a bi-directional raster-scan and inverse raster-scan order. As a result, each attention value depends not only on local image or contextual information, but also on the previously predicted attention values. Our experiments show consistent quantitative and qualitative improvements on a variety of recognition tasks and datasets, including image categorization, question answering, and image generation.
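
To make the sequential prediction idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a scalar hidden state per location, a plain tanh recurrence over the top/left neighbours (raster scan) and bottom/right neighbours (inverse raster scan), and a sigmoid that merges the two scans. The names `scan` and `structured_attention`, the weight layout, and all numeric values are hypothetical choices made only for this example.

```python
import math

def scan(features, w_x, w_row, w_col, reverse=False):
    """Scalar 2D recurrence over an H x W grid in raster or inverse raster order.

    Assumption for illustration: one scalar hidden value per location,
    computed from the local features and the two neighbours already visited.
    """
    height, width = len(features), len(features[0])
    hidden = [[0.0] * width for _ in range(height)]
    step = 1 if reverse else -1  # offset to neighbours already visited in this scan
    rows = range(height - 1, -1, -1) if reverse else range(height)
    cols = range(width - 1, -1, -1) if reverse else range(width)
    for i in rows:
        for j in cols:
            # Local evidence from the feature vector at (i, j).
            pre = sum(w * x for w, x in zip(w_x, features[i][j]))
            # Contributions from previously predicted neighbours, if they exist.
            if 0 <= i + step < height:
                pre += w_row * hidden[i + step][j]
            if 0 <= j + step < width:
                pre += w_col * hidden[i][j + step]
            hidden[i][j] = math.tanh(pre)
    return hidden

def structured_attention(features, fwd, bwd, out):
    """Combine both scans into one attention value in (0, 1) per location."""
    h_fwd = scan(features, *fwd)
    h_bwd = scan(features, *bwd, reverse=True)
    v_fwd, v_bwd, bias = out
    return [[1.0 / (1.0 + math.exp(-(v_fwd * f + v_bwd * b + bias)))
             for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(h_fwd, h_bwd)]

# Hypothetical 3 x 4 feature map with 2 channels and hand-picked weights.
features = [[[0.1 * (i + j), 0.2 * (i - j)] for j in range(4)] for i in range(3)]
fwd = ([0.5, -0.3], 0.8, 0.6)   # (w_x, w_row, w_col) for the raster scan
bwd = ([-0.4, 0.7], 0.5, 0.9)   # (w_x, w_row, w_col) for the inverse raster scan
mask = structured_attention(features, fwd, bwd, out=(1.0, 1.0, 0.0))
```

Because each scan feeds earlier outputs into later ones, changing the features at one corner of the grid can alter the attention value at the opposite corner. With the recurrent weights set to zero, each attention value would depend only on its local features.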

research
05/14/2018

Deep Attentional Structured Representation Learning for Visual Recognition

Structured representations, such as Bags of Words, VLAD and Fisher Vecto...
research
12/25/2018

Attention Branch Network: Learning of Attention Mechanism for Visual Explanation

Visual explanation enables human to understand the decision making of De...
research
03/25/2018

Pay More Attention - Neural Architectures for Question-Answering

Machine comprehension is a representative task of natural language under...
research
06/01/2020

Multimodal grid features and cell pointers for Scene Text Visual Question Answering

This paper presents a new model for the task of scene text visual questi...
research
02/19/2022

HDAM: Heuristic Difference Attention Module for Convolutional Neural Networks

The attention mechanism is one of the most important priori knowledge to...
research
02/03/2017

Structured Attention Networks

Attention networks have proven to be an effective approach for embedding...
research
07/25/2021

Improving Robot Localisation by Ignoring Visual Distraction

Attention is an important component of modern deep learning. However, le...
