Inferring and Executing Programs for Visual Reasoning

05/10/2017
by Justin Johnson, et al.

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning. Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer. Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.
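
The split between a program generator and an execution engine can be illustrated with a small symbolic sketch. This is not the authors' implementation: in the paper both components are neural networks, whereas here a lookup table stands in for the program generator and hand-written functions stand in for the learned modules. The module names, the prefix-order program encoding, and the attribute-dictionary scene format are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Toy scene: each object is a dict of attributes (a stand-in for image features).
Scene = List[Dict[str, str]]


def scene_module(scene: Scene, inputs: list) -> Scene:
    """Leaf module: returns every object in the scene."""
    return list(scene)


def make_filter(attr: str, value: str) -> Callable[[Scene, list], Scene]:
    """Builds a unary module that keeps objects whose `attr` equals `value`."""
    def module(scene: Scene, inputs: list) -> Scene:
        return [obj for obj in inputs[0] if obj[attr] == value]
    return module


def count_module(scene: Scene, inputs: list) -> int:
    """Unary module: number of objects in its input set."""
    return len(inputs[0])


# Hypothetical module library: name -> (arity, function).
MODULES: Dict[str, Tuple[int, Callable]] = {
    "scene": (0, scene_module),
    "filter_color[red]": (1, make_filter("color", "red")),
    "filter_shape[cube]": (1, make_filter("shape", "cube")),
    "count": (1, count_module),
}


def generate_program(question: str) -> List[str]:
    """Stand-in for the learned program generator (here just a lookup table).

    Returns the program as a prefix-order list of module names.
    """
    table = {
        "How many red cubes are there?": [
            "count", "filter_shape[cube]", "filter_color[red]", "scene",
        ],
    }
    return table[question]


def execute(program: List[str], scene: Scene):
    """Stand-in for the execution engine: runs a prefix-order program."""
    pos = 0

    def run():
        nonlocal pos
        name = program[pos]
        pos += 1
        arity, fn = MODULES[name]
        inputs = [run() for _ in range(arity)]  # evaluate children first
        return fn(scene, inputs)

    return run()


toy_scene = [
    {"color": "red", "shape": "cube"},
    {"color": "red", "shape": "sphere"},
    {"color": "blue", "shape": "cube"},
    {"color": "red", "shape": "cube"},
]
program = generate_program("How many red cubes are there?")
answer = execute(program, toy_scene)
print(program, "->", answer)
```

The point of the sketch is that the program is an explicit, inspectable object: the reasoning steps can be read off from it before any answer is produced, which a direct input-to-output mapping does not allow.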

research
01/13/2018

Benchmark Visual Question Answer Models by using Focus Map

Inferring and Executing Programs for Visual Reasoning proposes a model f...
research
10/04/2018

Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding

We marry two powerful ideas: deep representation learning for visual rec...
research
04/06/2020

SHOP-VRB: A Visual Reasoning Benchmark for Object Perception

In this paper we present an approach and a benchmark for visual reasonin...
research
02/24/2022

Measuring CLEVRness: Blackbox testing of Visual Reasoning Models

How can we measure the reasoning capabilities of intelligence systems? V...
research
03/08/2018

Compositional Attention Networks for Machine Reasoning

We present the MAC network, a novel fully differentiable neural network ...
research
01/12/2018

Combining Symbolic and Function Evaluation Expressions In Neural Programs

Neural programming involves training neural networks to learn programs f...
research
12/05/2018

Explainable and Explicit Visual Reasoning over Scene Graphs

We aim to dismantle the prevalent black-box neural architectures used in...
