Procedural Reasoning Networks for Understanding Multimodal Procedures

09/19/2019
by Mustafa Sercan Amac, et al.

This paper addresses the problem of comprehending procedural commonsense knowledge. This is a challenging task, as it requires identifying key entities, keeping track of their state changes, and understanding temporal and causal relations. Unlike most previous work, this study does not rely on strong inductive biases and explores how multimodality can be exploited to provide a complementary semantic signal. To this end, we introduce a new entity-aware neural comprehension model augmented with external relational memory units. Our model learns to dynamically update entity states in relation to each other while reading the text instructions. Our experimental analysis on the visual reasoning tasks in the recently proposed RecipeQA dataset shows that our approach improves on the accuracy of previously reported models by a large margin. Moreover, we find that our model learns effective dynamic representations of entities even though we do not use any supervision at the level of entity states.
