Multimodal Explanations: Justifying Decisions and Pointing to the Evidence

02/15/2018
by Dong Huk Park et al.

Deep models that are both effective and explainable are desirable in many settings; prior explainable models have been unimodal, offering either image-based visualization of attention weights or text-based generation of post-hoc justifications. We propose a multimodal approach to explanation, and argue that the two modalities provide complementary explanatory strengths. We collect two new datasets to define and evaluate this task, and propose a novel model which can provide joint textual rationale generation and attention visualization. Our datasets define visual and textual justifications of a classification decision for activity recognition tasks (ACT-X) and for visual question answering tasks (VQA-X). We quantitatively show that training with the textual explanations not only yields better textual justification models, but also better localizes the evidence that supports the decision. We also qualitatively show cases where visual explanation is more insightful than textual explanation, and vice versa, supporting our thesis that multimodal explanation models offer significant benefits over unimodal approaches.
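The abstract describes a model that couples answer prediction over attended image regions with generation of a textual justification. The paper itself gives the architectural details; as a rough illustration only, the sketch below shows one way such a joint model could be wired up in PyTorch. It is not the authors' implementation: the module names, dimensions, single-layer attention, and the way the explanation decoder is conditioned on the attended features, question encoding, and predicted answer are all assumptions made for this example.

    # Minimal illustrative sketch (not the authors' code) of a multimodal
    # explanation model: an answer classifier with visual attention plus an
    # LSTM that generates a textual justification conditioned on the question,
    # the attended image features, and the predicted answer distribution.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultimodalExplainer(nn.Module):
        def __init__(self, vocab_size, num_answers, feat_dim=2048, hidden=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.q_rnn = nn.LSTM(hidden, hidden, batch_first=True)
            self.att = nn.Linear(feat_dim + hidden, 1)     # score per image region
            self.cls = nn.Linear(feat_dim + hidden, num_answers)
            self.expl_init = nn.Linear(feat_dim + hidden + num_answers, hidden)
            self.expl_rnn = nn.LSTM(hidden, hidden, batch_first=True)
            self.expl_out = nn.Linear(hidden, vocab_size)

        def forward(self, img_feats, question, expl_tokens):
            # img_feats: (B, R, feat_dim) region features
            # question, expl_tokens: (B, T) token ids
            _, (q_h, _) = self.q_rnn(self.embed(question))
            q = q_h[-1]                                    # (B, hidden) question encoding
            q_tiled = q.unsqueeze(1).expand(-1, img_feats.size(1), -1)
            alpha = F.softmax(
                self.att(torch.cat([img_feats, q_tiled], -1)).squeeze(-1), dim=1)
            v = (alpha.unsqueeze(-1) * img_feats).sum(1)   # attended visual vector
            answer_logits = self.cls(torch.cat([v, q], -1))

            # Condition the explanation decoder on image, question, and answer.
            init = torch.tanh(self.expl_init(
                torch.cat([v, q, F.softmax(answer_logits, -1)], -1)))
            h0 = init.unsqueeze(0)
            c0 = torch.zeros_like(h0)
            out, _ = self.expl_rnn(self.embed(expl_tokens), (h0, c0))
            expl_logits = self.expl_out(out)               # per-step word scores
            # alpha serves as the visual explanation; expl_logits as the textual one.
            return answer_logits, expl_logits, alpha

Under this sketch, training would combine a classification loss on answer_logits with a word-level cross-entropy loss on expl_logits, which mirrors the paper's finding that supervising the textual explanation also improves where the attention localizes the evidence.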


Related research:

- Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract), 11/17/2017
- A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations, 04/29/2021
- Attentive Explanations: Justifying Decisions and Pointing to the Evidence, 12/14/2016
- Visual Explanations from Hadamard Product in Multimodal Deep Networks, 12/18/2017
- Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks, 08/07/2019
- Multimodal and Explainable Internet Meme Classification, 12/11/2022
- Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach, 08/25/2021
