Attentive Explanations: Justifying Decisions and Pointing to the Evidence

by Dong Huk Park, et al.

Deep models are the de facto standard in visual decision problems due to their impressive performance on a wide array of visual tasks. However, they are frequently seen as opaque and are unable to explain their decisions. In contrast, humans can justify their decisions with natural language and point to the evidence in the visual world that led to their decisions. We postulate that deep models can do this as well and propose our Pointing and Justification (PJ-X) model, which can justify its decision with a sentence and point to the evidence by introspecting its decision and explanation process using an attention mechanism. Unfortunately, there is no dataset available with reference explanations for visual decision making. We thus collect two datasets in two domains where it is interesting and challenging to explain decisions. First, we extend the visual question answering task to provide not only an answer but also a natural language explanation for the answer. Second, we focus on explaining human activities, which is traditionally more challenging than object classification. We extensively evaluate our PJ-X model on both the justification and pointing tasks, comparing it to prior models and ablations using both automatic and human evaluations.




