Uncertainty-based Visual Question Answering: Estimating Semantic Inconsistency between Image and Knowledge Base

07/27/2022
by   Jinyeong Chae, et al.
0

Knowledge-based visual question answering (KVQA) task aims to answer questions that require additional external knowledge as well as an understanding of images and questions. Recent studies on KVQA inject an external knowledge in a multi-modal form, and as more knowledge is used, irrelevant information may be added and can confuse the question answering. In order to properly use the knowledge, this study proposes the following: 1) we introduce a novel semantic inconsistency measure computed from caption uncertainty and semantic similarity; 2) we suggest a new external knowledge assimilation method based on the semantic inconsistency measure and apply it to integrate explicit knowledge and implicit knowledge for KVQA; 3) the proposed method is evaluated with the OK-VQA dataset and achieves the state-of-the-art performance.

READ FULL TEXT

page 1

page 7

research
05/31/2019

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

Visual Question Answering (VQA) in its ideal form lets us study reasonin...
research
03/23/2021

Multi-Modal Answer Validation for Knowledge-Based VQA

The problem of knowledge-based visual question answering involves answer...
research
02/11/2023

Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis

Often, deep network models are purely inductive during training and whil...
research
03/09/2016

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

Much recent progress in Vision-to-Language problems has been achieved th...
research
04/08/2017

An Empirical Evaluation of Visual Question Answering for Novel Objects

We study the problem of answering questions about images in the harder s...
research
06/13/2023

AVIS: Autonomous Visual Information Seeking with Large Language Models

In this paper, we propose an autonomous information seeking visual quest...
research
06/22/2016

Semantic Parsing to Probabilistic Programs for Situated Question Answering

Situated question answering is the problem of answering questions about ...

Please sign up or login with your details

Forgot password? Click here to reset