Robustness Analysis of Visual QA Models by Basic Questions

by   Jia-Hong Huang, et al.

Visual Question Answering (VQA) models should have both high robustness and accuracy. Unfortunately, most of the current VQA research only focuses on accuracy because there is a lack of proper methods to measure the robustness of VQA models. There are two main modules in our algorithm. Given a natural language question about an image, the first module takes the question as input and then outputs the ranked basic questions, with similarity scores, of the main given question. The second module takes the main question, image and these basic questions as input and then outputs the text-based answer of the main question about the given image. We claim that a robust VQA model is one, whose performance is not changed much when related basic questions as also made available to it as input. We formulate the basic questions generation problem as a LASSO optimization, and also propose a large scale Basic Question Dataset (BQD) and Rscore (novel robustness measure), for analyzing the robustness of VQA models. We hope our BQD will be used as a benchmark for to evaluate the robustness of VQA models, so as to help the community build more robust and accurate VQA models.


page 1

page 2

page 3

page 4


VQABQ: Visual Question Answering by Basic Questions

Taking an image and question as the input of our method, it can output t...

A Novel Framework for Robustness Analysis of Visual QA Models

Deep neural networks have been playing an essential role in many compute...

Quantifying and Alleviating the Language Prior Problem in Visual Question Answering

Benefiting from the advancement of computer vision, natural language pro...

Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions

Deep learning algorithms have shown promising results in visual question...

Improving Visual Question Answering Models through Robustness Analysis and In-Context Learning with a Chain of Basic Questions

Deep neural networks have been critical in the task of Visual Question A...

Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation

While models for Visual Question Answering (VQA) have steadily improved ...

Robust Visual Question Answering: Datasets, Methods, and Future Challenges

Visual question answering requires a system to provide an accurate natur...

Please sign up or login with your details

Forgot password? Click here to reset