Improving Visual Question Answering Models through Robustness Analysis and In-Context Learning with a Chain of Basic Questions

04/06/2023
by Jia-Hong Huang, et al.

Deep neural networks have been critical to Visual Question Answering (VQA), and research has traditionally focused on improving model accuracy. Recently, however, attention has shifted towards evaluating the robustness of these models against adversarial attacks. This involves measuring the accuracy of VQA models under increasing levels of noise in the input, where the noise can target either the image or the query question, dubbed the main question. Proper analysis of this aspect of VQA is currently lacking. This work proposes a new method that uses semantically related questions, referred to as basic questions, as noise to evaluate the robustness of VQA models. The hypothesis is that the noise level increases as the similarity of a basic question to the main question decreases. To generate a reasonable noise level for a given main question, a pool of basic questions is ranked by similarity to the main question, and this ranking problem is cast as a LASSO optimization problem. Additionally, this work proposes a novel robustness measure, R_score, and two basic question datasets to standardize the analysis of VQA model robustness. The experimental results demonstrate that the proposed evaluation method effectively analyzes the robustness of VQA models. Moreover, the experiments show that in-context learning with a chain of basic questions can enhance model accuracy.
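The LASSO-based ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each question has already been encoded as a fixed-length vector (the abstract does not say how), and it uses one common LASSO form, minimizing 0.5 * ||A x - b||^2 + lam * ||x||_1, where the columns of A are basic-question vectors and b is the main-question vector. The names `lasso_rank`, `soft_threshold`, `lam`, and `n_iters` are hypothetical, and the solver is plain coordinate descent.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_rank(basic_vecs, main_vec, lam=0.1, n_iters=200):
    """Rank basic questions by their LASSO weight for a main question.

    basic_vecs: list of vectors, one per basic question (assumed embeddings).
    main_vec:   vector for the main question (assumed embedding).
    Returns (weights, order), where order lists indices from the largest
    absolute weight (treated here as most similar) to the smallest.
    """
    n = len(basic_vecs)
    weights = [0.0] * n
    residual = list(main_vec)  # equals b - A x while x is all zeros

    for _ in range(n_iters):
        for j, col in enumerate(basic_vecs):
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue  # an all-zero vector cannot explain anything
            # Correlation of column j with the residual that excludes j.
            rho = sum(c * (r + c * weights[j]) for c, r in zip(col, residual))
            new_w = soft_threshold(rho, lam) / norm_sq
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
            weights[j] = new_w

    order = sorted(range(n), key=lambda j: -abs(weights[j]))
    return weights, order


# Toy example with made-up 3-dimensional vectors (not real question embeddings).
toy_basic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_main = [0.8, 0.3, 0.05]
toy_weights, toy_order = lasso_rank(toy_basic, toy_main, lam=0.1)
print(toy_order, toy_weights)
```

Under the abstract's hypothesis, basic questions near the top of `toy_order` would act as low-level noise and those near the bottom as higher-level noise. The L1 penalty sets the weights of weakly related questions exactly to zero, which gives a natural cutoff for the pool.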

