Medical visual question answering using joint self-supervised learning

02/25/2023
by   Yuan Zhou, et al.

Visual Question Answering (VQA) has become one of the most active research problems in the medical imaging domain. A well-known challenge in VQA is the intrinsic difference between the image and text modalities. Medical VQA faces a further critical problem: labelled image-question-answer data is scarce. In this study we propose an encoder-decoder framework that learns a joint image-text representation from large-scale medical image-caption data and adapts it to the small medical VQA task. The encoder embeds both modalities jointly using a self-attention mechanism and is pre-trained on its own, without the decoder, on the large-scale medical image-caption dataset using multiple self-supervised learning tasks. The decoder is then attached on top of the encoder and fine-tuned on the small medical VQA dataset. Experimental results show that the proposed method outperforms the baseline and state-of-the-art (SOTA) methods.
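As a rough illustration of how an encoder can embed both modalities jointly with self-attention, the sketch below concatenates image-patch features and text-token features into one sequence and lets every position attend to every other. It is not the authors' implementation. The function names, the toy feature vectors, and the simplifications (a single head, identity query/key/value projections, no learned weights) are assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_self_attention(image_feats, text_feats):
    """Single-head self-attention over a concatenated image+text sequence.

    Assumption: queries, keys and values are the raw features themselves
    (identity projections); a real encoder would learn these projections.
    Returns the attended outputs and the attention weight matrix.
    """
    sequence = image_feats + text_feats  # one joint sequence for both modalities
    dim = len(sequence[0])
    scale = math.sqrt(dim)

    outputs, weights = [], []
    for query in sequence:
        # Scaled dot-product score of this position against every position.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in sequence]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all value vectors (image and text alike).
        outputs.append(
            [sum(row[j] * sequence[j][i] for j in range(len(sequence))) for i in range(dim)]
        )
    return outputs, weights


# Toy example: two image-patch vectors and one text-token vector (hypothetical values).
image_feats = [[1.0, 0.0], [0.0, 1.0]]
text_feats = [[1.0, 1.0]]
outputs, weights = joint_self_attention(image_feats, text_feats)
```

Because the text token and the image patches share one sequence, each row of `weights` mixes information across modalities, which is the property the joint encoder relies on.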

research
11/19/2021

Medical Visual Question Answering: A Survey

Medical Visual Question Answering (VQA) is a combination of medical arti...
research
04/03/2021

MMBERT: Multimodal BERT Pretraining for Improved Medical VQA

Images in the medical domain are fundamentally different from the genera...
research
05/17/2023

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering

In this paper, we focus on the problem of Medical Visual Question Answer...
research
07/27/2023

Med-Flamingo: a Multimodal Medical Few-shot Learner

Medicine, by its nature, is a multifaceted domain that requires the synt...
research
07/26/2022

LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

Visual question answering (VQA) often requires an understanding of visua...
research
09/26/2019

Overcoming Data Limitation in Medical Visual Question Answering

Traditional approaches for Visual Question Answering (VQA) require large...
research
02/28/2023

VQA with Cascade of Self- and Co-Attention Blocks

The use of complex attention modules has improved the performance of the...
