ResViT: Residual vision transformers for multi-modal medical image synthesis

by Onat Dalmaz, et al.

Multi-modal imaging is a key healthcare technology in the diagnosis and management of disease, but it is often underutilized because of the costs associated with multiple separate scans. This limitation creates a need to synthesize unacquired modalities from the subset of available modalities. In recent years, generative adversarial network (GAN) models, with their superior depiction of structural details, have been established as the state of the art in numerous medical image synthesis tasks. However, GANs are characteristically based on convolutional neural network (CNN) backbones that perform local processing with compact filters. This inductive bias, in turn, compromises the learning of long-range spatial dependencies. While attention maps incorporated in GANs can multiplicatively modulate CNN features to emphasize critical image regions, their capture of global context is mostly implicit. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, which combines the local precision of convolution operators with the contextual sensitivity of vision transformers. Based on an encoder-decoder architecture, ResViT employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine convolutional and transformer modules. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI and for synthesizing CT images from MRI. Our results indicate the superiority of ResViT over competing methods in terms of qualitative observations and quantitative metrics.
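The contrast the abstract draws between local convolutional processing and global transformer context can be illustrated with a toy, pure-Python sketch on a 1D signal. This is not the ResViT architecture or its ART block: the function names (`conv1d_same`, `self_attention`, `toy_residual_block`), the scalar tokens, and the unlearned attention weights are all assumptions made only for illustration.

```python
import math

def conv1d_same(x, kernel):
    """Local branch: each output sees only a small neighbourhood (zero padding)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out

def self_attention(x):
    """Global branch: each output is a softmax-weighted mix of ALL positions.
    Assumption: scalar tokens act as their own queries, keys, and values."""
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        out.append(sum(e / z * v for e, v in zip(exps, x)))
    return out

def toy_residual_block(x, kernel):
    """Hypothetical block: skip connection plus local and global branches."""
    local = conv1d_same(x, kernel)
    context = self_attention(x)
    return [xi + li + ci for xi, li, ci in zip(x, local, context)]

impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
smooth = [0.25, 0.5, 0.25]
local_out = conv1d_same(impulse, smooth)
global_out = self_attention(impulse)
block_out = toy_residual_block(impulse, smooth)
```

With an impulse in the middle of the signal, the convolution leaves the far ends at zero because its filter never reaches them, while the attention branch gives every position a nonzero response because it mixes all positions. That is the long-range dependency the abstract says compact CNN filters struggle to learn.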




