D-Former: A U-shaped Dilated Transformer for 3D Medical Image Segmentation

by Yixuan Wu, et al.

Computer-aided medical image segmentation has been applied widely in diagnosis and treatment to obtain clinically useful information about the shapes and volumes of target organs and tissues. In the past several years, convolutional neural network (CNN) based methods (e.g., U-Net) have dominated this area, but they still suffer from inadequate capture of long-range information. Hence, recent work has presented computer vision Transformer variants for medical image segmentation tasks and obtained promising performance. Such Transformers model long-range dependencies by computing pair-wise patch relations. However, they incur prohibitive computational costs, especially on 3D medical images (e.g., CT and MRI). In this paper, we propose a new method called Dilated Transformer, which conducts self-attention for pair-wise patch relations captured alternately in local and global scopes. Inspired by dilated convolution kernels, we conduct the global self-attention in a dilated manner, enlarging receptive fields without increasing the number of patches involved and thus reducing computational costs. Based on this design of Dilated Transformer, we construct a U-shaped encoder-decoder hierarchical architecture called D-Former for 3D medical image segmentation. Experiments on the Synapse and ACDC datasets show that our D-Former model, trained from scratch, outperforms various competitive CNN-based or Transformer-based segmentation models at a low computational cost, without a time-consuming pre-training process.
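The alternation between local and dilated attention scopes can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1D row of patches instead of a 3D volume, uses the patch features directly as queries, keys, and values (no learned projections), and the helper names `local_groups`, `dilated_groups`, and `grouped_self_attention` are hypothetical. It only shows that both groupings attend over the same number of patches per group while the dilated grouping spans a wider range.

```python
import numpy as np

def local_groups(n, g):
    """Split patch indices 0..n-1 into contiguous groups of size g."""
    return np.arange(n).reshape(n // g, g)

def dilated_groups(n, g):
    """Split patch indices into groups of size g whose members are n // g apart."""
    return np.arange(n).reshape(g, n // g).T

def grouped_self_attention(x, groups):
    """Softmax self-attention computed independently inside each group.

    x: (n, c) float array of patch features. Assumption for brevity:
    queries, keys and values are all x itself.
    """
    out = np.empty_like(x)
    scale = x.shape[1] ** -0.5
    for idx in groups:
        q = x[idx]                                   # (g, c) features of one group
        scores = q @ q.T * scale                     # (g, g) pair-wise patch relations
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        w = np.exp(scores)
        w /= w.sum(axis=1, keepdims=True)
        out[idx] = w @ q
    return out

n, g = 8, 4
x = np.random.default_rng(0).normal(size=(n, 16))
local = local_groups(n, g)      # [[0, 1, 2, 3], [4, 5, 6, 7]]
dilated = dilated_groups(n, g)  # [[0, 2, 4, 6], [1, 3, 5, 7]]
y = grouped_self_attention(grouped_self_attention(x, local), dilated)
```

In both cases each attention matrix is only `g x g`, so the cost per group is the same; the dilated groups simply cover indices that are farther apart, which is the sense in which the receptive field grows without involving more patches.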




Pyramid Medical Transformer for Medical Image Segmentation

Deep neural networks have been a prevailing technique in the field of me...

Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation

Recently, many attempts have been made to construct a transformer base U...

PAANet: Progressive Alternating Attention for Automatic Medical Image Segmentation

Medical image segmentation can provide detailed information for clinical...

Near Real-time Hippocampus Segmentation Using Patch-based Canonical Neural Network

Over the past decades, state-of-the-art medical image segmentation has h...

U-Netmer: U-Net meets Transformer for medical image segmentation

The combination of the U-Net based deep learning models and Transformer ...

SuperPatchMatch: an Algorithm for Robust Correspondences using Superpixel Patches

Superpixels have become very popular in many computer vision application...

The Lighter The Better: Rethinking Transformers in Medical Image Segmentation Through Adaptive Pruning

Vision transformers have recently set off a new wave in the field of med...
