MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical Image Segmentation

by Liang Xu, et al.

The UNet architecture, based on Convolutional Neural Networks (CNNs), has demonstrated remarkable performance in medical image analysis. However, it struggles to capture long-range dependencies because of the limited receptive fields and inherent bias of convolutional operations. Recently, numerous Transformer-based techniques have been incorporated into the UNet architecture to overcome this limitation by capturing global feature correlations. However, integrating Transformer modules may cause local contextual information to be lost during global feature fusion. To overcome these challenges, we propose a 2D medical image segmentation model called the Multi-scale Cross Perceptron Attention Network (MCPA). MCPA consists of three main components: an encoder, a decoder, and a Cross Perceptron. The Cross Perceptron first captures local correlations using multiple Multi-scale Cross Perceptron modules, facilitating the fusion of features across scales. The resulting multi-scale feature vectors are then spatially unfolded, concatenated, and fed through a Global Perceptron module to model global dependencies. Furthermore, we introduce a Progressive Dual-branch Structure to handle the semantic segmentation of images involving finer tissue structures. This structure gradually shifts the segmentation focus of MCPA training from large-scale structural features to more sophisticated pixel-level features. We evaluate MCPA on several publicly available medical image datasets covering different tasks and imaging devices, including CT (the large-scale Synapse dataset), MRI (ACDC), fundus camera (DRIVE, CHASE_DB1, HRF), and OCTA (ROSE). The experimental results show that MCPA achieves state-of-the-art performance. The code is available at
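The "spatially unfolded, concatenated, and fed through a Global Perceptron" step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the internals of the Global Perceptron, so the sketch substitutes plain scaled dot-product self-attention with identity projections as an assumed stand-in. The function names (`unfold`, `global_mixing`, `fuse_scales`) are hypothetical, and the sketch assumes all scales already share the same channel count.

```python
import math


def unfold(feature_map):
    """Flatten a C x H x W feature map (nested lists) into H*W tokens of length C."""
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [feature_map[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_mixing(tokens):
    """Assumed stand-in for the Global Perceptron: self-attention in which
    queries, keys, and values are the tokens themselves (no learned weights)."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        mixed.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return mixed


def fuse_scales(feature_maps):
    """Unfold every scale, concatenate the tokens, then mix them globally."""
    tokens = []
    for fmap in feature_maps:
        tokens.extend(unfold(fmap))
    return global_mixing(tokens)


# Toy input: a fine 2x2 map and a coarse 1x1 map, both with 2 channels.
fine = [[[0.0, 1.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]]
coarse = [[[1.0]], [[0.0]]]
fused = fuse_scales([fine, coarse])
print(len(fused), len(fused[0]))  # 5 2
```

Because every token attends to tokens from all scales, each output vector mixes information from both the fine and the coarse map. A real model would add learned projections and, most likely, a step that restores the spatial layout for the decoder.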




