MMFormer: Multimodal Transformer Using Multiscale Self-Attention for Remote Sensing Image Classification

03/23/2023
by Bo Zhang, et al.

To exploit the complementary information between heterogeneous data sources, we introduce a new Multimodal Transformer (MMFormer) for Remote Sensing (RS) image classification that uses a Hyperspectral Image (HSI) accompanied by another source of data such as Light Detection and Ranging (LiDAR). Because the traditional Vision Transformer (ViT) lacks the inductive biases of convolutions, we first introduce convolutional layers into MMFormer to tokenize patches from the multimodal HSI and LiDAR data. We then propose a Multi-scale Multi-head Self-Attention (MSMHSA) module to address the compatibility problem that often limits the fusion of HSI, which has high spectral resolution, with LiDAR, which has relatively low spatial resolution. The proposed MSMHSA module incorporates HSI into LiDAR data in a coarse-to-fine manner, enabling the model to learn a fine-grained representation. Extensive experiments on widely used benchmarks (e.g., Trento and MUUFL) demonstrate the effectiveness and superiority of the proposed MMFormer for RS image classification.
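
The coarse-to-fine fusion idea can be illustrated with a minimal sketch. This is not the authors' MSMHSA implementation: the abstract does not specify how scales are formed or which modality supplies queries, so the code below assumes (hypothetically) that LiDAR tokens act as queries, that HSI tokens act as keys and values, that each head sees the HSI tokens average-pooled at a different scale, and that head outputs are concatenated. Learned projections and the convolutional tokenizer are omitted, and all function names are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def avg_pool_tokens(tokens, scale):
    """Average consecutive groups of `scale` tokens (larger scale = coarser)."""
    groups = (tokens[i:i + scale] for i in range(0, len(tokens), scale))
    return [[sum(col) / len(g) for col in zip(*g)] for g in groups]


def attention(queries, keys, values):
    """Plain scaled dot-product attention without learned projections."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def multiscale_attention(lidar_tokens, hsi_tokens, scales=(4, 2, 1)):
    """One head per scale, ordered coarse to fine (an assumption of this sketch).

    Each head attends from LiDAR tokens to HSI tokens pooled at its scale;
    the per-token head outputs are then concatenated.
    """
    heads = []
    for s in scales:
        pooled = avg_pool_tokens(hsi_tokens, s)
        heads.append(attention(lidar_tokens, pooled, pooled))
    return [sum((h[i] for h in heads), []) for i in range(len(lidar_tokens))]


if __name__ == "__main__":
    hsi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
    lidar = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
    fused = multiscale_attention(lidar, hsi)
    print(len(fused), len(fused[0]))  # 3 tokens, 2 features x 3 scales each
```

In this sketch the coarsest head sees a single pooled HSI token, so it contributes a global summary, while the finest head attends over every HSI token individually.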


