CD-CTFM: A Lightweight CNN-Transformer Network for Remote Sensing Cloud Detection Fusing Multiscale Features
Clouds in remote sensing images inevitably affect information extraction, which hinder the following analysis of satellite images. Hence, cloud detection is a necessary preprocessing procedure. However, the existing methods have numerous calculations and parameters. In this letter, a lightweight CNN-Transformer network, CD-CTFM, is proposed to solve the problem. CD-CTFM is based on encoder-decoder architecture and incorporates the attention mechanism. In the decoder part, we utilize a lightweight network combing CNN and Transformer as backbone, which is conducive to extract local and global features simultaneously. Moreover, a lightweight feature pyramid module is designed to fuse multiscale features with contextual information. In the decoder part, we integrate a lightweight channel-spatial attention module into each skip connection between encoder and decoder, extracting low-level features while suppressing irrelevant information without introducing many parameters. Finally, the proposed model is evaluated on two cloud datasets, 38-Cloud and MODIS. The results demonstrate that CD-CTFM achieves comparable accuracy as the state-of-art methods. At the same time, CD-CTFM outperforms state-of-art methods in terms of efficiency.
READ FULL TEXT