TOV: The Original Vision Model for Optical Remote Sensing Image Understanding via Self-supervised Learning

by Chao Tao, et al.

Are we on the right path for remote sensing image understanding (RSIU) when we train models in a supervised, data-dependent, and task-dependent way, rather than in the label-free, task-independent way that human vision learns? We argue that a more desirable RSIU model should be trained on the intrinsic structure of the data rather than on extrinsic human labels, so that it generalizes across a wide range of RSIU tasks. Following this hypothesis, we propose The Original Vision model (TOV) for the remote sensing field. Trained on massive unlabeled optical data along a human-like self-supervised learning (SSL) path that proceeds from general knowledge to specialized knowledge, the TOV model can be easily adapted to various RSIU tasks, including scene classification, object detection, and semantic segmentation, and it outperforms the dominant ImageNet supervised pretraining method as well as two recently proposed SSL pretraining methods on the majority of 12 publicly available benchmarks. Moreover, we analyze the influence of two key factors on the performance of the TOV model for RSIU: the data sampling method and the choice of learning path during self-supervised optimization. We believe that a general model trained in a label-free and task-independent way may be the next paradigm for RSIU, and we hope the insights distilled from this study help foster the development of an original vision model for RSIU.




