Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers

10/16/2022
by Tao Tang, et al.

Automatic data augmentation (AutoAugment) strategies are indispensable in supervised data-efficient training protocols of vision transformers and have led to state-of-the-art results in supervised learning. Despite this success, their development and application to self-supervised vision transformers have been hindered by several barriers, including the high search cost, the lack of supervision, and an unsuitable search space. In this work, we propose AutoView, a self-regularized adversarial AutoAugment method that learns views for self-supervised vision transformers by addressing the above barriers. First, we reduce the search cost of AutoView to nearly zero by learning views and network parameters simultaneously in a single forward-backward step, minimizing and maximizing the mutual information among different augmented views, respectively. Then, to avoid the information collapse caused by the lack of label supervision, we propose a self-regularized loss term that guarantees information propagation. Additionally, we present a curated augmentation policy search space for self-supervised learning, obtained by modifying the search space generally used for supervised learning. On ImageNet, AutoView achieves a remarkable improvement over the RandAug baseline (+10.2%) and consistently outperforms the state-of-the-art manually tuned view policy by a clear margin (up to +1.3%). AutoView pretraining also benefits downstream tasks (+1.2% on Semantic Segmentation and +2.8% on Image Retrieval) and improves model robustness (+2.3% on ImageNet-O). Code and models will be available at https://github.com/Trent-tangtao/AutoView.
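The abstract compresses the core training idea: view-policy parameters and network weights are updated in one forward-backward pass, with the policy trained adversarially (it tries to reduce the agreement between views that the encoder tries to increase) and a self-regularized term preventing information collapse. The sketch below is a minimal, hypothetical illustration of that min-max setup, not the authors' implementation: it assumes a toy MLP encoder, a differentiable noise/brightness "view policy", negative cosine similarity as the agreement (mutual-information) surrogate, a gradient-reversal trick so the policy ascends the loss the encoder descends within a single backward pass, and a simple magnitude penalty standing in for the paper's self-regularized loss.

```python
# Minimal sketch (not the AutoView code): single-step adversarial view learning.
# Assumptions: toy MLP encoder, differentiable brightness/noise "view policy",
# negative cosine similarity as the MI surrogate, magnitude penalty as a
# stand-in for the paper's self-regularized loss term.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward
    pass, so the view policy ascends the SSL loss that the encoder descends,
    all within one forward-backward step."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -grad_out

class ViewPolicy(nn.Module):
    """Differentiable augmentation: learnable brightness shift and noise scale."""
    def __init__(self, dim):
        super().__init__()
        self.brightness = nn.Parameter(torch.zeros(dim))
        self.log_noise = nn.Parameter(torch.full((dim,), -3.0))
    def forward(self, x):
        b = GradReverse.apply(self.brightness)        # adversarial (reversed) grads
        s = GradReverse.apply(self.log_noise).exp()
        return x + b + s * torch.randn_like(x)
    def magnitude_penalty(self):
        # Self-regularization stand-in: keep views from destroying image content.
        return self.brightness.pow(2).mean() + self.log_noise.exp().mean()

dim, batch = 128, 32
encoder = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 64))
policy = ViewPolicy(dim)
opt = torch.optim.SGD(list(encoder.parameters()) + list(policy.parameters()), lr=0.05)

for step in range(100):
    x = torch.randn(batch, dim)                       # toy "images"
    z1 = F.normalize(encoder(policy(x)), dim=-1)      # two independently augmented views
    z2 = F.normalize(encoder(policy(x)), dim=-1)
    ssl_loss = -(z1 * z2).sum(-1).mean()              # encoder maximizes view agreement
    loss = ssl_loss + 0.1 * policy.magnitude_penalty()
    opt.zero_grad()
    loss.backward()                                   # single backward: encoder descends,
    opt.step()                                        # policy (via GradReverse) ascends
```

Gradient reversal is what keeps the min-max update to one backward pass, which is the kind of mechanism that lets the search cost stay near zero; the actual AutoView search space, loss, and regularizer are defined in the paper and differ from this toy stand-in.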

research
04/08/2021

SiT: Self-supervised vIsion Transformer

Self-supervised learning methods are gaining increasing traction in comp...
research
03/23/2021

BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search

A myriad of recent breakthroughs in hand-crafted neural architectures fo...
research
09/13/2023

Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?

Convolutional networks and vision transformers have different forms of p...
research
05/30/2022

GMML is All you Need

Vision transformers have generated significant interest in the computer ...
research
11/17/2022

EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones

The superior performance of modern deep networks usually comes at the pr...
research
06/09/2022

Spatial Entropy Regularization for Vision Transformers

Recent work has shown that the attention maps of Vision Transformers (VT...
research
06/03/2019

Learning Representations by Maximizing Mutual Information Across Views

We propose an approach to self-supervised representation learning based ...
