Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism

by   Zhijian Zhuo, et al.

Recently, a variety of methods under the name of non-contrastive learning (like BYOL, SimSiam, SwAV, DINO) show that when equipped with some asymmetric architectural designs, aligning positive pairs alone is sufficient to attain good performance in self-supervised visual learning. Despite some understandings of some specific modules (like the predictor in BYOL), there is yet no unified theoretical understanding of how these seemingly different asymmetric designs can all avoid feature collapse, particularly considering methods that also work without the predictor (like DINO). In this work, we propose a unified theoretical understanding for existing variants of non-contrastive learning. Our theory named Rank Differential Mechanism (RDM) shows that all these asymmetric designs create a consistent rank difference in their dual-branch output features. This rank difference will provably lead to an improvement of effective dimensionality and alleviate either complete or dimensional feature collapse. Different from previous theories, our RDM theory is applicable to different asymmetric designs (with and without the predictor), and thus can serve as a unified understanding of existing non-contrastive learning methods. Besides, our RDM theory also provides practical guidelines for designing many new non-contrastive variants. We show that these variants indeed achieve comparable performance to existing methods on benchmark datasets, and some of them even outperform the baselines. Our code is available at <>.


page 1

page 2

page 3

page 4


Understanding self-supervised Learning Dynamics without Contrastive Pairs

Contrastive approaches to self-supervised learning (SSL) learn represent...

On the Generalization of Multi-modal Contrastive Learning

Multi-modal contrastive learning (MMCL) has recently garnered considerab...

Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap

Recently, contrastive learning has risen to be a promising approach for ...

Asymmetric Patch Sampling for Contrastive Learning

Asymmetric appearance between positive pair effectively reduces the risk...

A Message Passing Perspective on Learning Dynamics of Contrastive Learning

In recent years, contrastive learning achieves impressive results on sel...

How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders

Masked Autoencoders (MAE) based on a reconstruction task have risen to b...

An Investigation into Whitening Loss for Self-supervised Learning

A desirable objective in self-supervised learning (SSL) is to avoid feat...

Please sign up or login with your details

Forgot password? Click here to reset