Feature Dropout: Revisiting the Role of Augmentations in Contrastive Learning

12/16/2022
by Alex Tamkin et al.

What role do augmentations play in contrastive learning? Recent work suggests that good augmentations are label-preserving with respect to a specific downstream task. We complicate this picture by showing that label-destroying augmentations can be useful in the foundation model setting, where the goal is to learn diverse, general-purpose representations for multiple downstream tasks. We perform contrastive learning experiments on a range of image and audio datasets with multiple downstream tasks (e.g. for digits superimposed on photographs, predicting the class of one vs. the other). We find that Viewmaker Networks, a recently proposed model for learning augmentations for contrastive learning, produce label-destroying augmentations that stochastically destroy features needed for different downstream tasks. These augmentations are interpretable (e.g. altering shapes, digits, or letters added to images) and surprisingly often result in better performance compared to expert-designed augmentations, despite not preserving label information. To support our empirical results, we theoretically analyze a simple contrastive learning setting with a linear model. In this setting, label-destroying augmentations are crucial for preventing one set of features from suppressing the learning of features useful for another downstream task. Our results highlight the need for analyzing the interaction between multiple downstream tasks when trying to explain the success of foundation models.
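
To make the idea of a label-destroying augmentation concrete, below is a minimal PyTorch sketch, not the paper's code: it sets up a toy input whose first and second halves are two feature groups tied to different downstream labels, and an augmentation that stochastically zeroes one group per view, destroying the label for one task while preserving the other. All names (feature_dropout, nt_xent, the 16/16 feature split) are illustrative assumptions, and the loss is a simplified InfoNCE-style objective rather than the exact objective used in the paper.

```python
# Toy sketch: contrastive learning with a stochastic, label-destroying augmentation.
# Each 32-dim input is the concatenation of two 16-dim feature groups, each tied to
# a different hypothetical downstream label.
import torch
import torch.nn.functional as F

def feature_dropout(x, group_dim, p=0.5):
    """With probability p, zero out one randomly chosen feature group per example."""
    x = x.clone()
    mask = torch.rand(x.size(0)) < p              # which examples get a group dropped
    which = torch.randint(0, 2, (x.size(0),))     # which of the two groups to drop
    x[mask & (which == 0), :group_dim] = 0.0
    x[mask & (which == 1), group_dim:] = 0.0
    return x

def nt_xent(z1, z2, temperature=0.1):
    """Simplified InfoNCE loss: positives on the diagonal of the cross-view similarity matrix."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

x = torch.randn(256, 32)                          # toy data: two 16-dim feature groups
encoder = torch.nn.Linear(32, 64)                 # linear encoder, echoing the linear analysis setting
opt = torch.optim.SGD(encoder.parameters(), lr=0.1)

for step in range(100):
    v1 = feature_dropout(x, group_dim=16)         # two independently augmented views
    v2 = feature_dropout(x, group_dim=16)
    loss = nt_xent(encoder(v1), encoder(v2))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this toy setup, zeroing a feature group means the encoder cannot rely on that group alone to match the two views, which mirrors the abstract's claim that label-destroying augmentations keep one set of features from suppressing the learning of features useful for another downstream task.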

Related research

- Understanding Contrastive Learning Requires Incorporating Inductive Biases (02/28/2022): "Contrastive learning is a popular form of self-supervised learning that ..."
- Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere (05/20/2020): "Contrastive representation learning has been outstandingly successful in..."
- Stochastic Contrastive Learning (10/01/2021): "While state-of-the-art contrastive Self-Supervised Learning (SSL) models..."
- Can Single-Pass Contrastive Learning Work for Both Homophilic and Heterophilic Graph? (11/20/2022): "Existing graph contrastive learning (GCL) typically requires two forward..."
- Masked Reconstruction Contrastive Learning with Information Bottleneck Principle (11/15/2022): "Contrastive learning (CL) has shown great power in self-supervised learn..."
- One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks (10/12/2022): "Current multimodal models, aimed at solving Vision and Language (V+L) ta..."
- Understanding the Behaviour of Contrastive Loss (12/15/2020): "Unsupervised contrastive learning has achieved outstanding success, whil..."
