From Patches to Objects: Exploiting Spatial Reasoning for Better Visual Representations

05/21/2023
by   Toni Albert, et al.

As deep learning moves from academic research into practical application, self-supervised pretraining methods have become increasingly important. In the image domain in particular, these methods make it possible to exploit the abundance of unlabeled image data and thereby improve performance on downstream tasks. In this paper, we propose a novel auxiliary pretraining method based on spatial reasoning. The method takes advantage of a more flexible formulation of contrastive learning by introducing spatial reasoning as an auxiliary task for discriminative self-supervised methods. In spatial reasoning, the network predicts the relative distances between sampled non-overlapping patches. We argue that this forces the network to learn more detailed and intricate internal representations of objects and of the relationships between their constituent parts. Our experiments demonstrate a substantial improvement in downstream performance under linear evaluation compared to similar work, and they point to directions for further research into spatial reasoning.
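The abstract describes the auxiliary task only at a high level: sample non-overlapping patches and have the network predict their relative distances. The sketch below illustrates how such regression targets might be built. It is not the authors' implementation. The function names, the fixed square patch size, the rejection-sampling scheme, and the choice of Euclidean distance normalized by the image diagonal are all assumptions made for illustration.

```python
import math
import random


def sample_non_overlapping_patches(img_h, img_w, patch, n, rng, max_tries=1000):
    """Rejection-sample n square patches (top, left) that do not overlap.

    Hypothetical helper: the abstract does not say how patches are sampled.
    """
    boxes = []
    tries = 0
    while len(boxes) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place the requested number of patches")
        top = rng.randint(0, img_h - patch)
        left = rng.randint(0, img_w - patch)
        # Two equal-sized squares are disjoint if they are at least one
        # patch width apart along the vertical or the horizontal axis.
        if all(abs(top - t) >= patch or abs(left - l) >= patch for t, l in boxes):
            boxes.append((top, left))
    return boxes


def relative_distance_targets(boxes, img_h, img_w):
    """Pairwise distances between patch positions, scaled by the image diagonal.

    Assumption: Euclidean distance with diagonal normalization. For
    equal-sized patches, corner-to-corner distance equals center-to-center
    distance, so top-left corners are used directly.
    """
    diag = math.hypot(img_h, img_w)
    n = len(boxes)
    return [
        [
            math.hypot(boxes[i][0] - boxes[j][0], boxes[i][1] - boxes[j][1]) / diag
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    boxes = sample_non_overlapping_patches(224, 224, patch=32, n=4, rng=rng)
    targets = relative_distance_targets(boxes, 224, 224)
    for row in targets:
        print(["%.3f" % d for d in row])
```

In a setup like this, the patch embeddings produced by the encoder would be fed pairwise into a small prediction head, and the matrix returned by `relative_distance_targets` would serve as the regression target alongside the main contrastive loss. How the paper actually combines the two objectives is not stated in the abstract.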


