Joint-task Self-supervised Learning for Temporal Correspondence

09/26/2019
by Xueting Li, et al.

This paper proposes to learn reliable dense correspondence from videos in a self-supervised manner. Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames. We exploit the synergy between the two tasks through a shared inter-frame affinity matrix, which simultaneously models transitions between video frames at both the region and pixel levels. Region-level localization reduces ambiguity in fine-grained matching by narrowing down the search region, while fine-grained matching provides bottom-up features that facilitate region-level localization. Our method outperforms state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video-object and part-segmentation propagation, keypoint tracking, and object tracking. Our self-supervised method even surpasses the fully supervised affinity feature representation obtained from a ResNet-18 pre-trained on ImageNet.
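As a rough illustration of what an inter-frame affinity matrix can look like, the sketch below builds one from per-pixel features of two frames and uses it to carry labels from one frame to the other. It is not the authors' implementation: the function names `affinity` and `propagate`, the `temperature` parameter, the softmax normalization, and the flattened `(channels, pixels)` feature layout are all assumptions made for this example.

```python
import numpy as np


def affinity(feat_a, feat_b, temperature=1.0):
    """Column-normalized affinity between two frames (illustrative assumption).

    feat_a: (C, Na) features of frame A, one column per pixel.
    feat_b: (C, Nb) features of frame B, one column per pixel.
    Returns an (Na, Nb) matrix whose columns each sum to 1, so entry (i, j)
    weights how strongly pixel j of frame B corresponds to pixel i of frame A.
    """
    logits = feat_a.T @ feat_b / temperature
    # Subtract the per-column maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=0, keepdims=True)


def propagate(values_a, aff):
    """Carry per-pixel values (labels, colors) from frame A to frame B.

    values_a: (K, Na) values defined on frame A.
    aff:      (Na, Nb) affinity from `affinity`.
    Returns (K, Nb): each frame-B pixel is a weighted mix of frame-A values.
    """
    return values_a @ aff


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_a = rng.normal(size=(16, 6))   # 16 channels, 6 pixels in frame A
    feat_b = rng.normal(size=(16, 5))   # 16 channels, 5 pixels in frame B
    labels_a = np.eye(3)[:, [0, 0, 1, 1, 2, 2]]  # one-hot labels, 3 classes

    aff = affinity(feat_a, feat_b, temperature=0.5)
    labels_b = propagate(labels_a, aff)
    print(aff.shape, labels_b.shape)
```

The same matrix can be applied to any per-pixel quantity defined on the first frame, which is what makes a single affinity usable at both the region and pixel levels in this simplified picture.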


research
12/09/2020

Contrastive Transformation for Self-supervised Correspondence Learning

In this paper, we focus on the self-supervised learning of visual corres...
research
08/06/2023

Learning Fine-Grained Features for Pixel-wise Video Correspondences

Video analysis tasks rely heavily on identifying the pixels from differe...
research
03/29/2022

In-N-Out Generative Learning for Dense Unsupervised Video Segmentation

In this paper, we focus on the unsupervised Video Object Segmentation (V...
research
03/27/2022

Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning

Our target is to learn visual correspondence from unlabeled videos. We d...
research
11/21/2014

Hypercolumns for Object Segmentation and Fine-grained Localization

Recognition algorithms based on convolutional networks (CNNs) typically ...
research
06/17/2021

Efficient Self-supervised Vision Transformers for Representation Learning

This paper investigates two techniques for developing efficient self-sup...
research
06/22/2020

Self-supervised Video Object Segmentation

The objective of this paper is self-supervised representation learning, ...
