Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction

09/07/2023
by Beentherize et al.

The choice of objective function is crucial for learning high-quality representations with self-supervised learning. This paper investigates how different formulations of the Barlow Twins (BT) objective affect downstream task performance on speech data. We propose Modified Barlow Twins (MBT), which normalizes the latents to enforce scale invariance, and evaluate it on speaker identification, gender recognition, and keyword spotting tasks. Our results show that MBT generalizes better than the original BT objective, especially when fine-tuning with limited target data. This highlights the importance of designing objectives that encourage invariant and transferable representations. Our analysis provides insights into how the BT objective can be tailored to produce speech representations that adapt well to new downstream tasks, an important step towards reusable self-supervised speech representations.
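To make the objective concrete, here is a minimal NumPy sketch of the Barlow Twins loss on two batches of embeddings, with an optional per-sample L2 normalization of the latents as a hypothetical stand-in for the scale-invariant latents that MBT proposes; the function name, the `lam` weight, and the `normalize_latents` flag are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3, normalize_latents=False):
    """Barlow Twins objective on two (N, D) batches of embeddings.

    With normalize_latents=True each embedding vector is L2-normalized
    first (an assumed approximation of MBT's scale-invariant latents).
    """
    if normalize_latents:
        z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
        z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    n = z_a.shape[0]
    # Standardize each dimension across the batch, then form the
    # (D, D) cross-correlation matrix C between the two views.
    z_a = (z_a - z_a.mean(axis=0)) / z_a.std(axis=0)
    z_b = (z_b - z_b.mean(axis=0)) / z_b.std(axis=0)
    c = z_a.T @ z_b / n
    # Invariance term: pull diagonal entries toward 1.
    on_diag = np.sum((1.0 - np.diag(c)) ** 2)
    # Redundancy-reduction term: push off-diagonal entries toward 0.
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)
    return on_diag + lam * off_diag
```

Note that when both views are identical the diagonal of `C` is exactly 1 after standardization, so only the redundancy-reduction term remains; the per-sample normalization makes the loss invariant to rescaling the embedding vectors.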


Related research:

- Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering (05/18/2023)
- MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations (05/29/2023)
- Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads (08/28/2023)
- A Self-Supervised Framework for Function Learning and Extrapolation (06/14/2021)
- The SSL Interplay: Augmentations, Inductive Bias, and Generalization (02/06/2023)
- Speech Self-Supervised Representation Benchmarking: Are We Doing it Right? (06/01/2023)
- Exploration on HuBERT with Multiple Resolutions (06/01/2023)
