Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning

07/12/2022
by Théo Lepage, et al.

State-of-the-art speaker verification systems depend on human supervision because they are trained on massive amounts of labeled data. However, manually annotating utterances is slow and expensive, and it does not scale to the amount of data available today. In this study, we explore self-supervised learning for speaker verification by learning representations directly from raw audio. The objective is to produce robust speaker embeddings with small intra-speaker variance and large inter-speaker variance. Our approach is based on recent information maximization learning frameworks and an intensive data augmentation pre-processing step. We first evaluate how well these methods work without contrastive samples, then show that they perform better when combined with a contrastive loss. Furthermore, our experiments show that the method reaches results competitive with existing techniques and can outperform a supervised baseline when fine-tuned with a small portion of labeled data.
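To make the idea of a contrastive loss concrete, here is a minimal sketch of an InfoNCE-style objective in pure Python. It is an illustration under stated assumptions, not the loss used in the paper: the abstract does not specify the exact formulation, and the function names (`l2_normalize`, `info_nce_loss`), the `temperature` value, and the toy embeddings are all hypothetical. Each row of `view_a` and `view_b` stands for the embedding of a differently augmented segment of the same utterance. Matching rows are treated as positives and all other rows as negatives.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative sketch, not the paper's loss).

    view_a, view_b: lists of N embeddings; row i of each list is assumed to
    come from the same utterance (positive pair), other rows act as negatives.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor with every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the positive (index i) as the target class.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings: two "utterances", two views each (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce_loss(view_a, view_b)         # positives on the diagonal
swapped = info_nce_loss(view_a, view_b[::-1])   # positives deliberately mismatched
print(aligned < swapped)
```

The loss is low when the two views of the same utterance are close and views of different utterances are far apart, which corresponds to the small intra-speaker and large inter-speaker variance described above, under the assumption that each utterance comes from a single speaker.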


