Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?

04/08/2022
by Qiongqiong Wang, et al.
The emergence of large-margin softmax cross-entropy losses in training deep speaker embedding neural networks has triggered a gradual shift from parametric back-ends to a simpler cosine similarity measure for speaker verification. Popular parametric back-ends include probabilistic linear discriminant analysis (PLDA) and its variants. This paper investigates the properties of margin-based cross-entropy losses that lead to this shift and aims to find the scoring back-ends best suited for speaker verification. In addition, we revisit the pre-processing techniques that have been widely used in the past and assess their effectiveness on large-margin embeddings. Experiments on state-of-the-art ECAPA-TDNN networks trained with various large-margin softmax cross-entropy losses show a substantial increase in intra-speaker compactness, making the conventional PLDA superfluous. In this regard, we found that constraining the within-speaker covariance matrix could improve the performance of the PLDA. This is demonstrated through a series of experiments on the VoxCeleb-1 and SITW core-core test sets, with a 40.8% [...] reduction and a 35.1% [...]. [...] outperforms cosine scoring consistently, with reductions in EER and minDCF by 10.9% [...]
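
The cosine back-end that the abstract contrasts with PLDA can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_score` and `accept`, the toy four-dimensional embeddings, and the threshold value are all hypothetical, chosen only to show that cosine scoring length-normalizes both embeddings and takes their inner product, with no trainable parameters.

```python
import numpy as np


def cosine_score(enroll: np.ndarray, test: np.ndarray) -> float:
    """Cosine similarity between an enrollment and a test embedding."""
    # Length-normalize both vectors so only their direction matters.
    enroll = enroll / np.linalg.norm(enroll)
    test = test / np.linalg.norm(test)
    return float(np.dot(enroll, test))


def accept(enroll: np.ndarray, test: np.ndarray, threshold: float = 0.5) -> bool:
    """Accept the trial as same-speaker if the score reaches the threshold.

    The default threshold is an arbitrary illustrative value; in practice
    it would be tuned on held-out trials.
    """
    return cosine_score(enroll, test) >= threshold


if __name__ == "__main__":
    # Toy embeddings (hypothetical values, not real speaker embeddings).
    enroll = np.array([0.9, 0.1, 0.0, 0.2])
    same = np.array([1.8, 0.3, 0.1, 0.4])      # roughly the same direction
    other = np.array([-0.1, 0.9, 0.5, -0.3])   # a different direction

    print("same-speaker score:", cosine_score(enroll, same))
    print("different-speaker score:", cosine_score(enroll, other))
    print("accept same:", accept(enroll, same))
    print("accept other:", accept(enroll, other))
```

Because the score depends only on the angle between the two vectors, scaling either embedding leaves it unchanged. A parametric back-end such as PLDA would instead model within-speaker and between-speaker variability explicitly.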
