Graph Neural Network Backend for Speaker Recognition

08/17/2023
by Liang He, et al.

Most current speaker recognition backends, such as cosine scoring, linear discriminant analysis (LDA), or probabilistic linear discriminant analysis (PLDA), make decisions by computing the similarity or distance between enrollment and test embeddings that have already been extracted by neural networks. However, the local structure formed by each embedding and its neighboring embeddings in the low-dimensional space differs from one embedding to another; this structure may help recognition but is often ignored. To take advantage of it, we propose a graph neural network (GNN) backend that mines latent relationships among embeddings for classification. We treat all embeddings as nodes of a graph and compute the edges between them with a similarity function such as cosine, LDA+cosine, or LDA+PLDA. We study different graph settings and explore GNN variants to find a better way of passing and aggregating messages for the recognition task. Experimental results on the NIST SRE14 i-vector challenge, VoxCeleb1-O, VoxCeleb1-E, and VoxCeleb1-H datasets demonstrate that the proposed GNN backends significantly outperform current mainstream methods.
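
The graph construction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `cosine_knn_graph` and `mean_aggregate`, the k-nearest-neighbour edge rule, and the parameter-free mean aggregation are all assumptions chosen for illustration, and the toy 2-D embeddings stand in for real speaker embeddings. The sketch only shows the general idea of treating embeddings as nodes, deriving edges from cosine similarity, and passing messages between neighbours.

```python
import numpy as np


def cosine_knn_graph(embeddings, k=2):
    """Hypothetical helper: symmetric k-NN adjacency from cosine similarity."""
    # L2-normalise rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    # Exclude self-similarity so a node is never its own neighbour.
    np.fill_diagonal(sim, -np.inf)

    n = sim.shape[0]
    # Indices of the k most similar nodes for every row.
    neighbours = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1.0
    # Make the graph undirected (assumption: edges are symmetric).
    return np.maximum(adj, adj.T)


def mean_aggregate(embeddings, adj):
    """Hypothetical helper: one message-passing step without learned weights.

    Each node is replaced by the mean of itself and its neighbours.
    """
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)
    return (a_hat @ embeddings) / degree


# Toy data: two loose clusters standing in for two speakers.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
adj = cosine_knn_graph(emb, k=1)
smoothed = mean_aggregate(emb, adj)
print(adj)
print(smoothed)
```

A trained GNN backend would replace the fixed averaging with learned transformations and end in a classification layer; the cosine score could likewise be swapped for LDA+cosine or LDA+PLDA scores when computing edges, as the abstract mentions.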


research
05/03/2018

Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition

Linear Discriminant Analysis (LDA) has been used as a standard post-proc...
research
04/12/2021

Edgeless-GNN: Unsupervised Inductive Edgeless Network Embedding

We study the problem of embedding edgeless nodes such as users who newly...
research
09/15/2021

Learning Robot Structure and Motion Embeddings using Graph Neural Networks

We propose a learning framework to find the representation of a robot's ...
research
03/19/2023

The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework

The pre-trained wav2vec2.0 model has proved its effectiveness for speak...
research
04/25/2022

Back-ends Selection for Deep Speaker Embeddings

Probabilistic Linear Discriminant Analysis (PLDA) was the dominant and n...
research
08/21/2022

Representation Learning with Graph Neural Networks for Speech Emotion Recognition

Learning expressive representation is crucial in deep learning. In speec...
research
10/30/2020

Deep generative LDA

Linear discriminant analysis (LDA) is a popular tool for classification ...
