Data Generation Using Pass-phrase-dependent Deep Auto-encoders for Text-Dependent Speaker Verification

02/03/2021
by Achintya Kumar Sarkar, et al.

In this paper, we propose a novel method that trains pass-phrase-specific deep neural network (PP-DNN) auto-encoders to create augmented data for text-dependent speaker verification (TD-SV). Each PP-DNN auto-encoder is trained on the utterances of one particular pass-phrase available in the target enrollment set, using two methods: (i) transfer learning and (ii) training from scratch. Next, the feature vectors of a given utterance are fed to the PP-DNNs, and the frame-level output of each PP-DNN is treated as one new set of generated data. The generated data from each PP-DNN is then used to build a TD-SV system, in contrast to the conventional method, which considers only the evaluation data available. The proposed approach can be viewed as transforming the data into a pass-phrase-specific space through the non-linear transformation learned by each PP-DNN. The method thus develops several TD-SV systems for the evaluation, equal in number to the PP-DNNs, with one PP-DNN trained separately for each pass-phrase. Finally, the scores of the different TD-SV systems are fused for decision making. Experiments are conducted on the RedDots challenge 2016 database for TD-SV using short utterances. Results show that the proposed method improves performance for both conventional cepstral features and deep bottleneck features, under both the Gaussian mixture model-universal background model (GMM-UBM) and i-vector frameworks.
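The data-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hidden-layer tanh auto-encoder, the layer size, the learning rate, the synthetic features, and score averaging as the fusion rule are all assumptions made for illustration, and the function names (`train_autoencoder`, `transform`, `fuse_scores`) are hypothetical. Only training from scratch is shown; transfer learning and the GMM-UBM or i-vector scoring back-ends are omitted.

```python
import numpy as np

def train_autoencoder(frames, hidden=3, epochs=200, lr=0.05, seed=0):
    """Fit a small tanh auto-encoder on (n_frames, dim) features.

    Assumption: a toy stand-in for one PP-DNN, trained from scratch
    with full-batch gradient descent on squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, dim = frames.shape
    w1 = rng.normal(0.0, 0.1, (dim, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, dim))
    b2 = np.zeros(dim)
    for _ in range(epochs):
        h = np.tanh(frames @ w1 + b1)        # encoder
        out = h @ w2 + b2                    # decoder
        err = (out - frames) / n             # d(loss)/d(out), loss = sum of squares / (2n)
        dh = (err @ w2.T) * (1.0 - h ** 2)   # back-propagate through tanh
        w2 -= lr * (h.T @ err)
        b2 -= lr * err.sum(axis=0)
        w1 -= lr * (frames.T @ dh)
        b1 -= lr * dh.sum(axis=0)
    return w1, b1, w2, b2

def transform(model, frames):
    """Map frames into the pass-phrase-specific space of one model."""
    w1, b1, w2, b2 = model
    return np.tanh(frames @ w1 + b1) @ w2 + b2

def fuse_scores(scores):
    """Fuse per-system scores; simple averaging is an assumption."""
    return float(np.mean(scores))

# Synthetic stand-in for enrollment features, keyed by pass-phrase.
rng = np.random.default_rng(1)
enrollment = {f"phrase_{i}": rng.normal(size=(40, 5)) for i in range(3)}

# One auto-encoder per pass-phrase.
models = {name: train_autoencoder(x) for name, x in enrollment.items()}

# Each model yields one generated copy of a given utterance, frame by frame.
utterance = rng.normal(size=(25, 5))
generated = {name: transform(m, utterance) for name, m in models.items()}
```

Each entry of `generated` would feed its own TD-SV system, and the per-system scores for a trial would then be combined, for example with `fuse_scores`.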


