Three-Stage Speaker Verification Architecture in Emotional Talking Environments

09/03/2018
by Ismail Shahin, et al.

Speaker verification performance is usually high in neutral talking environments but degrades sharply in emotional talking environments. This degradation is caused by the mismatch between training on neutral speech and testing on emotional speech. This work proposes a three-stage speaker verification architecture to improve speaker verification performance in emotional environments. The architecture consists of three cascaded stages: a gender identification stage, followed by an emotion identification stage, followed by a speaker verification stage. The proposed framework has been evaluated on two distinct and independent emotional speech datasets: an in-house dataset and the Emotional Prosody Speech and Transcripts dataset. Our results show that speaker verification based on both gender and emotion information outperforms speaker verification based on gender information only, on emotion information only, and on neither gender nor emotion information. The average speaker verification performance attained with the proposed framework is very close to that attained in subjective assessment by human listeners.
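
The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the classifiers, the features, or the decision rule, so every name here (`ThreeStageVerifier`, `toy_scorer`, the dictionary layout) is hypothetical. The sketch assumes each model is a callable that returns a log-likelihood-style score, and it assumes the final decision compares the claimed speaker's score against a background score with a threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
# Assumption: every model is a callable returning a log-likelihood-style score.
Scorer = Callable[[Features], float]


def argmax_label(models: Dict[str, Scorer], x: Features) -> str:
    """Return the label whose model gives the utterance the highest score."""
    return max(models, key=lambda label: models[label](x))


def toy_scorer(mean: float) -> Scorer:
    """Hypothetical stand-in for a trained model: negative squared distance."""
    return lambda x: -sum((v - mean) ** 2 for v in x)


@dataclass
class ThreeStageVerifier:
    gender_models: Dict[str, Scorer]                   # stage 1
    emotion_models: Dict[str, Dict[str, Scorer]]       # stage 2, per gender
    speaker_models: Dict[Tuple[str, str], Scorer]      # stage 3: (speaker, emotion)
    background_models: Dict[Tuple[str, str], Scorer]   # assumed: (gender, emotion)
    threshold: float = 0.0                             # assumed decision threshold

    def verify(self, x: Features, claimed_speaker: str):
        # Stage 1: identify the gender of the unknown utterance.
        gender = argmax_label(self.gender_models, x)
        # Stage 2: identify the emotion using that gender's emotion models.
        emotion = argmax_label(self.emotion_models[gender], x)
        # Stage 3: score the claim with models matched to the detected emotion.
        claimed = self.speaker_models[(claimed_speaker, emotion)](x)
        background = self.background_models[(gender, emotion)](x)
        score = claimed - background  # log-likelihood-ratio style score
        return score >= self.threshold, gender, emotion, score
```

The point of the design is that stage 3 uses models selected by the outputs of stages 1 and 2, so the claimed speaker is scored against emotion-matched models rather than neutral ones. An error in an earlier stage propagates to the later ones, which is inherent to any cascade of this kind.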

research
03/31/2018

Speaker Verification in Emotional Talking Environments based on Three-Stage Framework

This work is dedicated to introducing, executing, and assessing a three-...
research
06/29/2017

Employing both Gender and Emotion Cues to Enhance Speaker Identification Performance in Emotional Talking Environments

Speaker recognition performance in emotional talking environments is not...
research
10/23/2022

Speaker Identification from emotional and noisy speech data using learned voice segregation and Speech VGG

Speech signals are subjected to more acoustic interference and emotional...
research
07/01/2017

Employing Emotion Cues to Verify Speakers in Emotional Talking Environments

Usually, people talk neutrally in environments where there are no abnorm...
research
06/14/2022

Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction

In this work, we explore a novel few-shot personalisation architecture f...
research
06/29/2017

Speaker Identification Investigation and Analysis in Unbiased and Biased Emotional Talking Environments

This work aims at investigating and analyzing speaker identification in ...
research
02/11/2021

CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions

This work aims at intensifying text-independent speaker identification p...
