End-to-end Recurrent Denoising Autoencoder Embeddings for Speaker Identification

'In-the-wild' speech is a challenge for speaker recognition systems because of the variability induced by real-life conditions, such as environmental noise and the speaker's emotional state. Taking advantage of representation learning, in this paper we aim to design a recurrent denoising autoencoder architecture that extracts robust low-dimensional representations (speaker embeddings) from noisy spectrograms to perform speaker identification. The proposed end-to-end architecture uses a feedback loop to encode speaker information into a spectrogram denoising autoencoder. We use data augmentation to corrupt clean speech with additive real-life environmental noise, and we also use a database of real stressed speech. The proposed architecture benefits from the time sequences and frequency patterns in the spectrograms that inherently represent the speaker, and it outperforms the other architectures it was compared against, which use state-of-the-art speaker embeddings.
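The abstract mentions corrupting clean speech with additive noise but gives no SNR levels, noise sources, or pipeline details. As a minimal sketch of the general technique only, the following mixes a noise signal into a clean signal at a chosen signal-to-noise ratio. The function names `power` and `mix_at_snr`, the 5 dB target, and the synthetic sine and Gaussian signals are all assumptions made for illustration; they stand in for real recordings and are not taken from the paper.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech, scaled so the mixture has the given SNR in dB.

    Hypothetical helper for illustration; not the authors' augmentation code.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    p_clean = power(clean)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise signal is silent")
    # Solve SNR_dB = 10 * log10(p_clean / (gain**2 * p_noise)) for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


# Synthetic stand-ins: a 220 Hz tone as "speech", Gaussian samples as "noise".
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

In a denoising setup of the kind described above, `noisy` would be the corrupted input and `clean` the reconstruction target, typically after both are converted to spectrograms.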

research
11/07/2018

On the use of DNN Autoencoder for Robust Speaker Recognition

In this paper, we present an analysis of a DNN-based autoencoder for spe...
research
10/03/2021

PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction

Speech enhancement aims to improve the perceptual quality of the speech ...
research
07/20/2023

PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

Background noise reduces speech intelligibility and quality, making spea...
research
02/20/2020

Wavesplit: End-to-End Speech Separation by Speaker Clustering

We introduce Wavesplit, an end-to-end speech separation system. From a s...
research
11/19/2018

Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition

In this work, we present an analysis of a DNN-based autoencoder for spee...
research
11/15/2015

Learning Representations of Affect from Speech

There has been a lot of prior work on representation learning for speech...
research
06/23/2022

Speaker-Independent Microphone Identification in Noisy Conditions

This work proposes a method for source device identification from speech...
