A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning

02/11/2022
by   Tassadaq Hussain, et al.

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals. Although they improve speech quality, such approaches do not deliver the required levels of speech intelligibility in everyday noisy environments. Intelligibility-oriented (I-O) loss functions have recently been developed to train DL approaches for robust speech enhancement. Here, we formulate, for the first time, a canonical correlation based I-O loss function to train DL algorithms more effectively. Specifically, we present a canonical-correlation based short-time objective intelligibility (CC-STOI) cost function to train a fully convolutional neural network (FCN) model. We carry out comparative simulation experiments to show that our CC-STOI based speech enhancement framework outperforms state-of-the-art DL models trained with conventional distance-based and STOI-based loss functions, on both objective and subjective evaluation measures, for unseen speakers and unseen noises. Ongoing work is evaluating the proposed approach for the design of robust hearing-assistive technology.
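To make the contrast between a distance-based loss and a correlation-based loss concrete, the following is a minimal NumPy sketch of plain canonical correlation analysis between two feature matrices. It is not the authors' CC-STOI cost function, whose details the abstract does not give, and it omits the short-time segmentation and band decomposition used in STOI. The names `canonical_correlations`, `cc_loss`, and `mse`, and the ridge term `eps`, are assumptions introduced here for illustration.

```python
import numpy as np


def canonical_correlations(X, Y, eps=1e-8):
    """Canonical correlations between X (n, p) and Y (n, q).

    Rows are frames, columns are feature dimensions. `eps` is a small
    ridge term (an assumption of this sketch) that keeps the covariance
    matrices invertible.
    """
    # Centre each feature dimension.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Sxx = X.T @ X / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root via symmetric eigendecomposition.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # The singular values of the whitened cross-covariance are the
    # canonical correlations.
    T = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.clip(np.linalg.svd(T, compute_uv=False), 0.0, 1.0)


def cc_loss(clean, enhanced):
    """Hypothetical loss: negative mean canonical correlation."""
    return -float(np.mean(canonical_correlations(clean, enhanced)))


def mse(clean, enhanced):
    """Conventional distance-based loss, for comparison."""
    return float(np.mean((clean - enhanced) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal((500, 4))
    scaled = 0.5 * clean  # same structure, different level
    print("MSE:", mse(clean, scaled))
    print("CC loss:", cc_loss(clean, scaled))
```

In this toy comparison the attenuated copy is penalised by the mean squared error but receives a canonical correlation loss close to -1, because canonical correlation is invariant to invertible linear transforms of either input. Using such a quantity to train a network would additionally need a differentiable implementation in a DL framework, which this sketch does not attempt.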

research
02/08/2022

A Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning for Hearing-Assistive Technologies

Current deep learning (DL) based approaches to speech intelligibility en...
research
11/18/2021

Towards Intelligibility-Oriented Audio-Visual Speech Enhancement

Existing deep learning (DL) based speech enhancement approaches are gene...
research
01/11/2023

Perceive and predict: self-supervised speech representation based loss functions for speech enhancement

Recent work in the domain of speech enhancement has explored the use of ...
research
01/28/2020

Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

This paper investigates several aspects of training an RNN (recurrent neu...
research
08/20/2017

Perceptual audio loss function for deep learning

PESQ and POLQA are standards for automated assessment of...
research
09/12/2017

End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks

Speech enhancement model is used to map a noisy speech to a clean speech...
