Investigating Cross-Domain Losses for Speech Enhancement

10/20/2020
by Sherif Abdulatif, et al.

Recent years have seen a surge in the number of available frameworks for speech enhancement (SE) and recognition. Whether model-based or constructed via deep learning, these frameworks often rely in isolation on either time-domain signals or time-frequency (TF) representations of speech data. In this study, we investigate the advantages of each set of approaches by separately examining their impact on speech intelligibility and quality. Furthermore, we combine the fragmented benefits of time-domain and TF speech representations by introducing two new cross-domain SE frameworks. A quantitative comparative analysis against recent model-based and deep learning SE approaches is performed to illustrate the merit of the proposed frameworks.
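To make the cross-domain idea concrete, the sketch below combines a time-domain term with a TF-domain term in a single loss. It is an illustration only, not the authors' formulation: the abstract does not specify the loss terms, so the L1 distances, the Hann-windowed STFT, the weight `alpha`, and the function names `stft_magnitude` and `cross_domain_loss` are all assumptions.

```python
import numpy as np

def stft_magnitude(x, frame_len=512, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform.
    frame_len and hop are assumed values, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def cross_domain_loss(enhanced, clean, alpha=0.5):
    """Weighted sum of a time-domain L1 term and a TF-magnitude L1 term.
    alpha is a hypothetical mixing weight between the two domains."""
    time_term = np.mean(np.abs(enhanced - clean))
    tf_term = np.mean(np.abs(stft_magnitude(enhanced) - stft_magnitude(clean)))
    return alpha * time_term + (1.0 - alpha) * tf_term

# Toy signals: a 440 Hz tone at 16 kHz and a noisy copy of it.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noisy = clean + 0.1 * rng.standard_normal(16000)

print(cross_domain_loss(noisy, clean))   # positive: the signals differ
print(cross_domain_loss(clean, clean))   # 0.0: identical signals
```

Setting `alpha` to 1 or 0 recovers a purely time-domain or purely TF-domain loss, which is one simple way to compare the two families of approaches in isolation before mixing them.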

research
08/26/2021

Cross-domain Single-channel Speech Enhancement Model with Bi-projection Fusion Module for Noise-robust ASR

In recent decades, many studies have suggested that phase information is...
research
11/03/2021

Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features

In this study, we propose a cross-domain multi-objective speech assessme...
research
03/28/2022

CMGAN: Conformer-based Metric GAN for Speech Enhancement

Recently, convolution-augmented transformer (Conformer) has achieved pro...
research
07/19/2022

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

This paper presents recent progress on integrating speech separation and...
research
09/26/2019

Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks

In recent years, waveform-mapping-based speech enhancement (SE) methods ...
research
10/28/2020

Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement

Speech enhancement (SE) aims to improve speech quality and intelligibili...
research
02/02/2023

Speech Enhancement for Virtual Meetings on Cellular Networks

We study speech enhancement using deep learning (DL) for virtual meeting...
