Using Optimal Ratio Mask as Training Target for Supervised Speech Separation

09/04/2017
by Shasha Xia, et al.

Supervised speech separation uses supervised learning algorithms to learn a mapping from an input noisy signal to an output target. With the rapid development of deep learning, supervised separation has become the most important direction in speech separation in recent years. For a supervised algorithm, the training target has a significant impact on performance. The ideal ratio mask is a commonly used training target that can improve the intelligibility and quality of the separated speech. However, it does not take into account the correlation between noise and clean speech. In this paper, we use the optimal ratio mask as the training target of a deep neural network (DNN) for speech separation. Experiments are carried out under various noise environments and signal-to-noise ratio (SNR) conditions. The results show that the optimal ratio mask generally outperforms the other training targets.
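To make the difference between the two masks concrete, here is a minimal sketch for a single time-frequency bin. The abstract does not give formulas, so the sketch assumes the commonly cited definitions: an ideal ratio mask built only from speech and noise energies (the exponent `beta` and its default of 0.5 are assumptions), and an optimal ratio mask that adds the speech-noise cross term Re(S·N*), which is the real-valued gain minimizing the squared error between masked mixture and clean speech. The function names and the example coefficients are hypothetical, and any compression or scaling the paper may apply to the mask before training is not shown.

```python
def ideal_ratio_mask(s, n, beta=0.5):
    """IRM for one T-F bin: uses only speech and noise energies.

    s, n: complex STFT coefficients of clean speech and noise.
    beta: compression exponent (assumed; 0.5 is a common choice).
    """
    ps, pn = abs(s) ** 2, abs(n) ** 2
    return (ps / (ps + pn)) ** beta


def optimal_ratio_mask(s, n):
    """ORM for one T-F bin: includes the cross term Re(S * conj(N)).

    Assumes the mixture s + n is non-zero, otherwise the
    denominator |s + n|^2 vanishes.
    """
    ps, pn = abs(s) ** 2, abs(n) ** 2
    cross = (s * n.conjugate()).real
    return (ps + cross) / (ps + pn + 2 * cross)


# Hypothetical coefficients, chosen only for illustration.
s = complex(1.0, 0.0)
n_uncorrelated = complex(0.0, 1.0)  # orthogonal to s: cross term is 0
n_correlated = complex(0.5, 0.0)    # in phase with s: cross term is 0.5

for n in (n_uncorrelated, n_correlated):
    y = s + n  # noisy mixture in this bin
    irm = ideal_ratio_mask(s, n, beta=1.0)
    orm = optimal_ratio_mask(s, n)
    # Compare how close each masked mixture comes to the clean speech.
    print(f"IRM={irm:.3f} err={abs(irm * y - s):.3f} | "
          f"ORM={orm:.3f} err={abs(orm * y - s):.3f}")
```

When the cross term is zero the two masks coincide (for `beta=1.0`); when speech and noise are correlated in a bin, only the mask with the cross term accounts for it, which is the limitation of the ideal ratio mask that the abstract points to.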

