DPT-FSNet: Dual-path Transformer Based Full-band and Sub-band Fusion Network for Speech Enhancement

04/27/2021
by Feng Dang, et al.

Dual-path networks have recently achieved promising performance because they can model both local and global features of the input sequence. However, previous studies rely on simple time-domain features and do not fully investigate how the input features of the dual-path network affect enhancement performance. In this paper, we propose a dual-path transformer-based full-band and sub-band fusion network (DPT-FSNet) for speech enhancement in the frequency domain. The intra and inter parts of the dual-path transformer network in our model can be seen as sub-band and full-band modeling, respectively; these features are more interpretable and carry more information than the features used by the time-domain transformer. We evaluated the proposed method on the Voice Bank + DEMAND dataset. Experimental results show that it outperforms the current state-of-the-art methods in terms of PESQ, STOI, CSIG, and COVL, with scores on the Voice Bank + DEMAND dataset of 3.30, 0.95, 4.51, and 3.94, respectively.
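The intra/inter alternation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`uniform_mix`, `intra_pass`, `inter_pass`, `dual_path_block`) are hypothetical, the transformer layers are replaced by a trivial stand-in that adds the sequence mean to every position, and the 2-D feature map `x[i][j]` is an assumption. The abstract does not say which spectrogram axis each pass runs along, so the sketch leaves that mapping open and only shows the pattern: one pass mixes information along the inner axis for each outer index, the other mixes along the outer axis for each inner index.

```python
def uniform_mix(seq):
    """Stand-in for a transformer layer (assumption, not the real model):
    every position receives the sequence mean, added residually."""
    mean = sum(seq) / len(seq)
    return [value + mean for value in seq]


def intra_pass(x, layer):
    """Apply `layer` independently to each inner sequence x[i][:]."""
    return [layer(row) for row in x]


def inter_pass(x, layer):
    """Apply `layer` independently to each outer sequence x[:][j]."""
    columns = [list(col) for col in zip(*x)]      # transpose
    mixed = [layer(col) for col in columns]       # mix along the outer axis
    return [list(row) for row in zip(*mixed)]     # transpose back


def dual_path_block(x, layer=uniform_mix):
    """One dual-path block: intra pass followed by inter pass."""
    return inter_pass(intra_pass(x, layer), layer)


# Toy 2 x 3 feature map; the values are illustrative only.
features = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0]]
out = dual_path_block(features)
print(out)
```

After one block, every output element depends on every input element, even though each pass only ever looks along a single axis. That is the property that lets a dual-path design combine local and global context at modest cost.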

