Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

12/04/2021
by Xiaolin Hu, et al.

Recent advances in the design of neural network architectures, in particular those specialized in modeling sequences, have provided significant improvements in speech separation performance. In this work, we propose to use a bio-inspired architecture called Fully Recurrent Convolutional Neural Network (FRCNN) to solve the separation task. This model contains bottom-up, top-down and lateral connections to fuse information processed at various time scales, each represented by a stage. In contrast to the traditional approach, which updates all stages in parallel, we propose to first update the stages one by one in the bottom-up direction, then fuse information from adjacent stages simultaneously, and finally fuse information from all stages into the bottom stage. Experiments showed that this asynchronous updating scheme achieved significantly better results with far fewer parameters than the traditional synchronous updating scheme. In addition, the proposed model achieved a good balance between speech separation accuracy and computational efficiency compared to other state-of-the-art models on three benchmark datasets.
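The three-phase update order described in the abstract can be illustrated with a toy sketch. This is not the authors' FRCNN: pairwise averaging and nearest-neighbour resampling stand in for the learned convolutions, and every name here (`downsample`, `resize`, `asynchronous_update`) is hypothetical. It assumes the input length is divisible by 2^(num_stages - 1) and shows only the ordering of the updates.

```python
from typing import List

Signal = List[float]


def downsample(x: Signal) -> Signal:
    """Halve the time resolution by averaging adjacent pairs.

    Assumption: a stand-in for a learned strided convolution.
    """
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def resize(x: Signal, length: int) -> Signal:
    """Nearest-neighbour resampling of x to the given length."""
    return [x[i * len(x) // length] for i in range(length)]


def asynchronous_update(x: Signal, num_stages: int) -> Signal:
    # Phase 1: build the stages one by one, bottom-up.
    # Each stage is computed from the stage just below it.
    stages = [list(x)]
    for _ in range(num_stages - 1):
        stages.append(downsample(stages[-1]))

    # Phase 2: fuse adjacent stages simultaneously.
    # Every fused stage reads from the same phase-1 snapshot.
    fused = []
    for s, current in enumerate(stages):
        total = list(current)
        count = 1
        for n in (s - 1, s + 1):
            if 0 <= n < num_stages:
                neighbour = resize(stages[n], len(current))
                total = [a + b for a, b in zip(total, neighbour)]
                count += 1
        fused.append([v / count for v in total])

    # Phase 3: bring every stage to the bottom resolution and fuse there.
    out = [0.0] * len(x)
    for stage in fused:
        out = [a + b for a, b in zip(out, resize(stage, len(x)))]
    return [v / num_stages for v in out]


mixture = [1.0] * 8  # toy constant "signal"
separated_features = asynchronous_update(mixture, num_stages=3)
```

In a synchronous scheme, by contrast, all stages would be updated in parallel from the previous iteration's state rather than in this staged order.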

research
03/25/2022

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Speaker-independent speech separation has achieved remarkable performanc...
research
07/28/2020

Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation

The dominant speech separation models are based on complex recurrent or ...
research
12/21/2022

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

Audio-visual approaches involving visual inputs have laid the foundation...
research
05/31/2023

Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model

We propose Audio-Visual Lightweight ITerative model (AVLIT), an effectiv...
research
06/09/2023

An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention

We present an efficient speech separation neural network, ARFDCN, which ...
research
09/30/2022

An efficient encoder-decoder architecture with top-down attention for speech separation

Deep neural networks have shown excellent prospects in speech separation...
