Selector-Enhancer: Learning Dynamic Selection of Local and Non-local Attention Operation for Speech Enhancement

by Xinmeng Xu, et al.

Attention mechanisms, such as local and non-local attention, play a fundamental role in recent deep learning based speech enhancement (SE) systems. However, natural speech contains many fast-changing and relatively brief acoustic events, so capturing the most informative speech features by applying local and non-local attention indiscriminately is challenging. We observe that the noise type and speech features vary within a speech sequence, and that local and non-local operations extract different features from corrupted speech. To leverage this, we propose Selector-Enhancer, a dual-attention based convolutional neural network (CNN) with a feature-filter that dynamically selects regions from low-resolution speech features and feeds them to local or non-local attention operations. In particular, the proposed feature-filter is trained using reinforcement learning (RL) with a difficulty-regulated reward that is related to network performance, model complexity, and "the difficulty of the SE task". The results show that our method achieves comparable or superior performance to existing approaches. In particular, Selector-Enhancer is potentially effective for real-world denoising, where the number and types of noise vary within a single noisy mixture.
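The region-wise routing idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the fixed region length, the window size, and the variance-threshold selector are all hypothetical stand-ins. In the paper the selection is made by a feature-filter trained with RL, which is not reproduced here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(region, window=3):
    # Each frame attends only to frames at most `window` steps away,
    # all taken from inside the same region.
    t, d = region.shape
    scores = region @ region.T / np.sqrt(d)
    idx = np.arange(t)
    too_far = np.abs(idx[:, None] - idx[None, :]) > window
    scores = np.where(too_far, -np.inf, scores)
    return softmax(scores) @ region

def non_local_attention(region, context):
    # Each frame of the region attends to every frame of the full sequence.
    scores = region @ context.T / np.sqrt(region.shape[-1])
    return softmax(scores) @ context

def route_regions(features, region_len, selector):
    # Split the (T, d) feature sequence into regions and send each one to
    # the branch chosen by `selector` (0 = local, 1 = non-local).
    # `selector` is a placeholder for the learned feature-filter.
    out = np.empty_like(features)
    choices = []
    for start in range(0, features.shape[0], region_len):
        region = features[start:start + region_len]
        choice = selector(region)
        if choice == 0:
            out[start:start + region_len] = local_attention(region)
        else:
            out[start:start + region_len] = non_local_attention(region, features)
        choices.append(choice)
    return out, choices

# Usage with a hand-written selector (assumption: high-variance regions
# go to the non-local branch; a trained policy would replace this rule).
rng = np.random.default_rng(0)
features = rng.standard_normal((16, 4))
enhanced, choices = route_regions(features, 4, lambda r: int(r.var() > 1.0))
```

The point of the sketch is only the control flow: the per-region decision is made before any attention is computed, so the costlier non-local branch runs only on the regions the selector assigns to it.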



PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement

Convolutional neural networks (CNN) and Transformer have wildly succeede...

Speech Enhancement with Multi-granularity Vector Quantization

With advances in deep learning, neural network based speech enhancement ...

Full Attention Bidirectional Deep Learning Structure for Single Channel Speech Enhancement

As the cornerstone of other important technologies, such as speech recog...

3D Axial-Attention for Lung Nodule Classification

Purpose: In recent years, Non-Local based methods have been successfully...

Dynamic Switching Networks: A Dynamic, Non-local, and Time-independent Approach to Emergence

The concept of emergence is a powerful concept to explain very complex b...

Path-Restore: Learning Network Path Selection for Image Restoration

Very deep Convolutional Neural Networks (CNNs) have greatly improved the...
