NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling

by Chi-Chang Lee, et al.

For deep learning-based speech enhancement (SE) systems, acoustic mismatch between training and test conditions can cause notable performance degradation. Numerous noise adaptation strategies have been proposed to address this mismatch. In this paper, we propose a novel method, called noise adaptive speech enhancement with target-conditional resampling (NASTAR), which reduces the mismatch using only one sample (one-shot) of noisy speech from the target environment. NASTAR uses a feedback mechanism to simulate adaptive training data via a noise extractor and a noise retrieval model. The noise extractor estimates the target noise, called pseudo-noise, from the noisy speech. The noise retrieval model uses the noisy speech to retrieve relevant noise samples, called the relevant-cohort, from a pool of noise signals. The pseudo-noise and the relevant-cohort set are jointly sampled and mixed with the source speech corpus to prepare simulated training data for noise adaptation. Experimental results show that NASTAR can effectively use one noisy speech sample to adapt an SE model to a target condition. Moreover, both the noise extractor and the noise retrieval model contribute to model adaptation. To the best of our knowledge, NASTAR is the first work to perform one-shot noise adaptation through noise extraction and retrieval.
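The data simulation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that retrieval is done by cosine similarity over precomputed noise embeddings, that "jointly sampled" means choosing the pseudo-noise with some probability and a cohort member otherwise, and that mixing uses a random signal-to-noise ratio. The function names, the `p_pseudo` parameter, and the SNR range are all hypothetical.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the speech-to-noise power ratio is snr_db (in dB)."""
    # Repeat or crop the noise so it matches the speech length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against silent noise
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def retrieve_cohort(query_emb, pool_embs, k):
    """Return indices of the k pool noises most similar to the query.

    Assumption: cosine similarity over embeddings stands in for the
    paper's noise retrieval model, whose details the abstract does not give.
    """
    q = query_emb / (np.linalg.norm(query_emb) + 1e-12)
    p = pool_embs / (np.linalg.norm(pool_embs, axis=1, keepdims=True) + 1e-12)
    sims = p @ q
    return np.argsort(-sims)[:k]


def simulate_pair(speech, pseudo_noise, cohort, rng,
                  p_pseudo=0.5, snr_range=(-5.0, 15.0)):
    """Build one (noisy, clean) training pair for adaptation.

    Assumption: the pseudo-noise is chosen with probability p_pseudo,
    otherwise a noise is drawn uniformly from the relevant cohort.
    """
    if rng.random() < p_pseudo:
        noise = pseudo_noise
    else:
        noise = cohort[rng.integers(len(cohort))]
    snr_db = rng.uniform(*snr_range)
    return mix_at_snr(speech, noise, snr_db), speech
```

In this sketch, `pseudo_noise` would come from a noise extractor applied to the single target utterance, and `cohort` would be the waveforms selected by `retrieve_cohort`. Both are left to the caller because the abstract does not specify those models.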




Noise Adaptive Speech Enhancement using Domain Adversarial Training

In this study, we propose a novel noise adaptive speech enhancement (SE)...

SERIL: Noise Adaptive Speech Enhancement using Regularization-based Incremental Learning

Numerous noise adaptation techniques have been proposed to address the m...

Inference and Denoise: Causal Inference-based Neural Speech Enhancement

This study addresses the speech enhancement (SE) task within the causal ...

OSSEM: one-shot speaker adaptive speech enhancement using meta learning

Although deep learning (DL) has achieved notable progress in speech enha...

A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference

The lack of clean speech is a practical challenge to the development of ...

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

We present RemixIT, a simple yet effective self-supervised method for tr...

Incorporating Symbolic Sequential Modeling for Speech Enhancement

In a noisy environment, a lossy speech signal can be automatically resto...
