Multi-resolution location-based training for multi-channel continuous speech separation

01/16/2023
by Hassan Taherian, et al.

The performance of automatic speech recognition (ASR) systems degrades severely when multi-talker speech overlap occurs. In meeting environments, speech separation is typically performed to improve the robustness of ASR systems. Recently, location-based training (LBT) was proposed as a new training criterion for multi-channel talker-independent speaker separation. Assuming a fixed array geometry, LBT outperforms widely used permutation-invariant training in fully overlapped utterances and matched reverberant conditions. This paper extends LBT to conversational multi-channel speaker separation. We introduce multi-resolution LBT to estimate the complex spectrograms from low to high time and frequency resolutions. With multi-resolution LBT, convolutional kernels are assigned consistently based on speaker locations in physical space. Evaluation results show that multi-resolution LBT consistently outperforms other competitive methods on the recorded LibriCSS corpus.
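
To make the contrast between the two training criteria concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain mean-squared-error loss on toy real-valued signals, and it assumes that "location" means ordering speakers by ascending azimuth; the abstract only says that assignments follow speaker locations in physical space. The function names `pit_loss` and `lbt_loss` and the `azimuths_deg` parameter are hypothetical, and the multi-resolution and complex-spectrogram aspects are not modeled here.

```python
import itertools


def mse(estimate, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def pit_loss(estimates, targets):
    """Permutation-invariant loss: try every output-to-speaker assignment
    and keep the one with the smallest average error."""
    n = len(targets)
    return min(
        sum(mse(estimates[i], targets[perm[i]]) for i in range(n)) / n
        for perm in itertools.permutations(range(n))
    )


def lbt_loss(estimates, targets, azimuths_deg):
    """Location-based loss (assumed azimuth ordering): output i is always
    compared with the speaker that has the i-th smallest azimuth, so no
    search over permutations is needed."""
    n = len(targets)
    order = sorted(range(n), key=lambda k: azimuths_deg[k])
    return sum(mse(estimates[i], targets[order[i]]) for i in range(n)) / n


# Toy data: two speakers, four samples each (values are illustrative only).
targets = [[1.0, 0.0, -1.0, 0.5], [0.2, 0.4, 0.6, 0.8]]
azimuths_deg = [120.0, 30.0]  # speaker 1 has the smaller azimuth

# Outputs arranged in ascending-azimuth order match the fixed LBT assignment.
estimates = [targets[1], targets[0]]
print(lbt_loss(estimates, targets, azimuths_deg))
print(pit_loss(estimates, targets))
```

In this sketch the permutation-invariant loss can never exceed the location-based loss for the same outputs, because it takes a minimum over all assignments. The point of a fixed, location-driven assignment is instead that each output is tied to a consistent region of space during training.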

research
10/20/2020

Speaker Separation Using Speaker Inventories and Estimated Speech

We propose speaker separation using speaker inventories and estimated sp...
research
01/30/2020

Continuous speech separation: dataset and analysis

This paper describes a dataset and protocols for evaluating continuous s...
research
12/10/2021

Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech

Many of the recent advances in speech separation are primarily aimed at ...
research
07/19/2017

Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training

Although great progresses have been made in automatic speech recognition...
research
10/04/2020

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speaker Separation

We propose multi-microphone complex spectral mapping, a simple way of ap...
research
10/12/2021

VarArray: Array-Geometry-Agnostic Continuous Speech Separation

Continuous speech separation using a microphone array was shown to be pr...
research
03/30/2022

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

Because the performance of speech separation is excellent for speech in ...
