Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition

08/28/2023
by Zhisheng Zheng, et al.

In recent years, speech-based self-supervised learning (SSL) has made significant progress in various tasks, including automatic speech recognition (ASR). An ASR model with decent performance can be obtained by fine-tuning an SSL model on a small fraction of labeled data. Reducing the demand for labeled data is always of great practical value. In this paper, we further extend the use of SSL to cut labeling costs through active learning. Three types of units at different granularities are derived from speech signals in an unsupervised way, and their effects are compared by applying a contrastive data selection method. The experimental results show that our proposed data selection framework can effectively improve the word error rate (WER) by more than 11% [...] maintaining the same WER, compared to random selection.
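The abstract does not spell out how the contrastive data selection is computed, so the following is only a minimal sketch of the general idea, under stated assumptions: each utterance has already been converted into a sequence of discrete unit IDs, two smoothed unigram models are estimated (one on target-domain units, one on a general pool), and utterances are ranked by the difference of their length-normalized log-likelihoods. The unigram models, the function names (`train_unigram`, `contrastive_score`, `contrastive_select`), and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def train_unigram(sequences, vocab, alpha=1.0):
    """Estimate add-alpha smoothed unigram log-probabilities over unit IDs."""
    counts = Counter(unit for seq in sequences for unit in seq)
    total = sum(counts.values()) + alpha * len(vocab)
    return {unit: math.log((counts[unit] + alpha) / total) for unit in vocab}


def avg_logprob(seq, lm):
    """Length-normalized log-likelihood of one unit sequence under a model."""
    if not seq:
        raise ValueError("empty unit sequence")
    return sum(lm[unit] for unit in seq) / len(seq)


def contrastive_score(seq, target_lm, general_lm):
    """Higher means the utterance looks more like the target domain."""
    return avg_logprob(seq, target_lm) - avg_logprob(seq, general_lm)


def contrastive_select(pool, target_lm, general_lm, k):
    """Return the IDs of the k highest-scoring utterances in the pool."""
    ranked = sorted(
        pool.items(),
        key=lambda item: contrastive_score(item[1], target_lm, general_lm),
        reverse=True,
    )
    return [utt_id for utt_id, _ in ranked[:k]]


# Toy data (assumed): unit IDs 1-4 stand in for unsupervised speech units.
vocab = {1, 2, 3, 4}
target_lm = train_unigram([[1, 1, 2], [1, 2, 1]], vocab)
general_lm = train_unigram([[3, 4, 3], [4, 3, 4], [1, 3, 4]], vocab)
pool = {"utt_a": [1, 1, 2], "utt_b": [3, 4, 4], "utt_c": [1, 3, 2]}

selected = contrastive_select(pool, target_lm, general_lm, k=2)
```

In this sketch, the selected utterances would be the ones sent for transcription. A stronger sequence model could replace the unigram estimate without changing the ranking logic.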

research
03/18/2022

Towards Representative Subset Selection for Self-Supervised Speech Recognition

Self-supervised speech recognition models require considerable labeled t...
research
06/19/2020

Efficient Active Learning for Automatic Speech Recognition via Augmented Consistency Regularization

The cost of labeling transcriptions for large speech corpora becomes a b...
research
12/03/2022

Unsupervised Fine-Tuning Data Selection for ASR Using Self-Supervised Speech Models

Self-supervised learning (SSL) has been able to leverage unlabeled data ...
research
11/20/2009

Likelihood-based semi-supervised model selection with applications to speech processing

In conventional supervised pattern recognition tasks, model selection is...
research
05/24/2021

Unsupervised Speech Recognition

Despite rapid progress in the recent past, current speech recognition sy...
research
03/07/2019

Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models

The goal of this paper is to simulate the benefits of jointly applying a...
research
07/25/2022

Unsupervised data selection for Speech Recognition with contrastive loss ratios

This paper proposes an unsupervised data selection method by using a sub...
