Incremental Learning for End-to-End Automatic Speech Recognition

05/11/2020
by Li Fu, et al.

We propose an incremental learning method for end-to-end Automatic Speech Recognition (ASR) that extends the model's capacity to a new task while retaining performance on existing ones. The method works without access to the old dataset, which addresses both the high cost of retraining and the unavailability of old data. To achieve this, knowledge distillation is used as guidance to retain the recognition ability of the previous model, and the distillation objective is combined with the new ASR task for model optimization. With an ASR model pre-trained on 12,000h of Mandarin speech, we test the proposed method on a 300h new scenario task and a 1h new named entities task. Experiments show that our method yields 3.25 [...] on the new scenario when compared with the pre-trained model and the full-data retraining baseline, respectively. It even yields a surprising 0.37 CER reduction on the new scenario relative to fine-tuning. For the new named entities task, our method significantly improves accuracy compared with the pre-trained model, i.e. 16.95 [...] adaptions, the new models still maintain the same accuracy as the baseline on the old tasks.
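The abstract does not spell out the loss, so the following is only a minimal sketch of the general idea of combining a new-task objective with a distillation term from the previous model. It assumes a per-frame cross-entropy as the task loss and a KL divergence between teacher and student posteriors as the distillation term; the function names, the `kd_weight` interpolation, and the `temperature` parameter are all hypothetical and not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, target_index):
    """New-task loss: negative log-probability of the reference label."""
    return -math.log(softmax(student_logits)[target_index])


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student); zero when the two distributions match."""
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )


def incremental_loss(student_logits, teacher_logits, target_index,
                     kd_weight=0.5, temperature=1.0):
    """Assumed combination: (1 - w) * task loss + w * distillation loss.

    teacher_logits come from the frozen previous model evaluated on the
    NEW data, so no old-task data is needed to compute this term.
    """
    task = cross_entropy(student_logits, target_index)
    distill = kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(student_logits, temperature),
    )
    return (1.0 - kd_weight) * task + kd_weight * distill
```

In this sketch, `kd_weight=0.0` reduces to plain fine-tuning on the new task, while larger values pull the new model's outputs toward those of the previous model. An end-to-end ASR system would typically apply a sequence-level objective rather than a single-frame cross-entropy, so the per-frame form here is purely illustrative.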


