Low Resource German ASR with Untranscribed Data Spoken by Non-native Children – INTERSPEECH 2021 Shared Task SPAPL System

06/18/2021
by Jinhan Wang, et al.

This paper describes the SPAPL system for the INTERSPEECH 2021 Challenge: Shared Task on Automatic Speech Recognition for Non-Native Children's Speech in German. 5 hours of transcribed data and 60 hours of untranscribed data are provided to develop a German ASR system for children. For training on the transcribed data, we propose a non-speech state discriminative loss (NSDL) to mitigate the influence of long-duration non-speech segments within speech utterances. To exploit the untranscribed data, several approaches are implemented and combined to incrementally improve system performance. First, bidirectional autoregressive predictive coding (Bi-APC) is used to learn initial parameters for acoustic modelling from the provided untranscribed data. Second, incremental semi-supervised learning is used to iteratively generate pseudo-transcribed data. Third, different data augmentation schemes are applied at different training stages to increase the variability and size of the training data. Finally, a recurrent neural network language model (RNNLM) is used for rescoring. Our system achieves a word error rate (WER) of 39.68%, an improvement over the official baseline (45.21%).
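For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed (word-level edit distance divided by the number of reference words), together with the relative reduction implied by the two figures quoted in the abstract. It is not the challenge's scoring tool, which the abstract does not name. The function names and the German example sentence are hypothetical, and the two figures are assumed to be percentages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions are needed to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction over the baseline, in percent."""
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Hypothetical example: two of four reference words are misrecognised.
example_wer = word_error_rate("ich gehe nach hause", "ich geh nach haus")

# Figures quoted in the abstract, assumed to be percentages.
reduction = relative_reduction(45.21, 39.68)
```

Under that assumption, the absolute gap of 5.53 points between baseline and system corresponds to a relative reduction of roughly 12%.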


