An Analysis of Personalized Speech Recognition System Development for the Deaf and Hard-of-Hearing

06/24/2023
by Lester Phillip Violeta, et al.

Deaf or hard-of-hearing (DHH) speakers commonly have atypical speech as a result of their deafness. With growing support for speech-based devices and software applications, more work is needed to make these devices inclusive of everyone. To that end, we analyze the use of openly available automatic speech recognition (ASR) tools with a DHH Japanese speaker dataset. Because these out-of-the-box ASR models typically do not perform well on DHH speech, we provide a thorough analysis of creating personalized ASR systems. We collected a large DHH speaker dataset of four speakers totaling around 28.05 hours and thoroughly analyzed the performance of different training frameworks by varying the training data size. Our findings show that 1000 utterances (or 1-2 hours) from a target speaker can already significantly improve model performance with a minimal amount of work. We therefore recommend that researchers collect at least 1000 utterances to build an efficient personalized ASR system. In cases where 1000 utterances are difficult to collect, we also find significant improvements from previously proposed data augmentation techniques such as intermediate fine-tuning when only 200 utterances are available.
