Automatic Speech Recognition (ASR) models need to be optimized for speci...
Large language models have proven themselves highly flexible, able to so...
This paper presents a method for selecting appropriate synthetic speech ...
State space models (SSMs) have recently shown promising results on small...
Wake word detection exists in most intelligent homes and portable device...
There is growing interest in unifying the streaming and full-context aut...
Cross-device federated learning (FL) protects user privacy by collaborat...
From wearables to powerful smart devices, modern automatic speech recogn...
This paper improves the streaming transformer transducer for speech reco...
Automatic speech recognition (ASR) has become increasingly ubiquitous on...
Often, the storage and computational constraints of embedded devices dema...
As speech-enabled devices such as smartphones and smart speakers become ...
How to leverage dynamic contextual information in end-to-end speech reco...
Recurrent transducer models have emerged as a promising solution for spe...
Knowledge Distillation is an effective method of transferring knowledge ...
There is a growing interest in the speech community in developing Recurr...
Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech...
The demand for fast and accurate incremental speech recognition increase...
Thus far, end-to-end (E2E) models have not been shown to outperform stat...
While most deployed speech recognition systems today still run on server...
End-to-end (E2E) models, which directly predict output character sequenc...