Training state-of-the-art ASR systems such as RNN-T often has a high
ass...
Knowledge distillation is a technique in which the outputs of a pretrained
...
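To make the idea concrete, the following is a minimal sketch of a standard distillation objective, assuming a PyTorch-style classification setup; the function name `distillation_loss` and the hyperparameters `temperature` and `alpha` are illustrative choices, not the specific formulation used in this work. The student is trained to match the teacher's softened output distribution in addition to fitting the ground-truth labels.

```python
# Minimal knowledge-distillation loss sketch (illustrative, not the paper's exact recipe).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Combine a soft-label (teacher-matching) loss with the usual hard-label loss."""
    # Soften both distributions with a temperature before comparing them.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between student and teacher, scaled by T^2 as is conventional.
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    # Standard cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce
```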
Data subset selection from a large number of training instances has been...
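As a schematic illustration of data subset selection (the helper `select_subset` and the scoring rule are assumptions for exposition, not the selection strategy studied here), the sketch below scores each training instance and retains only the top-scoring fraction as the training subset.

```python
# Generic top-k subset selection sketch (illustrative only).
import numpy as np

def select_subset(scores: np.ndarray, fraction: float = 0.3) -> np.ndarray:
    """Return indices of the top-`fraction` instances ranked by score."""
    k = max(1, int(round(fraction * len(scores))))
    # argsort ascending, keep the last k indices (highest scores).
    return np.argsort(scores)[-k:]

# Example usage: scores might be per-example losses, gradient norms, or
# submodular gains, depending on the selection criterion.
rng = np.random.default_rng(0)
subset_idx = select_subset(rng.random(10_000), fraction=0.3)
```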
Large-scale machine learning and deep models are extremely data-hungry.
...