Fast and Accurate OOV Decoder on High-Level Features

07/19/2017
by Yuri Khokhlov, et al.

This work proposes a novel approach to the out-of-vocabulary (OOV) keyword search (KWS) task. The approach decodes on high-level features from an automatic speech recognition (ASR) system, so-called phoneme posterior based (PPB) features. These features are obtained by calculating time-dependent phoneme posterior probabilities from word lattices and then smoothing them. For the PPB features we developed a novel OOV decoder that is very fast, simple and efficient. Experimental results are presented on the Georgian language from the IARPA Babel Program, which was the test language in the OpenKWS 2016 evaluation campaign. The results show that, for single ASR systems, the proposed approach significantly outperforms the state-of-the-art approach based on in-vocabulary proxies for OOV keywords in the indexed database, both in the maximum term weighted value (MTWV) metric and in computational speed. A comparison of the two OOV KWS approaches on the fusion results of nine different ASR systems demonstrates that the proposed OOV decoder outperforms the proxy-based approach in MTWV at comparable processing speed. Other important advantages of the OOV decoder include extremely low memory consumption and the simplicity of its implementation and parameter optimization.
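
To make the PPB feature idea concrete, the following is a minimal sketch of how time-dependent phoneme posteriors might be accumulated from lattice arcs and then smoothed. It is not the authors' implementation. The abstract does not specify the lattice representation or the smoothing method, so the sketch assumes the word lattice has already been expanded into phoneme-level arcs of the form (phoneme, start frame, exclusive end frame, arc posterior), and it uses a simple moving average followed by flooring and renormalisation as a stand-in for the smoothing step. The names `ppb_features`, `smooth_radius` and `floor` are hypothetical.

```python
def ppb_features(arcs, num_frames, phonemes, smooth_radius=1, floor=1e-6):
    """Sketch of phoneme-posterior features per frame.

    arcs: iterable of (phoneme, start_frame, end_frame_exclusive, posterior).
    This arc format is an assumption made for illustration only.
    """
    index = {p: i for i, p in enumerate(phonemes)}
    n = len(phonemes)

    # Accumulate arc posteriors on every frame that each arc spans.
    raw = [[0.0] * n for _ in range(num_frames)]
    for phoneme, start, end, post in arcs:
        for t in range(max(start, 0), min(end, num_frames)):
            raw[t][index[phoneme]] += post

    features = []
    for t in range(num_frames):
        # Moving average over time (an assumed smoothing choice).
        lo = max(0, t - smooth_radius)
        hi = min(num_frames, t + smooth_radius + 1)
        row = [sum(raw[u][k] for u in range(lo, hi)) / (hi - lo) for k in range(n)]
        # Floor and renormalise so each frame is a proper distribution.
        row = [max(v, floor) for v in row]
        total = sum(row)
        features.append([v / total for v in row])
    return features


# Toy example: two phonemes, four frames, one arc per phoneme.
toy_arcs = [("a", 0, 2, 1.0), ("b", 2, 4, 1.0)]
feats = ppb_features(toy_arcs, num_frames=4, phonemes=["a", "b"])
```

In this toy case the smoothing spreads probability mass across the boundary between the two arcs, so a decoder matching a phoneme sequence against the features is less sensitive to exact arc timing.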

