Efficient Segmental Cascades for Speech Recognition

08/02/2016
by   Hao Tang, et al.
0

Discriminative segmental models offer a way to incorporate flexible feature functions into speech recognition. However, their appeal has been limited by their computational requirements, due to the large number of possible segments to consider. Multi-pass cascades of segmental models introduce features of increasing complexity in different passes, where in each pass a segmental model rescores lattices produced by a previous (simpler) segmental model. In this paper, we explore several ways of making segmental cascades efficient and practical: reducing the feature set in the first pass, frame subsampling, and various pruning approaches. In experiments on phonetic recognition, we find that with a combination of such techniques, it is possible to maintain competitive performance while greatly reducing decoding, pruning, and training time.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/16/2021

An Asynchronous WFST-Based Decoder For Automatic Speech Recognition

We introduce asynchronous dynamic decoder, which adopts an efficient A* ...
research
07/22/2015

Discriminative Segmental Cascades for Feature-Rich Phone Recognition

Discriminative segmental models, such as segmental conditional random fi...
research
11/02/2022

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

This work studies the use of attention masking in transformer transducer...
research
08/12/2014

First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs

We present a method to perform first-pass large vocabulary continuous sp...
research
10/10/2021

Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition

Hybrid and end-to-end (E2E) systems have their individual advantages, wi...
research
10/19/2017

Combining Multiple Views for Visual Speech Recognition

Visual speech recognition is a challenging research problem with a parti...
research
09/05/2017

Sequence Prediction with Neural Segmental Models

Segments that span contiguous parts of inputs, such as phonemes in speec...

Please sign up or login with your details

Forgot password? Click here to reset