User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems

08/02/2021
by Hoang Long Nguyen, et al.

Recognition errors are common in human communication. Similar errors often lead to unwanted behaviour in dialogue systems or virtual assistants. In human communication, we can recover from them by repeating misrecognized words or phrases; however, in human-machine communication this recovery mechanism is not available. In this paper, we attempt to bridge this gap and present a system that allows a user to correct speech recognition errors in a virtual assistant by repeating misunderstood words. When a user repeats part of the phrase, the system rewrites the original query to incorporate the correction. This rewrite allows the virtual assistant to understand the original query successfully. We present an end-to-end 2-step attention pointer network that generates the rewritten query by merging the incorrectly understood utterance with the correction follow-up. We evaluate the model on data collected for this task and compare the proposed model to a rule-based baseline and a standard pointer network. We show that rewriting the original query is an effective way to handle repetition-based recovery and that the proposed model outperforms the rule-based baseline, reducing Word Error Rate by 19% on annotated data.
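To make the idea of query rewriting concrete, the following is a minimal sketch, not the paper's pointer network and not its rule-based baseline. It assumes a naive heuristic: slide a window the length of the correction over the original query, pick the window that is most similar at the character level, and substitute the correction there. The function names `rewrite_query` and `word_error_rate` are hypothetical and chosen for illustration; the Word Error Rate function uses the standard word-level edit distance divided by the reference length.

```python
from difflib import SequenceMatcher


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def rewrite_query(original: str, correction: str) -> str:
    """Replace the span of `original` that looks most like `correction`.

    Assumption for illustration only: the misrecognized span has the same
    number of words as the repeated correction.
    """
    words, fix = original.split(), correction.split()
    n = len(fix)
    if n == 0 or n > len(words):
        return original
    target = " ".join(fix)
    best_start, best_score = 0, -1.0
    for start in range(len(words) - n + 1):
        window = " ".join(words[start:start + n])
        score = SequenceMatcher(None, window, target).ratio()
        if score > best_score:
            best_start, best_score = start, score
    return " ".join(words[:best_start] + fix + words[best_start + n:])
```

Under these assumptions, a misrecognized query such as "call jon smith" followed by the repetition "john" would be rewritten so that the closest-matching word is replaced. A heuristic like this has no notion of when a follow-up is not a correction at all, which is one reason a learned model is attractive for the task.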


research
03/27/2019

Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition

Connectionist Temporal Classification (CTC) based end-to-end speech reco...
research
06/06/2017

A Frame Tracking Model for Memory-Enhanced Dialogue Systems

Recently, resources and tasks were proposed to go beyond state tracking ...
research
03/22/2022

Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue

Context modeling plays a significant role in building multi-turn dialogu...
research
10/28/2022

Improving short-video speech recognition using random utterance concatenation

One of the limitations in end-to-end automatic speech recognition framew...
research
02/12/2021

Hybrid phonetic-neural model for correction in speech recognition systems

Automatic speech recognition (ASR) is a relevant area in multiple settin...
research
02/22/2023

Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation

We previously proposed contextual spelling correction (CSC) to correct t...
research
03/29/2022

Seq-2-Seq based Refinement of ASR Output for Spoken Name Capture

Person name capture from human speech is a difficult task in human-machi...
