Improving the Robustness of Speech Translation

11/02/2018
by Xiang Li, et al.

Although neural machine translation (NMT) has made impressive progress recently, it is usually trained on clean parallel data and therefore performs poorly when the input sentence is the output of an automatic speech recognition (ASR) system, which contains numerous errors. To solve this problem, we propose a simple but effective method to improve the robustness of NMT for speech translation. We simulate the noise found in realistic ASR output and inject it into the clean parallel data, so that NMT sees similar word distributions during training and testing. In addition, we incorporate the Chinese Pinyin feature, which is easy to obtain in speech translation, to further improve translation performance. Experimental results show that our method is more stable and outperforms the baseline by an average of 3.12 BLEU on multiple noisy test sets, while also achieving a generalization improvement on the WMT'17 Chinese-English test set.
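The noise-injection idea can be illustrated with a minimal sketch. The abstract does not specify which noise types or rates are used, so everything below is an assumption made for illustration: the three operations (substitution, deletion, insertion), the probabilities, the function names `inject_noise` and `augment`, and the small English `CONFUSIONS` table, which stands in for the homophone-like confusions an ASR system might produce on the Chinese source side.

```python
import random

# Hypothetical confusion table (not from the paper): each word maps to
# words an ASR system might plausibly output in its place.
CONFUSIONS = {
    "their": ["there"],
    "two": ["to", "too"],
    "weather": ["whether"],
}


def inject_noise(tokens, rng, p_sub=0.1, p_del=0.05, p_ins=0.05):
    """Return a noisy copy of `tokens` with ASR-like errors.

    One random draw per token selects at most one operation:
    deletion, substitution, or insertion (here: repeating the token).
    """
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < p_del:
            continue  # deletion: drop the token
        if r < p_del + p_sub:
            # substitution: only if a confusion is known, else keep the token
            noisy.append(rng.choice(CONFUSIONS.get(tok, [tok])))
            continue
        noisy.append(tok)
        if r < p_del + p_sub + p_ins:
            noisy.append(tok)  # insertion: duplicate the token
    return noisy


def augment(pairs, seed=0, **noise_kwargs):
    """Keep each clean (source, target) pair and add a noisy-source copy.

    The target side is left untouched, so the model learns to map
    noisy input to the clean translation.
    """
    rng = random.Random(seed)
    augmented = []
    for src, tgt in pairs:
        augmented.append((src, tgt))
        noisy_src = " ".join(inject_noise(src.split(), rng, **noise_kwargs))
        augmented.append((noisy_src, tgt))
    return augmented
```

Calling `augment([("their two cats", "TARGET")])` returns the clean pair followed by a second pair whose source side may contain simulated errors and whose target side is unchanged. Mixing clean and noisy copies in this way is one design choice among several; the abstract does not say how the two are combined.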


