Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation through Phrase Pair Variables

07/24/2023
by Ali Araabi, et al.

Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs remains subpar, partly because of its limited ability to handle previously unseen inputs, i.e., to generalize. In this paper, we propose Joint Dropout, a method that addresses the challenge of low-resource neural machine translation by substituting phrases with variables, which significantly enhances compositionality, a key aspect of generalization. We observe a substantial improvement in translation quality for language pairs with minimal resources, as measured by BLEU and Direct Assessment scores. Furthermore, we conduct an error analysis and find that Joint Dropout also enhances the generalizability of low-resource NMT in terms of robustness and adaptability across different domains.
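
To make the idea of phrase pair variables concrete, the following is a minimal sketch of substituting aligned phrases on the source and target side with a shared variable token. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how phrases are selected or aligned, so the phrase pairs are taken as given, and the function names, the half-open span format, and the `X1`, `X2`, ... token naming are all hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token span [start, end); an assumed format


def _replace_spans(tokens: List[str], labelled: List[Tuple[Span, str]]) -> List[str]:
    """Replace each labelled span in `tokens` with its single label token."""
    out: List[str] = []
    pos = 0
    for (start, end), label in sorted(labelled):
        if start < pos or end > len(tokens) or start >= end:
            raise ValueError("spans must be non-empty, in range, and non-overlapping")
        out.extend(tokens[pos:start])  # copy untouched tokens before the span
        out.append(label)              # the whole phrase becomes one variable
        pos = end
    out.extend(tokens[pos:])
    return out


def substitute_phrase_pairs(
    src: List[str],
    tgt: List[str],
    phrase_pairs: List[Tuple[Span, Span]],
) -> Tuple[List[str], List[str]]:
    """Replace the i-th aligned (source span, target span) pair with the same
    variable token on both sides. Variable naming is a hypothetical choice."""
    src_spans = [(s, f"X{i}") for i, (s, _) in enumerate(phrase_pairs, 1)]
    tgt_spans = [(t, f"X{i}") for i, (_, t) in enumerate(phrase_pairs, 1)]
    return _replace_spans(src, src_spans), _replace_spans(tgt, tgt_spans)


# Toy sentence pair with one assumed alignment: "kleine Haus" <-> "small house".
src_out, tgt_out = substitute_phrase_pairs(
    "das kleine Haus ist alt".split(),
    "the small house is old".split(),
    [((1, 3), (1, 3))],
)
print(" ".join(src_out))
print(" ".join(tgt_out))
```

Because both sides of a pair receive the same variable, the correspondence between the substituted phrases is preserved even when the target-side order differs from the source-side order.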

research
04/08/2016

Transfer Learning for Low-Resource Neural Machine Translation

The encoder-decoder framework for neural machine translation (NMT) has b...
research
06/29/2021

Neural Machine Translation for Low-Resource Languages: A Survey

Neural Machine Translation (NMT) has seen a tremendous spurt of growth i...
research
10/05/2021

Sicilian Translator: A Recipe for Low-Resource NMT

With 17,000 pairs of Sicilian-English translated sentences, Arba Sicula ...
research
05/02/2017

A Teacher-Student Framework for Zero-Resource Neural Machine Translation

While end-to-end neural machine translation (NMT) has made remarkable pr...
research
11/12/2021

BitextEdit: Automatic Bitext Editing for Improved Low-Resource Machine Translation

Mined bitexts can contain imperfect translations that yield unreliable t...
research
11/02/2018

Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation

We aim to better exploit the limited amounts of parallel text available ...
research
02/28/2022

LCP-dropout: Compression-based Multiple Subword Segmentation for Neural Machine Translation

In this study, we propose a simple and effective preprocessing method fo...
