On the Copying Behaviors of Pre-Training for Neural Machine Translation

07/17/2021
by Xuebo Liu, et al.

Previous studies have shown that initializing neural machine translation (NMT) models with pre-trained language models (LMs) can speed up model training and boost translation performance. In this work, we identify a critical side effect of pre-training for NMT, which arises from the discrepancy between the training objectives of LM-based pre-training and NMT. Since the LM objective learns to reconstruct only a few source tokens and to copy most of them, the pre-training initialization affects the copying behaviors of NMT models. We provide a quantitative analysis of copying behaviors by introducing a metric called the copying ratio, and empirically show that pre-training-based NMT models have a larger copying ratio than standard ones. In response to this problem, we propose a simple and effective method named the copying penalty to control copying behaviors during decoding. Extensive experiments on both in-domain and out-of-domain benchmarks show that the copying penalty consistently improves translation performance by controlling the copying behaviors of pre-training-based NMT models. Source code is freely available at https://github.com/SunbowLiu/CopyingPenalty.
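The abstract does not spell out the definitions, but both ideas lend themselves to a short sketch. Below is a minimal, hypothetical Python sketch: copying_ratio counts the share of target tokens that can be matched to source tokens, and copying_penalty plus rescore_beam apply a GNMT-style length-penalty form to the copied-token count when re-ranking beam hypotheses. The function names, the token-matching scheme, and the penalty form ((5 + n) / 6) ** beta are assumptions for illustration; the paper's exact formulas may differ, so consult the linked repository for the authors' implementation.

```python
import math
from collections import Counter


def copying_ratio(src_tokens, tgt_tokens):
    """Share of target tokens that also occur on the source side.

    Hypothetical reconstruction of the paper's metric: a target token
    counts as "copied" if it can be matched to a not-yet-used source
    token, so repeated tokens are not over-credited.
    """
    if not tgt_tokens:
        return 0.0
    remaining = Counter(src_tokens)
    copied = 0
    for tok in tgt_tokens:
        if remaining[tok] > 0:
            copied += 1
            remaining[tok] -= 1
    return copied / len(tgt_tokens)


def copying_penalty(src_tokens, hyp_tokens, beta=0.6):
    """GNMT length-penalty-style term over the copied-token count
    (an assumed form; the paper's exact formula may differ)."""
    n_copied = round(copying_ratio(src_tokens, hyp_tokens) * len(hyp_tokens))
    return ((5.0 + n_copied) / 6.0) ** beta


def rescore_beam(src_tokens, beam, beta=0.6):
    """Re-rank (log_prob, tokens) hypotheses by subtracting the log
    copying penalty: beta > 0 demotes copy-heavy hypotheses."""
    return sorted(
        ((lp - math.log(copying_penalty(src_tokens, toks, beta)), toks)
         for lp, toks in beam),
        key=lambda pair: pair[0],
        reverse=True,
    )


# Toy example: the pure-copy hypothesis is ranked down with beta > 0.
src = "the quick brown fox".split()
beam = [(-1.2, "the quick brown fox".split()),    # degenerate copy
        (-1.5, "le renard brun rapide".split())]  # actual translation
print(rescore_beam(src, beam, beta=2.0))
```

In this toy run, the copy hypothesis scores -1.2 - log(1.5 ** 2) and drops below the translation at -1.5 - log((5/6) ** 2); a negative beta would instead reward copying, which is plausible when more copying is desirable (e.g., for named entities).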


Related research

- On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (10/05/2021): Pre-training (PT) and back-translation (BT) are two simple and powerful ...
- Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation (02/28/2020): Back-translation provides a simple yet effective approach to exploit mon...
- YANMTT: Yet Another Neural Machine Translation Toolkit (08/25/2021): In this paper we present our open-source neural machine translation (NMT...
- On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation (09/07/2022): Pre-Training (PT) of text representations has been successfully applied ...
- Synthetic Pre-Training Tasks for Neural Machine Translation (12/19/2022): Pre-training is an effective technique for ensuring robust performance o...
- Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks (10/24/2022): Memorization presents a challenge for several constrained Natural Langua...
- Towards Accurate Translation via Semantically Appropriate Application of Lexical Constraints (06/21/2023): Lexically-constrained NMT (LNMT) aims to incorporate user-provided termi...
