The LIG system for the English-Czech Text Translation Task of IWSLT 2019

by Loïc Vial, et al.

In this paper, we present our submission to the English-to-Czech Text Translation Task of IWSLT 2019. Our system aims to study how pre-trained language models, used as input embeddings, can improve a specialized machine translation system trained on little data. To this end, we implemented a Transformer-based encoder-decoder neural system that can use the output of a pre-trained language model as input embeddings, and we compared its performance under three configurations: 1) without any pre-trained language model (constrained), 2) using a language model trained on the monolingual parts of the allowed English-Czech data (constrained), and 3) using a language model trained on a large quantity of external monolingual data (unconstrained). We used BERT as the external pre-trained language model (configuration 3), and the BERT architecture to train our own language model (configuration 2). Regarding the training data, we trained our MT system on a small quantity of parallel text: one set consists only of the provided MuST-C corpus, and the other set consists of the MuST-C corpus and the News Commentary corpus from WMT. We observed that using the external pre-trained BERT improves the scores of our system by 0.8 to 1.5 BLEU points on our development set, and by 0.97 to 1.94 BLEU points on the test set. However, using our own language model trained only on the allowed parallel data seems to improve machine translation performance only when the system is trained on the smallest dataset.
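The experimental design described above, three embedding configurations crossed with two training sets, can be laid out as a small grid. The following is only an illustrative sketch: the identifiers (`EMBEDDINGS`, `TRAINING_SETS`, `Run`, `experiment_grid`) and the label strings are hypothetical names chosen for this example, not taken from the authors' code, and it assumes that every configuration was run on both training sets.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical labels for the three embedding configurations in the abstract,
# each mapped to the track (constrained / unconstrained) stated there.
EMBEDDINGS = {
    "none": "constrained",             # 1) no pre-trained language model
    "in_domain_lm": "constrained",     # 2) BERT-architecture LM trained on allowed data
    "external_bert": "unconstrained",  # 3) BERT trained on external monolingual data
}

# Hypothetical labels for the two parallel training sets in the abstract.
TRAINING_SETS = {
    "must_c": ("MuST-C",),
    "must_c_plus_nc": ("MuST-C", "News Commentary"),
}


@dataclass(frozen=True)
class Run:
    embeddings: str     # key of EMBEDDINGS
    training_set: str   # key of TRAINING_SETS
    corpora: tuple      # corpora included in the training set
    track: str          # "constrained" or "unconstrained"


def experiment_grid():
    """Enumerate every (embedding configuration, training set) pair."""
    return [
        Run(emb, ts, TRAINING_SETS[ts], EMBEDDINGS[emb])
        for emb, ts in product(EMBEDDINGS, TRAINING_SETS)
    ]


if __name__ == "__main__":
    for run in experiment_grid():
        print(run)
```

Under these assumptions the grid contains six runs, of which only the two that use the external BERT fall in the unconstrained track.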



