Vega-MT: The JD Explore Academy Translation System for WMT22

09/20/2022
by Changtong Zan, et al.

We describe the JD Explore Academy's submission to the WMT 2022 shared general translation task. We participated in all high-resource tracks and one medium-resource track, covering Chinese-English, German-English, Czech-English, Russian-English, and Japanese-English. We push the limits of our previous work, bidirectional training for translation, by scaling up two main factors, i.e., the number of language pairs and the model size, yielding the Vega-MT system. As for language pairs, we scale the "bidirectional" setting up to a "multidirectional" one that covers all participating languages, to exploit knowledge shared across languages and transfer it to the downstream bilingual tasks. As for model size, we scale Transformer-Big up to an extremely large model with nearly 4.7 billion parameters, to fully enhance the capacity of Vega-MT. We also adopt data augmentation strategies, e.g., cycle translation for monolingual data and bidirectional self-training for bilingual and monolingual data, to comprehensively exploit both kinds of data. To adapt Vega-MT to the general-domain test set, we design a generalization tuning stage.

Based on the official automatic scores of constrained systems, in terms of sacreBLEU (see Figure 1 of the paper) we placed 1st on Zh-En (33.5), En-Zh (49.7), De-En (33.7), En-De (37.8), Cs-En (54.9), En-Cs (41.4), and En-Ru (32.7); 2nd on Ru-En (45.1) and Ja-En (25.6); and 3rd on En-Ja (41.5). In terms of COMET, we placed 1st on Zh-En (45.1), En-Zh (61.7), De-En (58.0), En-De (63.2), Cs-En (74.7), Ru-En (64.9), En-Ru (69.6), and En-Ja (65.1); and 2nd on En-Cs (95.3) and Ja-En (40.6). Models will be released to facilitate the MT community via GitHub and the OmniForce Platform.
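As a rough illustration of the "bidirectional to multidirectional" scaling described above, the sketch below shows the common target-language-tag trick for training one model on many translation directions: each bilingual corpus is expanded into both directions, and the expanded corpora of all participating pairs are pooled. This is a minimal sketch, not the paper's actual pipeline; the "<2xx>" tag format and all function names are assumptions for illustration.

from typing import Iterable, List, Tuple

def bidirectional_pairs(
    src_lang: str, tgt_lang: str, bitext: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Expand one bilingual corpus into both translation directions,
    prefixing each source sentence with a target-language tag so a
    single model can learn every direction at once."""
    pairs = []
    for src, tgt in bitext:
        pairs.append((f"<2{tgt_lang}> {src}", tgt))  # forward direction
        pairs.append((f"<2{src_lang}> {tgt}", src))  # reverse direction
    return pairs

# Pooling the expanded corpora of all WMT22 pairs (Zh-En, De-En, Cs-En,
# Ru-En, Ja-En) yields a "multidirectional" training set. Toy example:
corpora = {
    ("de", "en"): [("Guten Morgen.", "Good morning.")],
    ("zh", "en"): [("早上好。", "Good morning.")],
}
multidirectional = [
    pair
    for (s, t), bitext in corpora.items()
    for pair in bidirectional_pairs(s, t, bitext)
]
for src, tgt in multidirectional:
    print(src, "=>", tgt)

Per the abstract, the model pre-trained on such a pooled multidirectional set is then transferred to each downstream bilingual task, i.e., fine-tuned per translation direction.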


