Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German

10/30/2017
by Pierre-Edouard Honnet, et al.

The goal of this work is to design a machine translation (MT) system for a low-resource family of dialects, collectively known as Swiss German. We list the parallel resources that we collected and present three strategies for normalizing Swiss German input in order to address its regional and spelling diversity. We show that character-based neural MT is the best solution for text normalization and that, in combination with phrase-based statistical MT, we reach a BLEU score of 36%. This score decreases as the test dialect becomes more remote from the training one.
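As a rough illustration of what "character-based" means for the normalization step, the sketch below splits a sentence into single-character tokens, the kind of input a character-level MT model typically consumes, and joins them back into text. It is a minimal sketch and not the authors' pipeline: the boundary symbol and the function names `to_char_tokens` and `from_char_tokens` are assumptions made for this example.

```python
import unicodedata

# Hypothetical word-boundary symbol; any character absent from the data works.
BOUNDARY = "▁"


def to_char_tokens(sentence: str) -> list[str]:
    """Split a sentence into character tokens, marking word boundaries."""
    # NFC keeps letters such as "ü" as one code point, hence one token.
    text = unicodedata.normalize("NFC", sentence)
    # split() collapses runs of whitespace, so each gap yields one BOUNDARY.
    return list(BOUNDARY.join(text.split()))


def from_char_tokens(tokens: list[str]) -> str:
    """Invert to_char_tokens: join characters and restore spaces."""
    return "".join(tokens).replace(BOUNDARY, " ")


if __name__ == "__main__":
    tokens = to_char_tokens("es isch guet")
    print(tokens)
    print(from_char_tokens(tokens))
```

Working at the character level lets a model relate spelling variants of the same word that share most of their characters, which is presumably why it suits dialect text without a standard orthography.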


