How multilingual is Multilingual BERT?

06/04/2019
by Telmo Pires, et al.

In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. To understand why, we present a large number of probing experiments, showing that transfer is possible even to languages in different scripts, that transfer works best between typologically similar languages, that monolingual corpora can train models for code-switching, and that the model can find translation pairs. From these results, we can conclude that M-BERT does create multilingual representations, but that these representations exhibit systematic deficiencies affecting certain language pairs.
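To make the transfer setting concrete, the sketch below illustrates the zero-shot protocol the abstract describes: fine-tune M-BERT on labelled data in one language, then evaluate it directly in another language with no labelled data. This is a minimal sketch only, assuming the Hugging Face transformers library, the bert-base-multilingual-cased checkpoint, and a toy two-example sentiment task; the paper's actual probing experiments use standard NER and POS tagging benchmarks rather than this hypothetical data.

```python
# Minimal sketch of zero-shot cross-lingual transfer with M-BERT.
# Assumptions: Hugging Face `transformers`, the public
# bert-base-multilingual-cased checkpoint, and toy sentiment data.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)

# 1) Fine-tune on task annotations in ONE language (English here).
english_texts = ["the movie was great", "the movie was terrible"]  # toy data
english_labels = torch.tensor([1, 0])
batch = tokenizer(english_texts, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
loss = model(**batch, labels=english_labels).loss
loss.backward()
optimizer.step()

# 2) Evaluate zero-shot in ANOTHER language (Spanish here) without any
#    Spanish training data: the shared multilingual representation is
#    what carries the task across languages.
model.eval()
spanish_texts = ["la película fue estupenda", "la película fue horrible"]
spanish_batch = tokenizer(spanish_texts, padding=True, return_tensors="pt")
with torch.no_grad():
    predictions = model(**spanish_batch).logits.argmax(dim=-1)
print(predictions)  # labels predicted for the unseen language
```

In the paper's terms, step 1 corresponds to fine-tuning on task-specific annotations in one language and step 2 to evaluation in another; the surprising finding is that this works even across scripts, though more reliably between typologically similar languages.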


Related research

Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking (10/19/2020)
Cross-language entity linking grounds mentions in multiple languages to ...

Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM (03/03/2023)
The NLP community recently saw the release of a new large open-access mu...

MuRIL: Multilingual Representations for Indian Languages (03/19/2021)
India is a multilingual society with 1369 rationalized languages and dia...

To What Degree Can Language Borders Be Blurred In BERT-based Multilingual Spoken Language Understanding? (11/10/2020)
This paper addresses the question as to what degree a BERT-based multili...

Probing Multilingual Language Models for Discourse (06/09/2021)
Pre-trained multilingual language models have become an important buildi...

BERT Can See Out of the Box: On the Cross-modal Transferability of Text Representations (02/25/2020)
Pre-trained language models such as BERT have recently contributed to si...

Adapting Monolingual Models: Data can be Scarce when Language Similarity is High (05/06/2021)
For many (minority) languages, the resources needed to train large model...
