Cross-neutralising: Probing for joint encoding of linguistic information in multilingual models

10/24/2020
by Rochelle Choenni, et al.

Multilingual sentence encoders are widely used to transfer NLP models across languages. The success of this transfer, however, depends on the model's ability to encode the patterns of cross-lingual similarity and variation. Yet little is known about how these models achieve this. We propose a simple method to study how relationships between languages are encoded in two state-of-the-art multilingual models (i.e., M-BERT and XLM-R). The results provide insight into their information-sharing mechanisms and suggest that linguistic properties are encoded jointly across typologically similar languages in these models.
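To make the probing setup concrete, below is a minimal sketch of one way to probe multilingual sentence representations for a linguistic property; it is not the authors' implementation. It assumes the HuggingFace checkpoints bert-base-multilingual-cased (M-BERT) and xlm-roberta-base (XLM-R), mean-pooled last-layer embeddings, and a simple linear probe; the sentences and typological labels are placeholders for illustration only.

```python
# Minimal probing sketch (illustrative, not the paper's method):
# extract M-BERT sentence embeddings and fit a linear probe that
# predicts a typological property of the input language.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "bert-base-multilingual-cased"  # or "xlm-roberta-base" for XLM-R
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).eval()

def embed(sentences):
    """Mean-pooled last-layer representations, one vector per sentence."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state            # (batch, seq, dim)
    mask = enc["attention_mask"].unsqueeze(-1).float()     # ignore padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy data: each sentence is paired with a binary label for some typological
# property of its language (labels here are hypothetical placeholders).
sentences = ["The cat sleeps.", "Le chat dort.",
             "Die Katze schläft.", "Neko wa nemasu."]
labels = [1, 1, 0, 0]

probe = LogisticRegression(max_iter=1000).fit(embed(sentences), labels)
print(probe.predict(embed(["El gato duerme."])))  # query a held-out language
```

If the probe transfers to held-out languages that share the property, that is (weak) evidence the encoder represents the property in a way shared across typologically similar languages, which is the kind of question the paper investigates.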
