DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer

03/04/2023
by Shanu Kumar, et al.

Zero-shot cross-lingual transfer is promising; however, it has been shown to be sub-optimal, with inferior transfer performance for low-resource languages. In this work, we treat languages as domains to improve zero-shot transfer, jointly reducing the feature incongruity between the source and target languages and increasing the generalization capability of pre-trained multilingual transformers. We show that our approach, DiTTO, significantly outperforms the standard zero-shot fine-tuning method on multiple datasets across all languages, using solely unlabeled instances in the target language. Empirical results show that jointly reducing feature incongruity for multiple target languages is vital for successful cross-lingual transfer. Moreover, our model enables better cross-lingual transfer than standard fine-tuning methods, even in the few-shot setting.
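
Only the abstract is available here, but the general idea it describes (training a multilingual encoder on labeled source-language data while using unlabeled target-language instances to reduce the gap between source and target feature representations) can be sketched roughly as follows. This is a minimal illustrative sketch under stated assumptions, not the paper's actual DiTTO objective: the choice of xlm-roberta-base, the [CLS] mean-feature alignment term, the linear classifier, and the loss weight are all assumptions made for illustration.

```python
# Illustrative sketch only: supervised task loss on source-language data plus a
# feature-alignment loss that pulls unlabeled target-language representations
# toward source-language representations from a pre-trained multilingual encoder.
# The specific alignment term and hyperparameters below are assumptions, not the
# method described in the paper.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

encoder = AutoModel.from_pretrained("xlm-roberta-base")
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
classifier = torch.nn.Linear(encoder.config.hidden_size, 3)  # e.g. 3-way NLI labels

def cls_features(texts):
    """Return the [CLS] representation for a batch of raw texts."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]

def joint_loss(src_texts, src_labels, tgt_texts, align_weight=0.1):
    src_feats = cls_features(src_texts)
    tgt_feats = cls_features(tgt_texts)  # unlabeled target-language batch
    task_loss = F.cross_entropy(classifier(src_feats), src_labels)
    # Crude proxy for "feature incongruity": distance between batch mean features.
    align_loss = F.mse_loss(src_feats.mean(dim=0), tgt_feats.mean(dim=0))
    return task_loss + align_weight * align_loss

# Example usage with hypothetical data:
# loss = joint_loss(["A man is eating."], torch.tensor([0]), ["Ein Mann isst."])
# loss.backward()
```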


Related research

05/01/2020
From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers
Massively multilingual transformers pretrained with language modeling ob...

07/12/2022
Zero-shot Cross-lingual Transfer is Under-specified Optimization
Pretrained multilingual encoders enable zero-shot cross-lingual transfer...

09/12/2023
Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies
The cross-lingual transfer is a promising technique to solve tasks in le...

06/11/2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Multi-lingual contextualized embeddings, such as multilingual-BERT (mBER...

06/05/2023
Cross-Lingual Transfer with Target Language-Ready Task Adapters
Adapters have emerged as a modular and parameter-efficient approach to (...

09/02/2021
MultiEURLEX – A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
We introduce MULTI-EURLEX, a new multilingual dataset for topic classifi...

03/30/2023
Fine-Tuning BERT with Character-Level Noise for Zero-Shot Transfer to Dialects and Closely-Related Languages
In this work, we induce character-level noise in various forms when fine...
