Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya

06/13/2020
by Abrhalei Tela, et al.

In recent years, transformer models have achieved great success in natural language processing (NLP) tasks. Most current state-of-the-art NLP results are achieved with monolingual transformer models, where the model is pre-trained on an unlabelled text corpus in a single language and then fine-tuned on the specific downstream task. However, the cost of pre-training a new transformer model is high for most languages. In this work, we propose a cost-effective transfer learning method to adapt a strong source language model, trained on a large monolingual corpus, to a low-resource language. Using the XLNet language model, we demonstrate competitive performance with mBERT and a pre-trained target language model on the cross-lingual sentiment (CLS) dataset and on a new sentiment analysis dataset for the low-resource language Tigrinya. With only 10k examples of the given Tigrinya sentiment analysis dataset, English XLNet achieved 78.88, outperforming BERT and mBERT by 10 [...]. Fine-tuning the (English) XLNet model on the CLS dataset gives promising results compared to mBERT, and it even outperformed mBERT on one dataset of the Japanese language.
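The pre-train-then-adapt idea in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' method: instead of fine-tuning a full XLNet model, it keeps a stand-in "pretrained" encoder frozen and trains only a small classification head on a handful of labelled examples (feature-based transfer). The embedding table `PRETRAINED_VECTORS`, the functions `pretrained_encode` and `train_head`, and the four training sentences are all hypothetical and exist only for illustration.

```python
import math

# Hypothetical stand-in for weights learned during pre-training (assumption).
PRETRAINED_VECTORS = {
    "good": [1.0, 0.2, 0.0],
    "great": [0.9, 0.1, 0.0],
    "bad": [-1.0, 0.1, 0.0],
    "boring": [-0.8, 0.3, 0.0],
    "film": [0.0, 0.0, 1.0],
    "story": [0.0, 0.1, 0.9],
}
DIM = 3


def pretrained_encode(tokens):
    """Frozen encoder: average the pretrained vectors of the known tokens."""
    known = [PRETRAINED_VECTORS[t] for t in tokens if t in PRETRAINED_VECTORS]
    if not known:
        return [0.0] * DIM
    return [sum(v[i] for v in known) / len(known) for i in range(DIM)]


def predict_proba(weights, bias, features):
    """Logistic-regression head on top of the frozen features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=200, lr=0.5):
    """Train only the head with plain SGD; the encoder is never updated."""
    weights, bias = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for tokens, label in examples:
            feats = pretrained_encode(tokens)
            err = predict_proba(weights, bias, feats) - label
            weights = [w - lr * err * x for w, x in zip(weights, feats)]
            bias -= lr * err
    return weights, bias


# Tiny labelled set standing in for the downstream sentiment data (assumption).
train = [
    (["good", "film"], 1),
    (["great", "story"], 1),
    (["bad", "film"], 0),
    (["boring", "story"], 0),
]
weights, bias = train_head(train)
preds = [
    int(predict_proba(weights, bias, pretrained_encode(tokens)) >= 0.5)
    for tokens, _ in train
]
```

The point of the sketch is only the division of labour: the expensive representation is reused as-is, and the small labelled dataset is spent on the task-specific part. Full fine-tuning, as described in the abstract, would also update the encoder's parameters.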


