AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing

06/11/2023
by Asaad AlGhamdi, et al.
Developing monolingual large Pre-trained Language Models (PLMs) has proven very successful across a range of Natural Language Processing (NLP) tasks. In this work, we present AraMUS, the largest Arabic PLM, with 11B parameters trained on 529GB of high-quality Arabic textual data. AraMUS achieves state-of-the-art performance on a diverse set of Arabic classification and generative tasks. Moreover, AraMUS shows impressive few-shot learning abilities compared with the best existing Arabic PLMs.
