ANER: Arabic and Arabizi Named Entity Recognition using Transformer-Based Approach

Named Entity Recognition (NER) is one of the main tasks of Natural Language Processing (NLP). It is used in many applications and can also serve as an intermediate step for other tasks. We present ANER, a web-based named entity recognizer for Arabic and Arabizi. The model is built upon BERT, a transformer-based encoder. It can recognize 50 different entity classes covering various fields. We trained our model on the WikiFANE_Gold dataset, which consists of Wikipedia articles. We achieved an F1 score of 88.7%, which beats CAMeL Tools' F1 score of 83% on the ANERcorp dataset, which has only 4 classes. We also obtained an F1 score of 77.7% on the NewsFANE_Gold dataset, which contains out-of-domain data from news articles. The system is deployed behind a user-friendly web interface that accepts input in Arabic or Arabizi. It lets users explore the entities in a text by highlighting them, and it can link each entity directly to Wikipedia for more information. The website supports NER with either our model or CAMeL Tools' model. ANER is publicly accessible at <http://www.aner.online>. We also deployed our model on HuggingFace at https://huggingface.co/boda/ANER so that developers can test and use it.
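The abstract reports F1 scores without saying how they are computed. The sketch below shows one common convention for NER, entity-level exact-match micro F1 over BIO tags. It is an assumption that ANER was evaluated this way; the function names `bio_to_spans` and `entity_f1` and the example tags are hypothetical and not taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start token index, end token index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Group a BIO tag sequence into typed entity spans.

    Assumption: a stray "I-X" that does not continue an open "X" span
    is treated leniently as the start of a new span.
    """
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over exact span matches."""
    gold_spans, pred_spans = set(), set()
    for sent_id, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with its sentence index so spans from
        # different sentences never collide.
        gold_spans |= {(sent_id,) + s for s in bio_to_spans(g)}
        pred_spans |= {(sent_id,) + s for s in bio_to_spans(p)}
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical one-sentence example: the prediction misses the LOC entity.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
precision, recall, f1 = entity_f1(gold, pred)
```

Under this convention a predicted entity counts as correct only if both its boundaries and its type match the gold annotation, so scores on a 50-class label set and on a 4-class label set are not directly comparable.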

research
05/19/2022

Wojood: Nested Arabic Named Entity Corpus and Recognition using BERT

This paper presents Wojood, a corpus for Arabic nested Named Entity Reco...
research
04/25/2020

A Named Entity Based Approach to Model Recipes

Traditional cooking recipes follow a structure which can be modelled ver...
research
05/12/2022

Comparing Open Arabic Named Entity Recognition Tools

The main objective of this paper is to compare and evaluate the performa...
research
10/06/2022

HealthE: Classifying Entities in Online Textual Health Advice

The processing of entities in natural language is essential to many medi...
research
05/30/2023

Machine Learning Approach for Cancer Entities Association and Classification

According to the World Health Organization (WHO), cancer is the second l...
research
12/21/2020

Domain specific BERT representation for Named Entity Recognition of lab protocol

Supervised models trained to predict properties from representations hav...
research
02/26/2020

Detecting Potential Topics In News Using BERT, CRF and Wikipedia

For a news content distribution platform like Dailyhunt, Named Entity Re...
