Statistical Parsing by Machine Learning from a Classical Arabic Treebank

10/25/2015
by   Kais Dukes, et al.

Research into statistical parsing for English has enjoyed over a decade of successful results. However, adapting these models to other languages has met with difficulties. Previous comparative work has shown that Modern Arabic is one of the most difficult languages to parse due to rich morphology and free word order. Classical Arabic is the ancient form of Arabic, and is understudied in computational linguistics, relative to its worldwide reach as the language of the Quran. This thesis is based on seven publications that make significant contributions to knowledge relating to annotating and parsing Classical Arabic. A central argument of this thesis is that using a hybrid representation closely aligned to traditional grammar leads to improved parsing for Arabic. To test this hypothesis, two approaches are compared. As a reference, a pure dependency parser is adapted using graph transformations, resulting in an F1-score of 87.47. The second approach, based on the hybrid representation, reaches an F1-score of 89.03 and is better suited to Classical Arabic.


research
07/11/2020

I3rab: A New Arabic Dependency Treebank Based on Arabic Grammatical Theory

Treebanks are valuable linguistic resources that include the syntactic s...
research
01/29/2019

An Arabic Dependency Treebank in the Travel Domain

In this paper we present a dependency treebank of travel domain sentence...
research
12/13/2014

A Study of Sindhi Related and Arabic Script Adapted languages Recognition

A large number of publications are available for the Optical Character R...
research
10/24/2022

Maknuune: A Large Open Palestinian Arabic Lexicon

We present Maknuune, a large open lexicon for the Palestinian Arabic dia...
research
04/25/2021

Transformers to Fight the COVID-19 Infodemic

The massive spread of false information on social media has become a glo...
research
10/21/2022

Graphemic Normalization of the Perso-Arabic Script

Since its original appearance in 1991, the Perso-Arabic script represent...
research
08/10/2018

Hybrid approach for transliteration of Algerian arabizi: a primary study

A hybrid approach for the transliteration of Algerian Arabizi: A primary...
