A Transformer-based Approach for Arabic Offline Handwritten Text Recognition

07/27/2023
by   Saleh Momeni, et al.

Handwriting recognition is a challenging and critical problem in the fields of pattern recognition and machine learning, with applications spanning a wide range of domains. In this paper, we focus on the specific issue of recognizing offline Arabic handwritten text. Existing approaches typically utilize a combination of convolutional neural networks for image feature extraction and recurrent neural networks for temporal modeling, with connectionist temporal classification used for text generation. However, these methods suffer from a lack of parallelization due to the sequential nature of recurrent neural networks. Furthermore, these models cannot account for linguistic rules, necessitating the use of an external language model in the post-processing stage to boost accuracy. To overcome these issues, we introduce two alternative architectures, namely the Transformer Transducer and the standard sequence-to-sequence Transformer, and compare their performance in terms of accuracy and speed. Our approach can model language dependencies and relies only on the attention mechanism, thereby making it more parallelizable and less complex. We employ pre-trained Transformers for both image understanding and language modeling. Our evaluation on the Arabic KHATT dataset demonstrates that our proposed method outperforms the current state-of-the-art approaches for recognizing offline Arabic handwritten text.
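To make the role of connectionist temporal classification in the existing CNN-plus-RNN pipelines more concrete, here is a minimal sketch of greedy (best-path) CTC decoding. It is an illustration only, not the authors' implementation: the blank index, the three-class label set, and the per-frame scores are all hypothetical values chosen for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Collapse runs of identical labels, then remove the blank symbol.
    return [label for label, _ in groupby(best_path) if label != blank]


# Toy input: 6 frames, 3 classes (0 = blank, 1 and 2 = hypothetical characters).
frames = [
    [0.1, 0.8, 0.1],    # label 1
    [0.2, 0.7, 0.1],    # label 1 again (merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # label 1 (a new occurrence, since a blank intervened)
    [0.1, 0.2, 0.7],    # label 2
    [0.6, 0.2, 0.2],    # blank
]
decoded = ctc_greedy_decode(frames)
print(decoded)
```

Each frame's label is chosen without reference to the labels chosen for other frames, which is one way to see why such a decoder cannot account for linguistic rules on its own and is typically paired with an external language model.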

research
09/21/2021

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models

Text recognition is a long-standing research problem for document digita...
research
05/15/2017

Handwritten Urdu Character Recognition using 1-Dimensional BLSTM Classifier

The recognition of cursive script is regarded as a subtle task in optica...
research
09/10/2013

A multi-stream hmm approach to offline handwritten arabic word recognition

In This paper we presented new approach for cursive Arabic text recognit...
research
11/06/2021

CALText: Contextual Attention Localization for Offline Handwritten Text

Recognition of Arabic-like scripts such as Persian and Urdu is more chal...
research
12/31/2020

AraGPT2: Pre-Trained Transformer for Arabic Language Generation

Recently, pretrained transformer-based architectures have proven to be v...
research
12/15/2014

CITlab ARGUS for Arabic Handwriting

In the recent years it turned out that multidimensional recurrent neural...
research
05/26/2020

Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition

The advent of recurrent neural networks for handwriting recognition mark...
