A Novel Arabic Optical Character Recognition Approach Based on Levenshtein Distance

Bibliographic Details
Published in: Automatic control and computer sciences, 2024-10, Vol. 58 (5), p. 519-529
Main authors: Walid Fakhet, Salim El Khediri, Salah Zidi
Format: Article
Language: English
Online access: Full text
Description
Summary: Arabic handwritten character recognition (AHCR) is the process of automatically identifying and recognizing handwritten Arabic characters. This is a challenging task due to the complexity of the Arabic script, which includes a large number of characters with complex shapes and ligatures. In this paper, we present a novel approach based on Levenshtein distance to recognize Arabic handwritten characters by combining the classification and postprocessing phases. To train the proposed model, we created an Arabic optical character recognition (OCR) context database divided into multiple text files. Each file in the database belongs to one of five well-defined contexts: sport, economy, religion, politics, and culture. Each file contains 15 000 words. The experimental results show that the new method outperforms the state-of-the-art approach. The error rate achieved using 15 000 words was 1.2%.
ISSN: 0146-4116, 1558-108X
DOI: 10.3103/S0146411624700639
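
The summary names Levenshtein distance as the basis of the approach but does not give the algorithm itself. As a rough illustration only, the following Python sketch computes the standard unit-cost Levenshtein distance and uses it to snap a recognized word to the nearest entry in a context word list. The function names `levenshtein` and `closest_word` and the three-word sport lexicon are hypothetical, chosen for this example; this is not the authors' implementation, and how the paper actually combines classification with postprocessing is not described in this record.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insertion, deletion, substitution)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # drop char_a
                current[j - 1] + 1,      # add char_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def closest_word(candidate: str, lexicon: list[str]) -> str:
    """Return the lexicon entry with the smallest distance to candidate.

    Ties go to the entry that appears first in the lexicon.
    """
    return min(lexicon, key=lambda word: levenshtein(candidate, word))


# Hypothetical excerpt of a "sport" context file (assumed for illustration).
sport_lexicon = ["ملعب", "لاعب", "مباراة"]

# A recognizer output whose last letter was misread.
recognized = "لاعت"
corrected = closest_word(recognized, sport_lexicon)
```

Here `corrected` is the lexicon word that differs from the recognized string by a single substitution. A real system would likely also need Unicode normalization and a distance threshold for rejecting out-of-lexicon words, neither of which is covered by this sketch.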