A methodology to spot words in historical Arabic documents

Bibliographic details
Main authors: Zirari, F., Ennaji, A., Nicolas, S., Mammass, D.
Format: Conference proceedings
Language: English
Description
Summary: Libraries hold huge numbers of printed historical Arabic documents that cannot be made available online because they lack a searchable index. Word spotting has previously been suggested as a way to build indexes for such collections by matching word images. In this paper we present a word spotting method for printed historical Arabic documents. We start with word segmentation using the run-length smoothing algorithm. A description of the features selected to represent the word images is given afterwards. Elastic Dynamic Time Warping is used to match the features of two words. The method was tested on the database of printed historical Arabic documents of the Moroccan National Library (MNL).
ISSN: 2161-5322, 2161-5330
DOI: 10.1109/AICCSA.2013.6616492
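
The summary names two building blocks, run-length smoothing for segmentation and Dynamic Time Warping for matching. The sketch below illustrates both in their textbook form and is not the authors' implementation: the function names `rlsa_horizontal` and `dtw_distance`, the `threshold` parameter, and the Euclidean local cost are assumptions made here for illustration, and the plain DTW recurrence shown may differ from the elastic variant used in the paper. The record does not specify the word features, so the matcher accepts any sequence of feature vectors.

```python
import numpy as np


def rlsa_horizontal(binary, threshold):
    """Horizontal run-length smoothing (textbook form, assumed here).

    `binary` is a 2-D array with 1 for ink and 0 for background.
    Background runs of at most `threshold` pixels that lie between two
    ink pixels on the same row are filled, which merges nearby
    characters into word-sized blobs.
    """
    out = np.array(binary, dtype=np.uint8)
    for row in out:  # each row is a view, so writes land in `out`
        ink = np.flatnonzero(row)
        for a, b in zip(ink[:-1], ink[1:]):
            gap = b - a - 1  # background pixels between two ink pixels
            if 0 < gap <= threshold:
                row[a:b] = 1
    return out


def dtw_distance(seq_a, seq_b):
    """Classic DTW distance between two feature sequences.

    Each sequence holds one scalar or one feature vector per step.
    The local cost is the Euclidean distance (an assumption).
    """
    a = np.asarray(seq_a, dtype=float).reshape(len(seq_a), -1)
    b = np.asarray(seq_b, dtype=float).reshape(len(seq_b), -1)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j],      # step in seq_a only
                cost[i, j - 1],      # step in seq_b only
                cost[i - 1, j - 1],  # step in both
            )
    return float(cost[n, m])


# Toy usage: one image row, and two profiles that differ only in length.
row = np.array([[1, 0, 0, 1, 0, 0, 0, 0, 1]])
smoothed = rlsa_horizontal(row, threshold=2)
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

In the toy usage, the two-pixel gap is filled while the four-pixel gap is kept, and the stretched profile matches its shorter counterpart at zero cost, which is the property that makes DTW attractive for comparing word images of different widths.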