Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation

Bibliographic details
Published in:	International Journal on Document Analysis and Recognition, 2014-03, Vol. 17 (1), pp. 19-31
Main authors:	Elagouni, Khaoula; Garcia, Christophe; Mamalet, Franck; Sébillot, Pascale
Format:	Article
Language:	English
Subjects:	Applied sciences; Artificial intelligence; Computer Science; Computer science, control theory, systems; Connectionism. Neural networks; Document and Text Processing; Engineering Sciences; Exact sciences and technology; Image Processing and Computer Vision; Multimedia; Original Paper; Pattern Recognition; Pattern recognition. Digital image processing. Computational geometry; Signal and Image Processing; Speech and sound recognition and synthesis. Linguistics

Description
Abstract:	Text embedded in multimedia documents carries important semantic information that helps to access the content automatically. This paper proposes two neural-based optical character recognition (OCR) systems that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second avoids the segmentation step by integrating a multi-scale scanning scheme that jointly localizes and recognizes characters at each position and scale. Linguistic knowledge is also incorporated into both schemes to remove errors caused by recognition confusions. Both OCR systems are applied to caption text embedded in videos and to text in natural scene images, and the reported results show that the proposed approaches outperform state-of-the-art methods.
ISSN:	1433-2833, 1433-2825
DOI:	10.1007/s10032-013-0202-7
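
Illustration: the multi-scale scanning idea mentioned in the abstract can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names multiscale_windows and scan, the choice of window widths as fractions of the text-line height, the stride, and the classify callback are all hypothetical and do not come from the paper.

```python
from typing import Callable, Iterator, List, Tuple

# (left column, width) of a full-height candidate character window
Window = Tuple[int, int]


def multiscale_windows(
    image_width: int,
    image_height: int,
    width_ratios: Tuple[float, ...] = (0.5, 0.75, 1.0),  # assumed scales
    stride: int = 4,                                      # assumed step in pixels
) -> Iterator[Window]:
    """Yield windows at every stride position for each scale."""
    for ratio in width_ratios:
        width = max(1, round(image_height * ratio))
        if width > image_width:
            continue  # this scale does not fit in the image
        for left in range(0, image_width - width + 1, stride):
            yield (left, width)


def scan(
    image_width: int,
    image_height: int,
    classify: Callable[[Window], Tuple[str, float]],  # hypothetical classifier
    **kwargs,
) -> List[Tuple[Window, str, float]]:
    """Apply the classifier to each window; keep (window, label, score)."""
    results = []
    for window in multiscale_windows(image_width, image_height, **kwargs):
        label, score = classify(window)
        results.append((window, label, score))
    return results
```

In a sketch like this, each window yields a character hypothesis with a confidence score. A later step, which the abstract suggests can draw on linguistic knowledge, would have to select a consistent sequence of hypotheses; that step is not shown here.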