Pseudo two-dimensional hidden Markov models for document recognition

Bibliographic details
Published in: AT&T Technical Journal 1993-09, Vol. 72 (5), p. 60-72
Main authors: Agazzi, Oscar E.; Kuo, Shyh-shiaw
Format: Article
Language: English
Description
Summary: Hidden Markov models (HMM) have become the most popular technique for automatic speech recognition. Extending this technique to the two-dimensional domain is a promising approach to solving difficult problems in optical character recognition (OCR), such as recognizing poorly printed text. Hidden Markov models are robust for OCR applications due to:
- Their inherent tolerance to noise and distortion,
- Their ability to segment blurred and connected text into words and characters as an integral part of the recognition process,
- Their invariance to size, slant, and other transformations of the basic characters, and
- The ease with which contextual information and language models can be incorporated into the recognition algorithms.
We give a brief overview of OCR algorithms based on two-dimensional hidden Markov models, and we present three case studies that show their remarkable strengths.
ISSN: 8756-2324, 2376-676X, 1538-7305
DOI: 10.1002/j.1538-7305.1993.tb00655.x