A model for the gray-intensity distribution of historical handwritten documents and its application for binarization

Bibliographic details
Published in: International journal on document analysis and recognition, 2014-06, Vol. 17 (2), p. 139-160
Main authors: Ramírez-Ortegón, Marte A., Ramírez-Ramírez, Lilia L., Messaoud, Ines Ben, Märgner, Volker, Cuevas, Erik, Rojas, Raúl
Format: Article
Language: English
Online access: Full text
Description
Abstract: In this article, we aim to describe, mathematically and experimentally, the gray-intensity distributions of the foreground and background of handwritten historical documents. We propose a local pixel model to explain the observed asymmetrical gray-intensity histograms of the foreground and background. Our pixel model states that, locally, the gray-intensity histogram is a mixture of the gray-intensity distributions of three pixel classes. Following our model, we empirically describe the smoothness of the background for different types of images. We show that our model is potentially applicable to binarization. Assuming that the parameters of the gray-intensity distributions are correctly estimated, we show that thresholding methods based on mixtures of lognormal distributions outperform thresholding methods based on mixtures of normal distributions. Our model is supported by experimental tests conducted on images extracted from the DIBCO 2009 and H-DIBCO 2010 benchmarks. We also report results for all four DIBCO benchmarks.
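
The contrast between normal-mixture and lognormal-mixture thresholding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method. It assumes only two pixel classes (a dark foreground and a bright background) instead of the three classes of the article's model, it applies the lognormal density directly to the raw gray intensity, and it treats the class parameters as already known. All function names and numeric values are hypothetical and chosen for illustration only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def lognormal_pdf(x, mu, sigma):
    """Density of a lognormal distribution; mu and sigma describe ln(x)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def min_error_threshold(pdf_fg, pdf_bg, w_fg, lo, hi, iters=60):
    """Find t in [lo, hi] where the weighted class densities are equal.

    Assumption: the weighted foreground density dominates at lo, the
    weighted background density dominates at hi, and they cross once
    in between, so plain bisection is sufficient.
    """
    def diff(t):
        return w_fg * pdf_fg(t) - (1.0 - w_fg) * pdf_bg(t)

    if diff(lo) <= 0 or diff(hi) >= 0:
        raise ValueError("bracket [lo, hi] does not contain a single crossing")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            lo = mid  # foreground still dominates: crossing lies to the right
        else:
            hi = mid  # background dominates: crossing lies to the left
    return 0.5 * (lo + hi)


# Hypothetical parameters: 20 % foreground pixels, dark ink near 60,
# bright paper near 180 (gray levels in 0..255).
W_FG = 0.2

t_normal = min_error_threshold(
    lambda x: normal_pdf(x, 60.0, 15.0),
    lambda x: normal_pdf(x, 180.0, 20.0),
    W_FG, 60.0, 180.0,
)

t_lognormal = min_error_threshold(
    lambda x: lognormal_pdf(x, math.log(60.0), 0.25),
    lambda x: lognormal_pdf(x, math.log(180.0), 0.11),
    W_FG, 60.0, 180.0,
)

print("normal-mixture threshold:   ", round(t_normal, 2))
print("lognormal-mixture threshold:", round(t_lognormal, 2))
```

Pixels darker than the returned threshold would be labeled foreground. The two thresholds generally differ because the lognormal density is asymmetric; which of them gives the better binarization depends on how well each family fits the real histograms, which is the question the article investigates.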
ISSN: 1433-2833, 1433-2825
DOI: 10.1007/s10032-013-0212-5