Stroke Width-Based Contrast Feature for Document Image Binarization


Bibliographic details
Published in: Journal of Information Processing Systems 2014, 10(1), 31, pp.55-68
Main authors: Van, Le Thi Khue; Lee, Gueesang
Format: Article
Language: English
Description
Summary: Automatic segmentation of foreground text from the background in degraded document images is essential both for comfortable reading of the document content and for machine recognition tasks. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of the text. First, a pre-processing step removes noise. Text boundary detection is then performed on the image constructed from the contrast feature. Local estimation follows to extract the text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments on degraded handwritten and machine-printed document images, with comparisons against several well-known binarization algorithms, demonstrate the effectiveness of the proposed method.
KCI Citation Count: 0
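The stages named in the summary (noise removal, contrast feature, text boundary detection, local estimation, refinement) can be illustrated with a minimal pure-Python sketch. The summary does not give the formulas, so every step here is a generic stand-in and not the authors' stroke-width-based method: a 3x3 median filter, a normalised max-min contrast, a threshold relative to the peak contrast, the mean intensity of nearby boundary pixels, and removal of isolated pixels. The names `binarize`, `window`, `map_windows`, `has_text_neighbour` and the parameters `r` and `contrast_frac` are assumptions made for illustration.

```python
from statistics import median


def window(img, y, x, r):
    """Values in the (2r+1)x(2r+1) neighbourhood of (y, x), clamped at edges."""
    h, w = len(img), len(img[0])
    return [img[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]
            for j in range(y - r, y + r + 1)
            for i in range(x - r, x + r + 1)]


def map_windows(img, r, fn):
    """Apply fn to every pixel's neighbourhood and return a new 2-D list."""
    return [[fn(window(img, y, x, r)) for x in range(len(img[0]))]
            for y in range(len(img))]


def has_text_neighbour(mask, y, x):
    """True if any of the up-to-8 neighbours of (y, x) is set in mask."""
    h, w = len(mask), len(mask[0])
    return any(mask[j][i]
               for j in range(max(y - 1, 0), min(y + 2, h))
               for i in range(max(x - 1, 0), min(x + 2, w))
               if (j, i) != (y, x))


def binarize(gray, r=2, contrast_frac=0.5, eps=1e-6):
    """Illustrative pipeline on a 2-D list of grey levels (dark text assumed).

    Returns a 2-D list of booleans, True where a pixel is classified as text.
    """
    h, w = len(gray), len(gray[0])

    # 1. Pre-processing (assumed): 3x3 median filter against isolated noise.
    smooth = map_windows(gray, 1, median)

    # 2. Contrast feature (generic stand-in): normalised local max-min range.
    contrast = map_windows(
        smooth, r, lambda v: (max(v) - min(v)) / (max(v) + min(v) + eps))

    # 3. Text boundary detection: keep pixels whose contrast exceeds a
    #    fraction of the peak contrast in the image.
    peak = max(max(row) for row in contrast)
    boundary = [[c > contrast_frac * peak for c in row] for row in contrast]

    # 4. Local estimation: a pixel is text if it is darker than the mean
    #    intensity of the boundary pixels in its neighbourhood.
    text = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [s for s, b in zip(window(smooth, y, x, r),
                                      window(boundary, y, x, r)) if b]
            text[y][x] = bool(vals) and smooth[y][x] < sum(vals) / len(vals)

    # 5. Refinement: drop text pixels that have no text neighbour.
    return [[text[y][x] and has_text_neighbour(text, y, x) for x in range(w)]
            for y in range(h)]
```

Calling `binarize(img)` on a list of rows of grey levels returns a mask of the same shape. The pure-Python loops favour readability over speed; an array library would be the natural choice for real page images.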
ISSN: 1976-913X, 2092-805X
DOI: 10.3745/JIPS.2014.10.1.055