Normalization of unconstrained handwritten words in terms of Slope and Slant Correction


Bibliographic details
Published in: Pattern recognition letters, 2019-12, Vol. 128, p. 488-495
Main authors: Bera, Suman Kumar, Chakrabarti, Akash, Lahiri, Sagnik, Barney Smith, Elisa H., Sarkar, Ram
Format: Article
Language: English
Description
Summary:
• Approximation of the core region of a skewed word using a novel Component Smearing by Concentric Squares (CSCS) algorithm.
• Estimation of the slope angle from two profiles, namely the top and bottom profiles.
• Development of a novel approach for detecting the Most Significant Area (MSA) of a word image.
• Design of a new technique to identify the most contributory strokes for slant angle estimation.
• Preparation of four datasets, along with corresponding ground truth information, which are made publicly available.

In offline handwritten text, slope (or skew) and slant are inevitably introduced, to varying degrees depending on factors such as the writing style, speed and mood of the writer. Slope and slant detection and their subsequent correction have therefore become critical pre-processing steps for document analysis and retrieval systems, both to neutralize the variability of writing styles and to improve the performance of word and character recognition systems. In this paper, we present new methods that use two novel core-region detection techniques to estimate both the slope and slant angles of offline handwritten word images. We also prepare multilingual datasets comprising both real and synthetic handwritten word images, along with ground truth information on the slope and slant of each word, to address the lack of standard datasets for this research. These datasets of Bangla, Devanagari and English words, along with the code, are made publicly available. Extensive experimental results demonstrate the efficacy of the proposed methods compared to contemporary state-of-the-art methods. Moreover, the methods are robust, efficient, and easily implementable. (The code and datasets are available at: https://scholarworks.boisestate.edu/saipl/)
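The idea of estimating a slope angle from the top and bottom profiles of a word image can be illustrated with a minimal Python sketch. It is not the paper's CSCS-based method, which this record does not describe in detail. It is a generic stand-in that fits a least-squares line to each profile of a binary image and averages the two slopes. All function names (`profiles`, `fit_slope`, `estimate_slope_degrees`, `make_band`) are hypothetical, and the image is assumed to be a list of rows holding 0 (background) or 1 (ink).

```python
import math


def profiles(img):
    """For every column that contains ink, return its x position and the
    row of the first (top) and last (bottom) ink pixel."""
    xs, top, bottom = [], [], []
    for x in range(len(img[0])):
        rows = [y for y in range(len(img)) if img[y][x]]
        if rows:
            xs.append(x)
            top.append(rows[0])
            bottom.append(rows[-1])
    return xs, top, bottom


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs (rows per column)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def estimate_slope_degrees(img):
    """Average the fitted slopes of the top and bottom profiles and
    convert the result to an angle in degrees (assumed convention:
    positive means the word rises from left to right)."""
    xs, top, bottom = profiles(img)
    if len(xs) < 2:
        return 0.0
    m = (fit_slope(xs, top) + fit_slope(xs, bottom)) / 2
    # Row indices grow downward, so a rising word has a negative slope.
    return math.degrees(math.atan(-m))


def make_band(width, height, slope, thickness):
    """Synthetic test image: a band of ink that rises by `slope` rows
    per column, standing in for a sloped word."""
    img = [[0] * width for _ in range(height)]
    for x in range(width):
        base = height - 5 - round(slope * x)
        for y in range(base - thickness, base):
            if 0 <= y < height:
                img[y][x] = 1
    return img


if __name__ == "__main__":
    word = make_band(60, 60, 0.5, 6)
    print(f"estimated slope: {estimate_slope_degrees(word):.1f} degrees")
```

Real handwriting has ascenders and descenders that distort the raw profiles, which is presumably why the paper first approximates the core region. A plain line fit like this one is only a reasonable baseline for words with few such strokes.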
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2019.10.025