Two-stage approach to extracting visual objects from paper documents

Bibliographic details
Published in: Machine Vision and Applications, 2016-11, Vol. 27 (8), pp. 1243-1257
Main authors: Forczmański, Paweł; Markiewicz, Andrzej
Format: Article
Language: English
Description
Abstract: In the paper we present an approach to the automatic detection and identification of important elements in paper documents, including stamps, logos, printed text blocks, signatures and tables. The presented approach consists of two stages. The first stage performs object detection by means of an AdaBoost cascade of weak classifiers and Haar-like features. At the second stage, the resulting image blocks are verified using selected features calculated from recently proposed low-level descriptors, combined with certain classifiers representing current machine-learning approaches. The training phase for both stages uses bootstrapping, i.e., an iterative process aimed at increasing accuracy. Experiments performed on a large set of digitized paper documents showed that the adopted strategy is useful and efficient.
ISSN: 0932-8092, 1432-1769
DOI: 10.1007/s00138-016-0803-5
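
The abstract describes a detect-then-verify pipeline built on Haar-like features. The sketch below is only an illustration of that general idea, not the authors' implementation: it computes a two-rectangle Haar-like feature from an integral image and passes each sliding window through a cheap first-stage test followed by a stricter second-stage check. All function names, the toy image, and both decision rules are assumptions; the fixed threshold stands in for a trained AdaBoost cascade, and the ink check stands in for the descriptor-based classifiers mentioned in the abstract.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (row, col, height, width)


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table padded with one leading zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], box: Box) -> int:
    """Sum of pixels inside box, using four lookups in the integral image."""
    r, c, h, w = box
    return ii[r + h][c + w] - ii[r][c + w] - ii[r + h][c] + ii[r][c]


def haar_left_right(ii: List[List[int]], box: Box) -> int:
    """Two-rectangle Haar-like feature: left half minus right half."""
    r, c, h, w = box
    half = w // 2
    left = rect_sum(ii, (r, c, h, half))
    right = rect_sum(ii, (r, c + half, h, half))
    return left - right


def detect_then_verify(
    img: Image,
    win: Tuple[int, int],
    stage_one: Callable[[List[List[int]], Box], bool],
    stage_two: Callable[[Image, Box], bool],
) -> List[Box]:
    """Slide a window over img; keep boxes accepted by both stages."""
    ii = integral_image(img)
    win_h, win_w = win
    hits = []
    for r in range(len(img) - win_h + 1):
        for c in range(len(img[0]) - win_w + 1):
            box = (r, c, win_h, win_w)
            # Stage two runs only on candidates that stage one lets through.
            if stage_one(ii, box) and stage_two(img, box):
                hits.append(box)
    return hits


# Toy "page": a 2x2 block of ink (value 9) on an empty background.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def stage_one(ii: List[List[int]], box: Box) -> bool:
    # Assumed stand-in for a trained cascade: a single fixed threshold.
    return haar_left_right(ii, box) >= 18


def stage_two(img: Image, box: Box) -> bool:
    # Assumed stand-in for the verifier: left half must be entirely ink.
    r, c, h, w = box
    return all(img[r + dr][c + dc] > 0 for dr in range(h) for dc in range(w // 2))


ii = integral_image(page)
candidates = detect_then_verify(page, (2, 4), stage_one, lambda img, box: True)
found = detect_then_verify(page, (2, 4), stage_one, stage_two)
print(len(candidates), found)
```

In this toy setup the first stage deliberately over-accepts, and the second stage discards the windows that only partly cover the ink block. Bootstrapped training, as mentioned in the abstract, would typically feed such rejected windows back into the training set as additional negative examples; that step is not shown here.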