On the generalization of the form identification and skew detection problem
Published in: | Pattern Recognition 2002, Vol. 35 (1), pp. 253-264 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Summary: | A new method is proposed to solve the document identification and skew detection problem. It can be applied to a widely used subclass of documents that resemble an application form in style. Unlike other approaches, we make no assumptions about the nature or style of the printed form. An attempt is made to solve the problem in the most general sense. The method presented here does not rely on any special features such as patterns of line crossings, dominant lines, or special symbols found only on specially designed forms. The power spectral density of the horizontal projection profile of the form is used as a shift-invariant feature vector. The Karhunen–Loève transform is employed to decorrelate the feature vectors in the training set and reduce their length. Training is done in such a way that no rotations of the unknown form are necessary during recognition. The eigenvectors of the covariance matrix of the power spectral densities of the training set, along with learning vector quantization, are used for training, and the Euclidean distance is used for recognition. A limitation on the amount of skew that the system can handle is alleviated by applying a known skew detection method. |
---|---|
ISSN: | 0031-3203, 1873-5142 |
DOI: | 10.1016/S0031-3203(01)00030-9 |
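
The pipeline outlined in the summary (projection profile, power spectral density, Karhunen–Loève reduction, Euclidean matching) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the synthetic forms built by `make_form`, the noise level, and the choice of two components are all assumptions made for illustration. Class means in the reduced space stand in for prototypes trained with learning vector quantization, and skew handling is left out entirely.

```python
import numpy as np


def psd_feature(image):
    """Power spectral density of the horizontal projection profile."""
    # Horizontal projection profile: total ink in each pixel row.
    profile = image.sum(axis=1).astype(float)
    # The squared DFT magnitude discards phase, so a circular vertical
    # shift of the form leaves the feature unchanged.
    return np.abs(np.fft.rfft(profile)) ** 2 / profile.size


def fit_klt(features, n_components):
    """Mean and leading covariance eigenvectors (Karhunen-Loeve basis)."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def classify(feature, mean, basis, prototypes, labels):
    """Label of the prototype nearest in Euclidean distance."""
    z = (feature - mean) @ basis
    return labels[int(np.argmin(np.linalg.norm(prototypes - z, axis=1)))]


def make_form(line_rows, height=64, width=48):
    """Hypothetical synthetic form: blank page with horizontal rules."""
    image = np.zeros((height, width))
    image[line_rows, :] = 1.0
    return image


rng = np.random.default_rng(0)
form_a = make_form([8, 16, 24, 40])
form_b = make_form([5, 10, 30, 50, 55])
labels = ["A", "B"]

# Training set (assumed): five noisy copies of each form type.
train = {
    name: np.array([psd_feature(form + 0.01 * rng.standard_normal(form.shape))
                    for _ in range(5)])
    for name, form in zip(labels, (form_a, form_b))
}
mean, basis = fit_klt(np.vstack([train[n] for n in labels]), n_components=2)

# Class means in the reduced space stand in for LVQ-trained prototypes.
prototypes = np.array([((train[n] - mean) @ basis).mean(axis=0) for n in labels])

# A vertically shifted copy of form_b is classified without re-aligning it.
shifted = np.roll(form_b, 5, axis=0)
print(classify(psd_feature(shifted), mean, basis, prototypes, labels))
```

The shift invariance here is exact only for circular shifts, which `np.roll` produces; for real scans, where content enters and leaves the page edges, it would presumably hold only approximately.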