Unsupervised training of character templates using unsegmented samples

Bibliographic details
Main authors: CHOU, PHILIP ANDREW; KOPEC, GARY E
Format: Patent
Language: English
Description
Abstract: A method for operating a machine to perform unsupervised training of a set of character templates uses, as its source of training samples, an image source of character images (called glyphs) that need not be manually or automatically segmented or isolated before training. A recognition operation performed on the image source produces a labeled glyph position data structure that includes, for each glyph in the image source, a glyph image position giving the estimated location of the glyph, paired with a character label indicating which character in the character set being trained the glyph represents. The labeled glyph position data and the image source are then used to determine sample image regions in the image source; each sample image region is large enough to contain at least one glyph but need not be limited in size to a single glyph. Template construction from unsegmented samples is modeled mathematically as an optimization problem whose objective function represents the set of character templates being trained as an ideal image to be reconstructed to match the input image. The method produces all of the character templates substantially contemporaneously, using a novel pixel scoring technique that approximates a maximum likelihood criterion subject to the constraint that adjacently positioned character images have substantially nonoverlapping foreground pixels. The character templates produced may be binary templates or arrays of probability values.
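
The idea in the abstract can be illustrated with a small Python sketch. It is not the patent's scoring formula: the names `LabeledGlyph` and `train_templates`, the 0.5 threshold, and the "fraction of aligned foreground pixels" score are all assumptions chosen for illustration. What it tries to show is the overall shape of the procedure: sample regions wider than a glyph, all templates built together, and each image pixel cleared once a template claims it so that a neighboring character's template cannot claim it too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledGlyph:          # hypothetical record for one labeled glyph position
    label: str               # character label from the recognition step
    x: int                   # estimated column of the glyph's left edge
    y: int                   # estimated row of the glyph's top edge


def train_templates(image, glyphs, width, height, threshold=0.5):
    """Greedy, simplified stand-in for the pixel scoring step (an assumption)."""
    work = [row[:] for row in image]      # working copy; claimed pixels are cleared
    rows, cols = len(work), len(work[0])
    by_label = {}
    for g in glyphs:
        by_label.setdefault(g.label, []).append(g)
    templates = {lab: [[0] * width for _ in range(height)] for lab in by_label}
    assigned = set()

    def aligned_pixels(lab, dy, dx):
        # Image coordinates that template pixel (dy, dx) covers in each sample region.
        for g in by_label[lab]:
            r, c = g.y + dy, g.x + dx
            if 0 <= r < rows and 0 <= c < cols:
                yield r, c

    def score(lab, dy, dx):
        # Fraction of this label's samples whose aligned pixel is still foreground.
        hits = sum(work[r][c] for r, c in aligned_pixels(lab, dy, dx))
        return hits / len(by_label[lab])

    while True:
        best = None
        for lab in by_label:
            for dy in range(height):
                for dx in range(width):
                    if (lab, dy, dx) in assigned:
                        continue
                    s = score(lab, dy, dx)
                    if s > threshold and (best is None or s > best[0]):
                        best = (s, lab, dy, dx)
        if best is None:
            return templates
        _, lab, dy, dx = best
        templates[lab][dy][dx] = 1
        assigned.add((lab, dy, dx))
        # Clear the explained pixels so an adjacent template cannot also claim them.
        for r, c in list(aligned_pixels(lab, dy, dx)):
            work[r][c] = 0


# Text line "aabbab": each glyph is 2 pixels wide, but sample regions are 3 wide,
# so every region also contains the first column of the following glyph.
image = [
    [1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1],
]
glyphs = [LabeledGlyph(ch, x, 0) for ch, x in zip("aabbab", range(0, 12, 2))]
templates = train_templates(image, glyphs, width=3, height=2)
print(templates)
```

In this toy input, two of the three "a" samples are followed by a "b", so the bottom pixel of the third column in the "a" regions is foreground in 2/3 of the samples and would pass a naive per-character threshold. Because the higher-scoring first-column pixel of "b" claims and clears those image pixels first, the third column of both templates stays empty.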