Self-training for handwritten word recognition and retrieval

Bibliographic details
Published in: International journal on document analysis and recognition 2024-09, Vol.27 (3), p.225-244
Main authors: Wolf, Fabian; Fink, Gernot A.
Format: Article
Language: English
Description
Abstract: Handwritten text recognition (HTR) and word retrieval, also known as word spotting, are traditional problems in the document analysis community. While the use of increasingly large neural network architectures has led to a steady improvement in performance, it comes with the drawback of requiring manually annotated training data. This poses a tremendous problem when such models are applied to new document collections. To overcome this drawback, we propose a self-training approach that makes it possible to train state-of-the-art models for HTR and word spotting. Self-training is a common technique in semi-supervised learning; it usually relies on a small labeled dataset and on training with pseudo-labels generated by an initial model. In this work, we show that it is feasible to train models on synthetic data that perform well enough to serve as initial models for self-training. Consequently, the proposed training method does not rely on any manually annotated samples. We further investigate visual and language properties of the synthetic datasets. To improve the performance and robustness of the self-training approach, we propose different confidence measures for both models that allow erroneous pseudo-labels to be identified and removed. The presented training approach clearly outperforms other learning-free methods or adaptation strategies in the absence of manually annotated data.
ISSN:1433-2833
1433-2825
DOI:10.1007/s10032-024-00484-9
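
Illustrative sketch: the abstract describes self-training in which an initial model (trained on synthetic data) labels unannotated word images, and pseudo-labels with low confidence are discarded before retraining. The Python sketch below shows that general loop only. It is not the authors' implementation: the function names, the `predict` and `train` callables, the single confidence threshold, and the number of rounds are all hypothetical, and the confidence measures actually proposed in the article are not specified in this record.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Assumption: a prediction is a (transcription, confidence) pair with confidence in [0, 1].
Prediction = Tuple[str, float]


def select_pseudo_labels(
    images: Sequence[Any],
    predict: Callable[[Any], Prediction],
    threshold: float,
) -> List[Tuple[Any, str]]:
    """Keep only the predictions whose confidence reaches the threshold."""
    selected = []
    for image in images:
        text, confidence = predict(image)
        if confidence >= threshold:
            selected.append((image, text))
    return selected


def self_train(
    model: Any,
    unlabeled: Sequence[Any],
    train: Callable[[Any, List[Tuple[Any, str]]], Any],
    predict: Callable[[Any, Any], Prediction],
    rounds: int = 3,
    threshold: float = 0.9,
) -> Any:
    """Alternate between pseudo-labeling and retraining for a fixed number of rounds.

    `model` is assumed to be an initial model, e.g. one trained on synthetic data.
    `train` and `predict` are hypothetical callables supplied by the caller.
    """
    for _ in range(rounds):
        pseudo = select_pseudo_labels(
            unlabeled, lambda image: predict(model, image), threshold
        )
        if not pseudo:
            # No prediction is confident enough, so further rounds cannot help.
            break
        model = train(model, pseudo)
    return model
```

In this sketch the filtering step is the only safeguard against erroneous pseudo-labels, so the choice of threshold trades label quality against the amount of training data retained in each round.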