Scene text image super-resolution via textual reasoning and multiscale cross-convolution



Bibliographic details
Published in: Applied Intelligence (Dordrecht, Netherlands), 2024, Vol. 54 (2), p. 1997-2008
Authors: Yu, Lan, Li, Xiaojie, Yu, Qi, Li, Guangju, Jin, Dehu, Qi, Meng
Format: Article
Language: English
Online access: Full text
Description
Abstract: Scene text image super-resolution aims to improve the visual quality of low-resolution images and thereby the accuracy of the subsequent scene text recognition task. However, even advanced super-resolution methods that attend to text-oriented information still struggle with extremely blurred images. To address this problem, we propose a novel network based on textual reasoning and multiscale cross-convolution (TRMCC). A text structure preservation module explores the correlation of horizontal features among layers to enhance the structural similarity between the reconstructions and the corresponding high-resolution (HR) images, while the multiscale cross-convolution block progressively explores structural information in layers with various perceptual fields. In addition, motivated by how humans apply linguistic rules when reading blurred text, the text semantic reasoning module incorporates a self-attention mechanism and language-based textual reasoning to improve the accuracy of the textual prior. Comprehensive experiments on the real-scene text image dataset TextZoom demonstrate the superiority of our model over existing state-of-the-art models, especially in structural similarity and information integrity.
ISSN:0924-669X
1573-7497
DOI:10.1007/s10489-023-05251-7
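
The abstract describes a multiscale cross-convolution block that aggregates features from several receptive fields. The following PyTorch module is a purely illustrative sketch, not the paper's implementation: it assumes that "cross-convolution" means paired horizontal 1×k and vertical k×1 kernels and that "multiscale" means running them at several kernel sizes and fusing the results; the class name, kernel sizes, and residual connection are all assumptions.

```python
# Hypothetical sketch of a multiscale cross-convolution block (not the authors' code).
import torch
import torch.nn as nn

class MultiscaleCrossConvBlock(nn.Module):
    def __init__(self, channels: int, scales=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList()
        for k in scales:
            pad = k // 2
            self.branches.append(nn.Sequential(
                # horizontal 1xk conv followed by vertical kx1 conv ("cross")
                nn.Conv2d(channels, channels, kernel_size=(1, k), padding=(0, pad)),
                nn.Conv2d(channels, channels, kernel_size=(k, 1), padding=(pad, 0)),
                nn.ReLU(inplace=True),
            ))
        # 1x1 conv fuses the concatenated multiscale features back to `channels`
        self.fuse = nn.Conv2d(channels * len(scales), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [branch(x) for branch in self.branches]
        out = self.fuse(torch.cat(feats, dim=1))
        return out + x  # residual connection preserves the input features


if __name__ == "__main__":
    block = MultiscaleCrossConvBlock(channels=64)
    y = block(torch.randn(1, 64, 16, 64))  # e.g. a 16x64 text-image feature map
    print(y.shape)  # torch.Size([1, 64, 16, 64])
```

Running the branches in parallel and fusing them with a 1×1 convolution is one common way to combine several perceptual fields in a single block; the paper's actual hierarchical, progressive design is described only in the full text.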