A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Scene text image super-resolution aims to increase the resolution and
readability of the text in low-resolution images. Though significant
improvement has been achieved by deep convolutional neural networks (CNNs), it
remains difficult to reconstruct high-resolution images for spatially deformed
texts, especially rotated and curve-shaped ones. This is because the current
CNN-based methods adopt locality-based operations, which are not effective in
dealing with the variation caused by deformations. In this paper, we propose a
CNN-based Text ATTention network (TATT) to address this problem. The semantics of
the text are first extracted by a text recognition module as text prior
information. We then design a novel transformer-based module, which leverages a
global attention mechanism, to guide the text reconstruction process with the
semantics of the text prior. In addition, we propose a text structure
consistency loss to refine the visual appearance by imposing structural
consistency on the reconstructions of regular and deformed texts. Experiments
on the benchmark TextZoom dataset show that the proposed TATT not only achieves
state-of-the-art performance in terms of PSNR/SSIM metrics, but also
significantly improves the recognition accuracy in the downstream text
recognition task, particularly for text instances with multi-orientation and
curved shapes. Code is available at https://github.com/mjq11302010044/TATT. |
---|---|
DOI: | 10.48550/arxiv.2203.09388 |