HFENet: Hybrid Feature Enhancement Network for Detecting Texts in Scenes and Traffic Panels

Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2023-12, Vol. 24 (12), pp. 14200-14212
Authors: Liang, Min; Zhu, Xiaobin; Zhou, Hongyang; Qin, Jingyan; Yin, Xu-Cheng
Format: Article
Language: English
Description
Abstract: Text detection in complex scene images is a challenging task in intelligent transportation. Existing scene text detection methods often adopt multi-scale feature learning strategies to extract informative feature representations that cover objects of various sizes. However, the sampling operations inherent in multi-scale feature generation can easily impair high-frequency details (e.g., textures and boundaries), which are critical for text detection. In this work, we propose an innovative Hybrid Feature Enhancement Network (dubbed HFENet) to explicitly improve the quality of high-frequency information for detecting texts in scenes and on traffic panels. Concretely, we propose a simple yet effective self-guided feature enhancement module (SFEM) that globally lifts feature representations into highly discriminative, high-frequency-rich ones. Notably, our SFEM is pluggable and is removed after training, introducing no extra computational cost at inference. In addition, because accurately predicting boundaries is both challenging and important for text detection, we propose a novel boundary enhancement module (BEM) that explicitly strengthens local feature representations under the guidance of boundary annotations for accurate localization. Extensive experiments on multiple publicly available datasets (i.e., MSRA-TD500, CTW1500, Total-Text, Traffic Guide Panel Dataset, Chinese Road Plate Dataset, and ASAYAR_TXT) verify the state-of-the-art performance of our method.
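The abstract states that SFEM is pluggable and removed after training at no extra inference cost, and that BEM is trained under the guidance of boundary annotations. One common way to realize such a removable module is an auxiliary branch whose output feeds only a training loss, so it can be deleted at inference without changing the main forward path. The PyTorch sketch below illustrates this general pattern only; the class name, layer choices, edge-map target, and loss weight are all assumptions for illustration and do not reflect the paper's actual SFEM or BEM designs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxHighFreqBranch(nn.Module):
    """Hypothetical training-only auxiliary branch. It predicts a
    high-frequency map from backbone features; the resulting loss pushes the
    backbone to retain texture/boundary detail. Because its output feeds only
    a loss term, the branch can be removed after training with zero inference
    cost. Illustrative pattern only, not HFENet's published modules."""

    def __init__(self, channels: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, kernel_size=1),
        )

    def forward(self, feat: torch.Tensor, hf_target: torch.Tensor) -> torch.Tensor:
        # hf_target: a high-frequency ground truth (e.g., an edge or boundary
        # map derived from annotations) resized to the feature resolution.
        pred = self.head(feat)
        return F.binary_cross_entropy_with_logits(pred, hf_target)


# Hypothetical usage inside a training loop (all names are illustrative):
branch = AuxHighFreqBranch(channels=256)
feat = torch.randn(2, 256, 64, 64)                    # backbone features
edge_map = (torch.rand(2, 1, 64, 64) > 0.9).float()   # stand-in edge target
loss_aux = branch(feat, edge_map)
# total_loss = detection_loss + 0.1 * loss_aux
# At inference, `branch` is simply not called (or deleted from the model),
# so the main detection path runs unchanged.
```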
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2023.3305686