Arbitrary-shaped scene text detection by predicting distance map


Bibliographic details
Published in:	Applied Intelligence (Dordrecht, Netherlands), 2022-09, Vol.52 (12), p.14374-14386
Main authors:	Wang, Xinyu; Yi, Yaohua; Peng, Jibing; Wang, Kaili
Format:	Article
Language:	English
Subjects:	Annotations; Artificial Intelligence; Boundaries; Boxes; Computer Science; Datasets; Horizontal orientation; Machines; Manufacturing; Mechanical Engineering; Methods; Processes; Quadrilaterals; Sensors; Texts
Online access:	Full text

Description
Abstract:	Natural scene text detection is a challenging task. Existing quadrilateral bounding box regression-based methods can locate horizontal and multi-oriented texts but have great difficulty locating arbitrary-shaped texts because of the limited shape of the quadrilateral bounding box template. Previous segmentation-based methods, which conduct pixel-level classification and separate adjacent texts by predicting center lines with fixed widths, are able to locate the boundaries of arbitrary-shaped texts. However, when the width of the center lines is not appropriate, the detected text regions may stick together or break into multiple areas, giving sub-optimal results. In this paper, a novel natural scene text detector based on a distance map is proposed. The method can detect arbitrary-shaped texts more flexibly and robustly by adjusting the width of the center line. Experimental results on several datasets demonstrate that the proposed method is more competitive than the methods based on fixed-width center lines and obtains state-of-the-art or comparable performance on CTW1500, ICDAR 2015 and Total-Text. Notably, the proposed method achieves F-measures of 85.4% on the ICDAR 2015 dataset and 81.6% on the Total-Text dataset. Code is available at https://github.com/Whu-wxy/DistNet.
ISSN:	0924-669X 1573-7497
DOI:	10.1007/s10489-021-03065-z
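
Note:	The abstract's idea of adjusting the center-line width through a distance map can be illustrated with a minimal sketch. This is not the authors' DistNet code: the function names `distance_map` and `center_region`, the city-block metric, and the normalization by the global peak are all assumptions made for illustration. Each text pixel gets its distance to the nearest non-text pixel, and thresholding the normalized map at a higher or lower value yields a narrower or wider center region.

```python
from collections import deque

# 4-connected neighbourhood (city-block metric; an assumption for this sketch).
NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def distance_map(mask):
    """Return, for every text pixel (mask value 1), the city-block distance
    to the nearest background pixel. Pixels outside the image count as
    background. Background pixels keep the value 0."""
    h, w = len(mask), len(mask[0])
    dist = [[0] * w for _ in range(h)]
    queue = deque()

    # Seed the search with text pixels that touch background or the border.
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in NEIGHBORS:
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    dist[y][x] = 1
                    queue.append((y, x))
                    break

    # Multi-source breadth-first search grows the distances inward.
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBORS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and dist[ny][nx] == 0:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def center_region(dist, threshold):
    """Keep text pixels whose distance, normalized by the largest distance
    in the map, is at least `threshold` (a value in (0, 1]). A higher
    threshold gives a narrower center region."""
    peak = max(max(row) for row in dist)
    if peak == 0:
        return [[0] * len(row) for row in dist]
    return [[1 if d > 0 and d / peak >= threshold else 0 for d in row]
            for row in dist]


if __name__ == "__main__":
    text_mask = [[1] * 5 for _ in range(5)]  # toy 5x5 text region
    dmap = distance_map(text_mask)
    for row in center_region(dmap, 0.5):
        print(row)
```

In this sketch the threshold plays the role of the adjustable center-line width: lowering it widens the retained region, which helps avoid broken regions, while raising it narrows the region, which helps separate adjacent texts. A per-instance normalization would be a natural refinement; the global peak is used here only to keep the example short.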