DC-PSENet: a novel scene text detection method integrating double ResNet-based and changed channels recursive feature pyramid
Published in: | The Visual Computer 2024-06, Vol. 40 (6), p. 4473-4491 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Abstract: | With the advancement of deep learning, scene text detection is being applied in an increasing number of fields. However, because of varying distances, angles, and complex backgrounds, detection boxes for adjacent text instances in images often lie far from the text itself, i.e., localization is not accurate enough. In this paper, we propose a text detection method built on a double ResNet-based backbone and a changed channels recursive feature pyramid; the backbone integrates ResNet50-Mish and Res2Net50-Mish, and the recursive feature pyramid uses changed channels. First, scene images are fed into the ResNet50-Mish and Res2Net50-Mish branches of the double ResNet-based backbone, and their outputs are combined by a weight-based addition step to generate fused feature maps. Second, the fused feature maps are sent into the changed channels recursive feature pyramid to obtain feature maps with enhanced feature information. The segmentation results are then obtained by concatenation and convolution. Finally, the segmentation results are passed to the progressive scale expansion algorithm, which outputs the locations of text in the images. The proposed model is trained and tested on the ICDAR15 and CTW1500 benchmark datasets. In terms of precision, our method outperforms or is comparable to state-of-the-art methods. In particular, it achieves 91.53% precision on the ICDAR15 dataset and 84.89% precision on the CTW1500 dataset. |
---|---|
ISSN: | 0178-2789 1432-2315 |
DOI: | 10.1007/s00371-023-03093-5 |
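
The abstract's final step, progressive scale expansion, can be illustrated with a small sketch. This is a simplified, pure-Python rendering of the general idea (seed text instances from the smallest segmentation kernel, then grow them breadth-first into each larger kernel), not the authors' implementation. The function names `label_components` and `progressive_scale_expansion`, the 4-connectivity, and the toy masks are all assumptions made for illustration.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; 8-connectivity would also work).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label connected components of a binary mask; 0 means background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and labels[r][c] == 0:
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = labels[y][x]
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Grow instance labels from the smallest kernel into each larger one.

    `kernels` is a list of binary masks ordered from smallest to largest.
    Contested pixels go to whichever instance reaches them first.
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first search from every pixel labelled so far.
        queue = deque((r, c) for r in range(h) for c in range(w) if labels[r][c])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[y][x]
                    queue.append((ny, nx))
    return labels


# Toy example: two seeds whose full-size regions touch in the largest kernel.
small = [[0, 1, 0, 0, 0, 1, 0]]
large = [[1, 1, 1, 1, 1, 1, 1]]
result = progressive_scale_expansion([small, large])
print(result)  # [[1, 1, 1, 1, 2, 2, 2]]
```

In the toy example the two instances stay separate even though the largest mask is a single connected region, which is why this kind of expansion helps with adjacent text instances.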