Compact and Accurate Scene Text Detector

Bibliographic details
Published in:	Applied sciences 2020-03, Vol.10 (6), p.2096
Main authors:	Jeon, Minjun; Jeong, Young-Seob
Format:	Article
Language:	English
Subjects:	convolutional neural network; Efficiency; efficient scene text detection; Floating point arithmetic; Image detection; inverted residual block; Neural networks; Sensors; Studies
Online access:	Full text

Description
Summary:	Scene text detection is the task of detecting word boxes in given images. The accuracy of text detection has been greatly elevated using deep learning models, especially convolutional neural networks. Previous studies commonly aimed at developing more accurate models, but their models became computationally heavy and worse in efficiency. In this paper, we propose a new efficient model for text detection. The proposed model, namely Compact and Accurate Scene Text detector (CAST), consists of MobileNetV2 as a backbone and balanced decoder. Unlike previous studies that used standard convolutional layers as a decoder, we carefully design a balanced decoder. Through experiments with three well-known datasets, we then demonstrated that the balanced decoder and the proposed CAST are efficient and effective. The CAST was about 1.1x worse in terms of the F1 score, but 30∼115x better in terms of floating-point operations per second (FLOPS).
ISSN:	2076-3417
DOI:	10.3390/app10062096