FAST: Facilitated and Accurate Scene Text Proposals through FCN Guided Pruning


Bibliographic details
Published in: Pattern Recognition Letters, 2019-03, Vol. 119, p. 112-120
Main authors: Bazazian, Dena; Gómez, Raúl; Nicolaou, Anguelos; Gómez, Lluís; Karatzas, Dimosthenis; Bagdanov, Andrew D.
Format: Article
Language: English
Online access: Full text
Description
Summary:
• We train a Fully Convolutional Network (FCN) for text prediction in scene images and fuse it with a text proposal method.
• Significantly higher recall rates than state-of-the-art (SoA) text localization pipelines and better-quality regions are obtained.
• The resulting pipeline reduces the number of proposals, resulting in a 4× speed-up compared with the baseline.
• Our proposed method yields top performance when integrated in an end-to-end pipeline.
• Analysis and results on the standard datasets COCO-Text and ICDAR-Challenge 4 are reported.

[Display omitted]

This paper proposes a fusion of a text proposal technique with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same text recall level, thus gaining a significant speed-up. Different fusion strategies are explored, as shown in the figure, demonstrating that such an approach can yield significantly higher recall rates than state-of-the-art text localization techniques while also producing better-quality localizations. End-to-end performance shows that this recall margin leads to state-of-the-art results in scene text reading systems.

Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level, thus gaining a significant speed-up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-Text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.
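The idea of pruning text proposals with an FCN text-prediction map can be illustrated with a small sketch. It is not the authors' implementation: the summary does not specify the fusion strategies, so the scoring rule used here (mean heatmap value inside each box), the threshold, and the names `mean_text_score` and `prune_proposals` are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixel coordinates


def mean_text_score(heatmap: List[List[float]], box: Box) -> float:
    """Average the per-pixel text probability inside a box (clipped to the image)."""
    x, y, w, h = box
    rows = heatmap[max(y, 0):max(y + h, 0)]
    values = [v for row in rows for v in row[max(x, 0):max(x + w, 0)]]
    return sum(values) / len(values) if values else 0.0


def prune_proposals(heatmap: List[List[float]],
                    proposals: List[Box],
                    threshold: float = 0.5) -> List[Box]:
    """Keep proposals whose mean heatmap score reaches the threshold, best first.

    Hypothetical fusion rule: the paper explores several strategies that are
    not described in this record.
    """
    scored = [(mean_text_score(heatmap, b), b) for b in proposals]
    kept = [(s, b) for s, b in scored if s >= threshold]
    kept.sort(key=lambda sb: sb[0], reverse=True)
    return [b for _, b in kept]


# Toy 4x6 heatmap: the right half is "text", the left half is background.
heatmap = [[0.0, 0.0, 0.0, 0.9, 0.9, 0.9] for _ in range(4)]
proposals = [(0, 0, 3, 4), (3, 0, 3, 4), (2, 1, 2, 2)]
kept = prune_proposals(heatmap, proposals, threshold=0.5)
print(kept)
```

In this toy case only the proposal lying entirely on the high-probability region survives, so a downstream word classifier would need to evaluate one box instead of three. That is the kind of saving the summary attributes to the reduced number of proposals.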
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2017.08.030