Understanding movie poster: transfer-deep learning approach for graphic-rich text recognition

Graphic-rich texts are common in posters. In a movie poster, information, such as movie title, tag lines, and names of the actors, director, and production house, is available. Graphic-rich texts in movie titles represent not only sentiments but also their genre. Understanding the poster requires gr...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:The Visual computer 2022-05, Vol.38 (5), p.1645-1664
Hauptverfasser: Ghosh, Mridul, Roy, Sayan Saha, Mukherjee, Himadri, Obaidullah, Sk Md, Santosh, K. C., Roy, Kaushik
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Graphic-rich texts are common in posters. In a movie poster, information, such as movie title, tag lines, and names of the actors, director, and production house, is available. Graphic-rich texts in movie titles represent not only sentiments but also their genre. Understanding the poster requires graphic-rich text recognition. Prior to that, one requires text localization, so background and foreground graffiti can be well segmented. In this paper, we propose a transfer learning-based approach for graphic-rich text localization, which was tuned by introducing reverse augmentation and rotated/inclined rectangle drawing technique. A convolution neural network-based model is then applied to identify their corresponding scripts. In our experiments, on a newly developed dataset (available upon request) that is composed of movie posters with multiple scripts of 1154 images, we achieved an average accuracy of 99.30%. Our results outperformed previously developed tools that are relying on handcrafted features.
ISSN:0178-2789
1432-2315
DOI:10.1007/s00371-021-02094-6