Multi-scale Gated Fully Convolutional DenseNets for semantic labeling of historical newspaper images
Published in: | Pattern Recognition Letters 2020-03, Vol.131, p.435-441 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Abstract: | • A new fully convolutional network for semantic labeling of historical newspapers. • A multi-scale gated block inspired by dense nets and gating mechanisms is proposed. • A multi-scale analysis strengthens the recognition of fields of various sizes. • A gating mechanism provides a focus on features of interest.
Historical newspaper image analysis is a challenging task due to the complex layout of newspapers and its variability across collections. While traditional approaches are rule-based methods with many successive steps, recent work shows that deep learning approaches can successfully provide a pixel labeling of the various fields occurring in a page. This allows the document structure to be extracted automatically and the different semantic entities to be accessed. Recent improvements proposed to strengthen the capacity of convolutional neural networks, such as gating mechanisms, may also apply well to the task at hand. In this respect, we propose a fully convolutional network (FCN) architecture that outputs a pixel labeling of the various semantic entities that occur in historical newspaper images. Our model is based on a novel Multi-Scale Gated Block (MSGB) architecture, made of dense connections and gating mechanisms that handle a multi-scale analysis of the input image with self-attention. Evaluations conducted on four historical newspaper datasets including up to 11 semantic classes show that our proposal outperforms standard FCN architectures. |
---|---|
ISSN: | 0167-8655, 1872-7344 |
DOI: | 10.1016/j.patrec.2020.01.026 |
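
The abstract describes the Multi-Scale Gated Block only at a high level: multi-scale analysis, gating, and dense connections. The toy sketch below illustrates how these three ideas can fit together; it is not the authors' implementation. It assumes that the scales come from dilated convolutions, works on a 1D signal rather than an image, and uses fixed kernels instead of learned weights. All function names (`dilated_conv1d`, `gated_branch`, `multi_scale_gated_block`) are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, used here to produce gate values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with the given dilation."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * dilation  # dilation spreads the kernel taps apart
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def gated_branch(signal, feat_kernel, gate_kernel, dilation):
    """Features at one scale, multiplied element-wise by a sigmoid gate."""
    feats = dilated_conv1d(signal, feat_kernel, dilation)
    gates = [sigmoid(g) for g in dilated_conv1d(signal, gate_kernel, dilation)]
    return [f * g for f, g in zip(feats, gates)]


def multi_scale_gated_block(signal, feat_kernel, gate_kernel, dilations=(1, 2, 4)):
    """Stack the input with one gated channel per dilation rate.

    Keeping the input next to the new channels mimics, in a simplified way,
    the concatenation used by dense connections (assumption, see lead-in).
    """
    channels = [list(signal)]
    for d in dilations:
        channels.append(gated_branch(signal, feat_kernel, gate_kernel, d))
    return channels


if __name__ == "__main__":
    # An impulse makes the different receptive fields easy to see.
    for channel in multi_scale_gated_block([0, 0, 1, 0, 0], [1, 1, 1], [0, 0, 0]):
        print(channel)
```

With an all-zero gate kernel every gate equals 0.5, so each branch simply halves its features. In a trained network the gate would be learned and would suppress or pass features depending on position, which is the "focus on features of interest" mentioned in the highlights.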