Integrated algorithms for newspaper page decomposition and article tracking
Main authors: | , , , , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Abstract: | The conversion of newspaper pages into digital resources is an important task that greatly contributes to the preservation of and access to newspaper archives. In this paper, an integrated methodology is presented for segmenting newspaper pages and identifying newspaper articles. In the first stage, a succession of image processing and document analysis algorithms is employed for segmenting newspaper page images into various objects (text, images and drawings, titles). A rule-based approach is subsequently applied to the objects identified during the page segmentation phase for reconstructing individual articles. Experimental results, obtained from a large testbed of old newspaper issues, are presented which clearly demonstrate the applicability of our integrated approach to successful newspaper page segmentation and identification of newspaper articles. |
DOI: | 10.1109/ICDAR.1999.791849 |