Setting up a competition framework for the evaluation of structure extraction from OCR-ed books

This paper describes the setup of the Book Structure Extraction competition run at ICDAR 2009. The goal of the competition was to evaluate and compare automatic techniques for deriving structure information from digitized books, which could then be used to aid navigation inside the books. More speci...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:International journal on document analysis and recognition 2011-03, Vol.14 (1), p.45-52
Hauptverfasser: Doucet, Antoine, Kazai, Gabriella, Dresevic, Bodin, Uzelac, Aleksandar, Radakovic, Bogdan, Todic, Nikola
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper describes the setup of the Book Structure Extraction competition run at ICDAR 2009. The goal of the competition was to evaluate and compare automatic techniques for deriving structure information from digitized books, which could then be used to aid navigation inside the books. More specifically, the task that participants faced was to construct hyperlinked tables of contents for a collection of 1,000 digitized books. This paper describes the setup of the competition and its challenges. It introduces and discusses the book collection used in the task, the collaborative construction of the ground truth, the evaluation measures, and the evaluation results. The paper also introduces a data set to be used freely for research evaluation purposes.
ISSN:1433-2833
1433-2825
DOI:10.1007/s10032-010-0127-3