Enriching Documents with Context Terms from Cross-Domain Ontologies

Bibliographic details
Published in: Information and Media Technologies, 2015, Vol. 10(2), pp. 294-304
Main authors: KÖHNCKE, Benjamin; BALKE, Wolf-Tilo
Format: Article
Language: English
Online access: Full text
Description
Summary: Entity-centric search has become a demanding problem for many domains on the Web. In particular, the suitable contextualization of result documents poses challenges in terms of selecting the most adequate indexing terms for later retrieval. This holds even more if no generally recognized ontologies for the respective domain are available. In this paper, we show that cross-domain ontology terms are actually more useful for indexing than salient keywords taken from the documents. Moreover, learning typical contexts for groups of entities from collections indexed by strong cross-domain ontologies can considerably improve retrieval effectiveness. Our extensive experiments confirm these results on real-world document collections from the areas of chemistry and computer science. In fact, our evaluation in different document retrieval scenarios shows a substantial increase in retrieval precision: up to 87% using documents annotated with cross-domain ontology terms, as compared to 53% for BM25 searches and 43% for documents annotated with Wikipedia categories.
ISSN: 1881-0896
DOI: 10.11185/imt.10.294