Self Managing Top-k (Summary, Keyword) Indexes in XML Retrieval

Bibliographic details
Main authors: Consens, M.P., Xin Gu, Kanza, Y., Rizzolo, F.
Format: Conference proceedings
Language: English
Description
Summary: Retrieval queries that combine structural constraints with keyword search represent a significant challenge to XML data management systems. Queries are expected to be answered as efficiently and effectively as in traditional keyword search, while satisfying additional constraints. Several XML-retrieval systems support answering queries exhaustively by storing both structural indexes and a keyword index. Other systems answer top-k queries efficiently by constructing indexes in which keyword scores, for some structural elements, are stored in relevance order, enabling approaches such as the threshold algorithm (TA). In this paper we describe TReX, an XML retrieval system that can exploit multiple structural summaries (including newly defined ones). TReX can also self-manage small, redundant indexes to speed up the evaluation of workloads of top-k queries. The redundant indexes are maintained to enable TReX to select an evaluation strategy among three (and potentially more) retrieval methods. We provide experimental evidence that using several strategies improves the efficiency of query evaluation, since none of the retrieval methods outperforms the others in all cases.
DOI: 10.1109/ICDEW.2007.4400999
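
The summary mentions the threshold algorithm (TA) as one way to answer top-k queries over indexes that store keyword scores in relevance order. The following is a minimal sketch of the generic TA with sum aggregation, not TReX's implementation. The function name `threshold_topk`, the `postings` layout (keyword mapped to per-element scores), and the example scores are all hypothetical, and scores are assumed to be non-negative, with a missing entry counting as 0.

```python
from typing import Dict, List, Tuple


def threshold_topk(postings: Dict[str, Dict[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k elements with the highest summed score (hypothetical helper).

    postings maps each keyword to {element_id: score}; scores are assumed >= 0.
    """
    # Sorted access: one list per keyword, ordered by descending score.
    sorted_lists = [
        sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        for scores in postings.values()
    ]
    # Random access: look up any element's score for any keyword.
    random_access = list(postings.values())

    totals: Dict[str, float] = {}  # aggregated score of every element seen so far
    max_depth = max((len(lst) for lst in sorted_lists), default=0)

    for depth in range(max_depth):
        threshold = 0.0
        for lst in sorted_lists:
            if depth >= len(lst):
                continue  # exhausted list: unseen elements score 0 here
            elem, score = lst[depth]
            threshold += score  # best score an unseen element could still get
            if elem not in totals:
                totals[elem] = sum(ra.get(elem, 0.0) for ra in random_access)

        ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        # Stop once k seen elements score at least the threshold.
        if len(ranked) >= k and ranked[-1][1] >= threshold:
            return ranked

    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical scores for two keywords over three elements.
example = {
    "xml": {"a": 0.875, "b": 0.75, "c": 0.125},
    "index": {"b": 0.75, "c": 0.5, "a": 0.25},
}
top2 = threshold_topk(example, 2)
```

With `k=1` the loop stops after reading two entries per list, because element `b` already scores 1.5 while the threshold has dropped to 1.25. That early termination is what relevance-ordered indexes make possible.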