DivDB: a system for diversifying query results

With the availability of very large databases, an exploratory query can easily lead to a vast answer set, typically based on an answer's relevance (i.e., top-k, tf-idf ) to the user query. Navigating through such an answer set requires huge effort and users give up after perusing through the fi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Proceedings of the VLDB Endowment 2011-08, Vol.4 (12), p.1395-1398
Hauptverfasser: Vieira, Marcos R., Razente, Humberto L., Barioni, Maria C. N., Hadjieleftheriou, Marios, Srivastava, Divesh, Traina, Caetano, Tsotras, Vassilis J.
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:With the availability of very large databases, an exploratory query can easily lead to a vast answer set, typically based on an answer's relevance (i.e., top-k, tf-idf ) to the user query. Navigating through such an answer set requires huge effort and users give up after perusing through the first few answers, thus some interesting answers hidden further down the answer set can easily be missed. An approach to address this problem is to present the user with the most diverse among the answers based on some diversity criterion. In this demonstration we present DivDB, a system we built to provide query result diversification both for advanced and novice users. For the experienced users, who may want to test the performance of existing and new algorithms, we provide an SQL-based extension to formulate queries with diversification. As for the novice users, who may be more interested in the result rather than how to tune the various algorithms' parameters, the DivDB system allows the user to provide a "hint" to the optimizer on speed vs. quality of result. Moreover, novice users can use an interface to dynamically change the tradeoff value between relevance and diversity in the result, and thus visually inspect the result as they interact with this parameter. This is a great feature to the end user because finding a good tradeoff value is a very hard task and it depends on several variables (i.e., query parameters, evaluation algorithms, and dataset properties). In this demonstration we show a study of the DivDB system with two image databases that contain many images of the same object under different settings (e.g., different camera angle). We show how the DivDB helps users to iteratively inspect diversification in the query result, without the need to know how to tune the many different parameters of the several existing algorithms in the DivDB system.
ISSN:2150-8097
2150-8097
DOI:10.14778/3402755.3402779