Selective Cluster Presentation on the Search Results Page

Bibliographic details
Published in: ACM Transactions on Information Systems, 2018-07, Vol. 36 (3), p. 1-42
Main authors: Levi, Or; Guy, Ido; Raiber, Fiana; Kurland, Oren
Format: Article
Language: English
Online access: Full text
Description
Abstract: Web search engines present, for some queries, a cluster of results from the same specialized domain (“vertical”) on the search results page (SERP). We introduce a comprehensive analysis of the presentation of such clusters from seven different verticals based on the logs of a commercial Web search engine. This analysis reveals several unique characteristics, such as size, rank, and clicks, of result clusters from community question-and-answer websites. The study of properties of this result cluster, specifically as part of the SERP, has received little attention in previous work. Our analysis also motivates the pursuit of a long-standing challenge in ad hoc retrieval, namely, selective cluster retrieval. In our setting, the specific challenge is to select for presentation the documents most highly ranked either by a cluster-based approach (those in the top-retrieved cluster) or by a document-based approach. We address this classification task by representing queries with features based on those utilized for ranking the clusters, query-performance predictors, and properties of the document-clustering structure. Empirical evaluation performed with TREC data shows that our approach outperforms a recently proposed state-of-the-art cluster-based document-retrieval method as well as state-of-the-art document-retrieval methods that do not account for inter-document similarities.
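
The selection step described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the container `QueryCandidates`, the functions `select_results` and `make_linear_classifier`, and the toy linear decision rule with placeholder weights are all hypothetical names introduced here for illustration. The sketch only assumes that each query comes with a precomputed feature vector and two candidate result lists, one from a cluster-based approach and one from a document-based approach.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class QueryCandidates:
    """Two candidate result lists for one query (hypothetical container)."""
    features: Sequence[float]   # query-level features, assumed precomputed
    cluster_based: List[str]    # documents of the top-retrieved cluster
    document_based: List[str]   # top documents from a document-based ranking


def select_results(
    query: QueryCandidates,
    prefers_cluster: Callable[[Sequence[float]], bool],
) -> List[str]:
    """Return whichever candidate list the classifier prefers for this query."""
    if prefers_cluster(query.features):
        return query.cluster_based
    return query.document_based


def make_linear_classifier(weights: Sequence[float], bias: float = 0.0):
    """Build a toy linear decision rule; the weights are placeholders, not learned values."""
    def classify(features: Sequence[float]) -> bool:
        if len(features) != len(weights):
            raise ValueError("feature/weight length mismatch")
        score = sum(w * x for w, x in zip(weights, features)) + bias
        return score > 0.0
    return classify
```

In practice the decision rule would be a trained classifier over the feature groups the abstract lists (cluster-ranking features, query-performance predictors, and clustering-structure properties); the linear rule here merely stands in for it.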
ISSN: 1046-8188, 1558-2868
DOI: 10.1145/3158672