Entropy-based clustering for improving document re-ranking

Bibliographic details
Main authors:	Chong Teng, Yanxiang He, Donghong Ji, Cheng Zhou, Yixuan Geng, Shu Chen
Format:	Conference proceeding
Language:	English
Subjects:	between-cluster entropy; Clustering; component; Concrete; Document re-ranking; Entropy; Helium; Information retrieval; Large-scale systems; Mathematics; Statistics; Text analysis; Thesauri; Vocabulary; within-cluster entropy

Description
Abstract:	Document re-ranking sits between initial retrieval and query expansion in an information retrieval system. In this paper, we propose an entropy-based clustering approach for document re-ranking. The value of within-cluster entropy determines whether two classes should be merged, and the value of between-cluster entropy determines how many clusters are reasonable. The next step is to find a suitable cluster in the clustering result, use it to construct a pseudo-labeled document, and conduct document re-ranking as in our previous method. We focus on the clustering strategy for documents after initial retrieval. Experiments with NTCIR-5 data show that the approach can improve the performance of initial retrieval and helps improve the quality of document re-ranking.
DOI:	10.1109/ICICISYS.2009.5358089
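
The merge rule described in the abstract (within-cluster entropy decides whether two classes are merged) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give their entropy definitions, so the sketch assumes that within-cluster entropy is the Shannon entropy of a cluster's pooled term frequencies, and it uses a fixed threshold, max_within_entropy, as a stand-in for the between-cluster criterion that decides how many clusters remain. The function names and the threshold parameter are hypothetical.

```python
import math
from collections import Counter


def term_entropy(counts):
    """Shannon entropy (in bits) of a term-frequency distribution."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def entropy_merge_clustering(docs, max_within_entropy):
    """Greedy agglomerative clustering over whitespace-tokenized documents.

    Assumption (not from the paper): two clusters are merged when the
    entropy of their pooled term counts is the lowest among all pairs
    and does not exceed max_within_entropy. Merging stops otherwise.
    Returns a list of clusters, each a sorted list of document indices.
    """
    # Each cluster is (pooled term counts, member document indices).
    clusters = [(Counter(doc.split()), [i]) for i, doc in enumerate(docs)]

    while len(clusters) > 1:
        # Find the pair whose merged term distribution has the lowest entropy.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                h = term_entropy(clusters[a][0] + clusters[b][0])
                if best is None or h < best[0]:
                    best = (h, a, b)

        h, a, b = best
        if h > max_within_entropy:
            break  # no remaining pair is coherent enough to merge

        merged = (clusters[a][0] + clusters[b][0], clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)

    return [sorted(members) for _, members in clusters]


if __name__ == "__main__":
    # Toy documents standing in for the top results of an initial retrieval.
    retrieved = [
        "apple apple banana",
        "banana apple",
        "car engine wheel",
        "engine car car",
    ]
    print(entropy_merge_clustering(retrieved, max_within_entropy=1.6))
```

In a re-ranking pipeline of the kind the abstract outlines, one of the returned clusters would then be selected to build the pseudo-labeled document; that selection step is not sketched here because the record does not describe it.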