Genetic-based approaches in ranking function discovery and optimization in information retrieval — A framework

Bibliographic details
Published in: Decision Support Systems 2009-11, Vol. 47 (4), p. 398-407
Main authors: Fan, Weiguo; Pathak, Praveen; Zhou, Mi
Format: Article
Language: English
Online access: Full text
Description
Abstract: An Information Retrieval (IR) system consists of a document collection, queries issued by users, and the matching/ranking functions used to rank documents in the predicted order of relevance for a given query. A variety of ranking functions have been used in the literature. However, studies show that these functions do not perform consistently well across different contexts. In this paper we propose a two-stage integrated framework for discovering and optimizing ranking functions used in IR. The first stage, the discovery process, applies Genetic Programming techniques to the structural and statistical information available in HTML documents to yield novel ranking functions. In the second stage, the optimization process, the document retrieval scores of various well-known ranking functions are combined using Genetic Algorithms. The overall discovery and optimization framework is tested on the well-known TREC collection of web documents for both the ad-hoc retrieval task and the routing task. Using our framework, we observe a significant increase in retrieval performance compared to some of the well-known standalone ranking functions.
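The second stage described in the abstract, combining the scores of several ranking functions with a Genetic Algorithm, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the weighted-sum combination, average precision as the fitness measure, the selection, crossover and mutation operators, the names `average_precision`, `fitness` and `evolve_weights`, and the toy scores for two hypothetical ranking functions.

```python
import random

def average_precision(ranked_ids, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def fitness(weights, scores, relevant):
    """Rank documents by the weighted sum of their scores, then evaluate the ranking."""
    combined = {d: sum(w * s for w, s in zip(weights, vals))
                for d, vals in scores.items()}
    # Ties are broken by document id so the ranking is deterministic.
    ranking = sorted(combined, key=lambda d: (-combined[d], d))
    return average_precision(ranking, relevant)

def evolve_weights(scores, relevant, n_funcs, pop_size=20, generations=30, seed=0):
    """Assumed GA: keep the better half, one-point crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_funcs)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, scores, relevant), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_funcs) if n_funcs > 1 else 0
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_funcs)
            # Mutate one weight and clip it back into [0, 1].
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: fitness(w, scores, relevant))

# Toy data (hypothetical): per-document scores from two ranking functions.
scores = {
    "d1": (0.9, 0.2),
    "d2": (0.8, 0.1),
    "d3": (0.3, 0.9),
    "d4": (0.1, 0.4),
}
relevant = {"d1", "d3"}

best = evolve_weights(scores, relevant, n_funcs=2)
print("weights:", best, "average precision:", fitness(best, scores, relevant))
```

In this toy setting each function alone places one relevant document first and the other third, while an equal weighting places both relevant documents at the top. This is the kind of gain a combination search is meant to find. A realistic setup would average the fitness over many training queries and evaluate on held-out queries.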
ISSN: 0167-9236, 1873-5797
DOI: 10.1016/j.dss.2009.04.005