Word-embedding-based pseudo-relevance feedback for Arabic information retrieval

Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant doc...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of information science 2019-08, Vol.45 (4), p.429-442
Hauptverfasser: El Mahdaouy, Abdelkader, El Alaoui, Saïd Ouatik, Gaussier, Eric
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively.
ISSN:0165-5515
1741-6485
DOI:10.1177/0165551518792210