A Pseudo-relevance feedback framework combining relevance matching and semantic matching for information retrieval

Bibliographic details
Published in: Information Processing & Management, 2020-11, Vol. 57 (6), p. 102342, Article 102342
Main authors: Wang, Junmei; Pan, Min; He, Tingting; Huang, Xiang; Wang, Xueyan; Tu, Xinhui
Format: Article
Language: English
Online access: Full text
Description
Summary:
• Relevance matching plays a more important role than semantic matching in information retrieval.
• The proposed framework, which combines relevance matching and semantic matching, is more effective than using either relevance matching or semantic matching alone.
• Five enhanced models are generated by merging the framework with probability-based PRF models and language-model-based PRF models.
• Our PRF framework combines relevance matching and semantic matching to improve the quality of the feedback documents.
Pseudo-relevance feedback (PRF) is a well-known method for addressing the mismatch between query intention and query representation. Most current PRF methods rank feedback documents by term-level relevance matching only, which can leave a semantic gap between the query representation and the document representation. In this work, a PRF framework that combines relevance matching and semantic matching is proposed to improve the quality of feedback documents. Specifically, in the first round of retrieval, we propose a reranking mechanism in which exact-term information and the semantic similarity between the query and document representations are calculated by bidirectional encoder representations from transformers (BERT); this mechanism reduces the semantic gap and improves the quality of feedback documents. Then, our proposed PRF framework processes the results of the first round of retrieval by using probability-based PRF methods and language-model-based PRF methods. Finally, we conduct extensive experiments on four Text Retrieval Conference (TREC) datasets. The results show that the proposed models outperform the robust baseline models in terms of mean average precision (MAP) and precision at 10 (P@10). The results also highlight that combining relevance matching and semantic matching is more effective than using either one alone for improving the quality of feedback documents.
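The reranking step described in the summary can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: the two signals are mixed by linear interpolation with a hypothetical weight `alpha`, the BERT-based semantic similarities are replaced by precomputed toy numbers, and the feedback step is reduced to picking the most frequent non-query terms from the top-ranked documents, which only stands in for the probability-based and language-model-based PRF models the authors use. All function names and values are illustrative.

```python
from collections import Counter


def min_max(scores):
    """Scale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def rerank(relevance, semantic, alpha=0.7):
    """Rank documents by alpha * relevance + (1 - alpha) * semantic.

    `alpha` is a hypothetical mixing weight, not a value from the article.
    """
    rel, sem = min_max(relevance), min_max(semantic)
    combined = {d: alpha * rel[d] + (1 - alpha) * sem.get(d, 0.0) for d in rel}
    return sorted(combined, key=combined.get, reverse=True)


def expansion_terms(ranking, doc_terms, query_terms, k=2, n_terms=3):
    """Pick the most frequent non-query terms from the top-k feedback documents.

    A simplification standing in for a real PRF term-weighting model.
    """
    counts = Counter()
    for doc in ranking[:k]:
        counts.update(t for t in doc_terms[doc] if t not in query_terms)
    return [term for term, _ in counts.most_common(n_terms)]


# Toy inputs (assumed): first-round term-matching scores and
# precomputed semantic similarities in place of BERT outputs.
relevance = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
semantic = {"d1": 0.20, "d2": 0.90, "d3": 0.80}
doc_terms = {
    "d1": ["neural", "ranking", "bert", "ranking"],
    "d2": ["bert", "feedback", "query", "expansion"],
    "d3": ["cooking", "recipes"],
}

ranking = rerank(relevance, semantic, alpha=0.5)
terms = expansion_terms(ranking, doc_terms, {"query"}, k=2, n_terms=2)
print(ranking)
print(terms)
```

With `alpha=1.0` the ranking falls back to pure relevance matching, and with `alpha=0.0` to pure semantic matching, which makes it easy to compare the combined ranking against either signal on its own.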
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2020.102342