Word-embedding-based pseudo-relevance feedback for Arabic information retrieval
Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant doc...
Published in: | Journal of information science 2019-08, Vol.45 (4), p.429-442 |
---|---|
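Main authors: | El Mahdaouy, Abdelkader ; El Alaoui, Saïd Ouatik ; Gaussier, Eric |
Format: | Article |
Language: | eng |
Subjects: | Computer Science ; Embedding ; Feedback ; Information Retrieval ; Queries ; Query expansion ; Relevance feedback ; Similarity |
Online access: | Full text |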
container_end_page | 442 |
---|---|
container_issue | 4 |
container_start_page | 429 |
container_title | Journal of information science |
container_volume | 45 |
creator | El Mahdaouy, Abdelkader ; El Alaoui, Saïd Ouatik ; Gaussier, Eric |
description | Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively. |
doi_str_mv | 10.1177/0165551518792210 |
format | Article |
fullrecord | <record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_02132288v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_0165551518792210</sage_id><sourcerecordid>2252079434</sourcerecordid><originalsourceid>FETCH-LOGICAL-c462t-17963a356cbbd9ef478f7653b7d12fa2eb7b7f09ff1acf380493eef166b1cd4c3</originalsourceid><addsrcrecordid>eNp1kM1Lw0AQxRdRsH7cPQY8eVjd2Y9scixFrVDoRfEY9mO2prZJ3U0L_vcmRBQET8PM-73H8Ai5AnYLoPUdg1wpBQoKXXIO7IhMQEuguSzUMZkMMh30U3KW0poxpkohJ2T52kZPcWvR-7pZUWsS-myXcO9bGnGDB9M4zAKit8a9Z6GN2TQaW7usbvpla7q6bbKIXax7dnNBToLZJLz8nufk5eH-eTani-Xj02y6oE7mvKOgy1wYoXJnrS8xSF0EnSthtQceDEerrQ6sDAGMC6JgshSIAfLcgvPSiXNyM-a-mU21i_XWxM-qNXU1ny6q4cY4CM6L4gA9ez2yu9h-7DF11brdx6Z_r-JccaZLKWRPsZFysU0pYviJBVYNFVd_K-4tdLQks8Lf0H_5L0_0eqs</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2252079434</pqid></control><display><type>article</type><title>Word-embedding-based pseudo-relevance feedback for Arabic information retrieval</title><source>SAGE Complete A-Z List</source><creator>El Mahdaouy, Abdelkader ; El Alaoui, Saïd Ouatik ; Gaussier, Eric</creator><creatorcontrib>El Mahdaouy, Abdelkader ; El Alaoui, Saïd Ouatik ; Gaussier, Eric</creatorcontrib><description>Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). 
The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively.</description><identifier>ISSN: 0165-5515</identifier><identifier>EISSN: 1741-6485</identifier><identifier>DOI: 10.1177/0165551518792210</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Computer Science ; Embedding ; Feedback ; Information Retrieval ; Queries ; Query expansion ; Relevance feedback ; Similarity</subject><ispartof>Journal of information science, 2019-08, Vol.45 (4), p.429-442</ispartof><rights>The Author(s) 2018</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c462t-17963a356cbbd9ef478f7653b7d12fa2eb7b7f09ff1acf380493eef166b1cd4c3</citedby><cites>FETCH-LOGICAL-c462t-17963a356cbbd9ef478f7653b7d12fa2eb7b7f09ff1acf380493eef166b1cd4c3</cites><orcidid>0000-0002-8858-3233 ; 0000-0003-4281-2472</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://journals.sagepub.com/doi/pdf/10.1177/0165551518792210$$EPDF$$P50$$Gsage$$H</linktopdf><linktohtml>$$Uhttps://journals.sagepub.com/doi/10.1177/0165551518792210$$EHTML$$P50$$Gsage$$H</linktohtml><link.rule.ids>230,314,777,781,882,21800,27905,27906,43602,43603</link.rule.ids><backlink>$$Uhttps://hal.science/hal-02132288$$DView record in 
HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>El Mahdaouy, Abdelkader</creatorcontrib><creatorcontrib>El Alaoui, Saïd Ouatik</creatorcontrib><creatorcontrib>Gaussier, Eric</creatorcontrib><title>Word-embedding-based pseudo-relevance feedback for Arabic information retrieval</title><title>Journal of information science</title><description>Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. 
Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively.</description><subject>Computer Science</subject><subject>Embedding</subject><subject>Feedback</subject><subject>Information Retrieval</subject><subject>Queries</subject><subject>Query expansion</subject><subject>Relevance feedback</subject><subject>Similarity</subject><issn>0165-5515</issn><issn>1741-6485</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNp1kM1Lw0AQxRdRsH7cPQY8eVjd2Y9scixFrVDoRfEY9mO2prZJ3U0L_vcmRBQET8PM-73H8Ai5AnYLoPUdg1wpBQoKXXIO7IhMQEuguSzUMZkMMh30U3KW0poxpkohJ2T52kZPcWvR-7pZUWsS-myXcO9bGnGDB9M4zAKit8a9Z6GN2TQaW7usbvpla7q6bbKIXax7dnNBToLZJLz8nufk5eH-eTani-Xj02y6oE7mvKOgy1wYoXJnrS8xSF0EnSthtQceDEerrQ6sDAGMC6JgshSIAfLcgvPSiXNyM-a-mU21i_XWxM-qNXU1ny6q4cY4CM6L4gA9ez2yu9h-7DF11brdx6Z_r-JccaZLKWRPsZFysU0pYviJBVYNFVd_K-4tdLQks8Lf0H_5L0_0eqs</recordid><startdate>20190801</startdate><enddate>20190801</enddate><creator>El Mahdaouy, Abdelkader</creator><creator>El Alaoui, Saïd Ouatik</creator><creator>Gaussier, Eric</creator><general>SAGE Publications</general><general>Bowker-Saur Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>1XC</scope><orcidid>https://orcid.org/0000-0002-8858-3233</orcidid><orcidid>https://orcid.org/0000-0003-4281-2472</orcidid></search><sort><creationdate>20190801</creationdate><title>Word-embedding-based pseudo-relevance feedback for Arabic information retrieval</title><author>El Mahdaouy, Abdelkader ; El Alaoui, Saïd Ouatik ; Gaussier, 
Eric</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c462t-17963a356cbbd9ef478f7653b7d12fa2eb7b7f09ff1acf380493eef166b1cd4c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Computer Science</topic><topic>Embedding</topic><topic>Feedback</topic><topic>Information Retrieval</topic><topic>Queries</topic><topic>Query expansion</topic><topic>Relevance feedback</topic><topic>Similarity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>El Mahdaouy, Abdelkader</creatorcontrib><creatorcontrib>El Alaoui, Saïd Ouatik</creatorcontrib><creatorcontrib>Gaussier, Eric</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>Journal of information science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>El Mahdaouy, Abdelkader</au><au>El Alaoui, Saïd Ouatik</au><au>Gaussier, Eric</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Word-embedding-based pseudo-relevance feedback for Arabic information retrieval</atitle><jtitle>Journal of information 
science</jtitle><date>2019-08-01</date><risdate>2019</risdate><volume>45</volume><issue>4</issue><spage>429</spage><epage>442</epage><pages>429-442</pages><issn>0165-5515</issn><eissn>1741-6485</eissn><abstract>Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from top k pseudo-relevant documents. Although standard PRF models have been proven effective to deal with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/0165551518792210</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-8858-3233</orcidid><orcidid>https://orcid.org/0000-0003-4281-2472</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0165-5515 |
ispartof | Journal of information science, 2019-08, Vol.45 (4), p.429-442 |
issn | 0165-5515 ; 1741-6485 |
language | eng |
recordid | cdi_hal_primary_oai_HAL_hal_02132288v1 |
source | SAGE Complete A-Z List |
subjects | Computer Science ; Embedding ; Feedback ; Information Retrieval ; Queries ; Query expansion ; Relevance feedback ; Similarity |
title | Word-embedding-based pseudo-relevance feedback for Arabic information retrieval |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T07%3A10%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Word-embedding-based%20pseudo-relevance%20feedback%20for%20Arabic%20information%20retrieval&rft.jtitle=Journal%20of%20information%20science&rft.au=El%20Mahdaouy,%20Abdelkader&rft.date=2019-08-01&rft.volume=45&rft.issue=4&rft.spage=429&rft.epage=442&rft.pages=429-442&rft.issn=0165-5515&rft.eissn=1741-6485&rft_id=info:doi/10.1177/0165551518792210&rft_dat=%3Cproquest_hal_p%3E2252079434%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2252079434&rft_id=info:pmid/&rft_sage_id=10.1177_0165551518792210&rfr_iscdi=true |
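
The description field above says that expansion terms are selected using their distribution in the top pseudo-relevant documents together with their word-embedding similarity to the original query terms. The sketch below illustrates that general idea only; it is not the authors' model. The function names, the linear interpolation weight `lam`, the averaging of cosine similarities, and the toy English tokens and two-dimensional vectors are all assumptions made for illustration. The `robustness_index` helper uses the commonly cited definition (queries improved minus queries hurt, divided by the number of queries); whether the article uses exactly this form is also an assumption.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expansion_terms(query, top_docs, vectors, n_terms=2, lam=0.5):
    """Rank candidate terms from the top pseudo-relevant documents.

    Hypothetical scoring: lam * relative frequency in the feedback set
    plus (1 - lam) * mean cosine similarity to the query terms.
    """
    counts = Counter(t for doc in top_docs for t in doc if t not in query)
    total = sum(counts.values())
    scores = {}
    for term, count in counts.items():
        dist = count / total  # distribution in the feedback documents
        sims = [cosine(vectors[term], vectors[q])
                for q in query if term in vectors and q in vectors]
        sim = sum(sims) / len(sims) if sims else 0.0
        scores[term] = lam * dist + (1 - lam) * sim
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n_terms]


def robustness_index(baseline_ap, expanded_ap):
    """(queries improved - queries hurt) / number of queries."""
    improved = sum(e > b for b, e in zip(baseline_ap, expanded_ap))
    hurt = sum(e < b for b, e in zip(baseline_ap, expanded_ap))
    return (improved - hurt) / len(baseline_ap)


# Toy data (illustrative only, not from the article).
query = ["car"]
top_docs = [["car", "engine", "road"], ["car", "engine", "banana"]]
vectors = {"car": [1.0, 0.0], "engine": [0.9, 0.1],
           "road": [0.6, 0.8], "banana": [0.0, 1.0]}
print(expansion_terms(query, top_docs, vectors))
```

With `lam=1.0` the ranking falls back to frequency alone, which corresponds to the situation the abstract criticises: terms chosen without regard to their similarity to the query.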