Query expansion based on term selection for Hindi – English cross lingual IR
Published in: | Journal of King Saud University. Computer and information sciences 2020-03, Vol.32 (3), p.310-319 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Retrieving accurate information from the collection of information available on the web in a cross-lingual communication environment is a very difficult task. To retrieve information, the user specifies the information need in the form of a query. Sometimes the query cannot express that need specifically enough because of ambiguity or untranslated query words. This problem can be reduced by expanding the query with other suitable words that make it more specific. The purpose of query expansion (QE) is to improve the performance and quality of retrieved information in cross-lingual information retrieval (CLIR). In this paper, QE is explored for Hindi-English CLIR, in which Hindi queries are used to search English documents. We used Okapi BM25 for document ranking and then expanded the translated queries using Term Selection Value (TSV). All experiments were performed on the FIRE 2012 dataset by analysing the impact of the occurrence of terms in the top-3 ranked documents. Our results show that the relevancy of retrieved results for Hindi-English CLIR is 51.33% when QE is performed by adding the lowest-frequency term from the corpus of top-3 ranked documents, which is higher than the results obtained before and after QE (i.e. Case 1 and Case 2). |
---|---|
ISSN: | 1319-1578 2213-1248 |
DOI: | 10.1016/j.jksuci.2017.09.002 |
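
The pipeline summarized in the abstract (rank documents with Okapi BM25, then expand the translated query with a term taken from the top-3 ranked documents) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the TSV formula, so the sketch substitutes raw term frequency within the top-ranked documents as the selection criterion and adds the lowest-frequency term, as the abstract describes. The function names `bm25_scores` and `expand_query`, the parameters `k1=1.2` and `b=0.75`, the non-negative IDF variant, the alphabetical tie-break, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are common default values, not values taken from the paper.
    """
    if not docs:
        return []
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            # Non-negative IDF variant (assumption; other variants exist).
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(query, docs, top_k=3):
    """Append one term from the top_k BM25-ranked documents to the query.

    Stand-in for TSV (assumption): pick the non-query term with the lowest
    frequency in the top_k documents, breaking ties alphabetically.
    """
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    top = ranked[:top_k]
    counts = Counter(t for i in top for t in docs[i] if t not in query)
    if not counts:
        return list(query)
    term = min(counts, key=lambda t: (counts[t], t))
    return list(query) + [term]


# Toy English documents and an already-translated query (hypothetical data).
docs = [
    ["india", "election", "result", "delhi"],
    ["india", "cricket", "match", "result", "result"],
    ["weather", "forecast", "rain"],
    ["india", "election", "commission", "election"],
]
query = ["india", "election"]
expanded = expand_query(query, docs)
print(expanded)
```

In a real Hindi-English setting the query would first be translated and the documents tokenized and stemmed; those steps are omitted here, and a proper TSV computation would replace the frequency-based selection.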