Context-aware Urdu Information Retrieval System


Detailed Description

Bibliographic Details
Published in:	ACM Transactions on Asian and Low-Resource Language Information Processing, 2023-04, Vol. 22 (3), p. 1-19, Article 70
Main authors:	Shoaib, Umar; Fiaz, Laiba; Chakraborty, Chinmay; Rauf, Hafiz Tayyab
Format:	Article
Language:	English
Subjects:	Information extraction; Information systems
Online access:	Full text

Description
Abstract:	The World Wide Web (WWW) plays a vital role in sharing dynamic knowledge in every field of life. Information on the web comprises a huge amount of data in different forms: structured, semi-structured, and in some cases entirely unstructured. Because of this sheer volume, searching large textual data for a specific topic or retrieving precise information is a challenging task. All this leads to the problem of word sense ambiguity (WSA). An Urdu-language information retrieval system that uses different techniques related to Web Semantic Search Engine architecture is proposed to retrieve relevant information efficiently and to solve the problem of WSA. For single-word queries, the proposed system achieves an average precision ratio of 96%, compared with average precision ratios of 74% and 75%, the latter reported for Google. For long text queries, the system reaches 92% accuracy and outperforms well-known existing search engines such as Bing and Google, which reach 16.50% and 16% accuracy, respectively. Similarly, for single-word queries the recall ratio of the proposed system is 32.25%, compared with 25% each for Bing and Google. The recall ratio for long text queries improves as well: 6.38%, compared with 6.20% and 4.8% for Bing and Google, respectively. The results show that the proposed system gives better and more efficient results than existing systems for the Urdu language.
ISSN:	2375-4699, 2375-4702
DOI:	10.1145/3502854
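
Note on the reported metrics: the abstract compares systems by average precision ratio and recall ratio, but this record does not say how the authors computed them. The sketch below uses the standard set-based definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / relevant), averaged over queries. The function names, query labels, and document IDs are hypothetical and serve only as an illustration; they are not taken from the article.

```python
from statistics import mean


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: document IDs returned by the search system
    relevant:  document IDs judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_scores(runs):
    """Average precision and recall over (retrieved, relevant) pairs."""
    scores = [precision_recall(ret, rel) for ret, rel in runs]
    return mean(p for p, _ in scores), mean(r for _, r in scores)


# Hypothetical toy data: two queries with made-up document IDs.
toy_runs = [
    (["d1", "d2", "d3", "d4"], ["d1", "d2", "d3", "d9", "d10", "d11"]),
    (["d5", "d6"], ["d5", "d6", "d7", "d8"]),
]

avg_precision, avg_recall = average_scores(toy_runs)
print(f"average precision: {avg_precision:.3f}")
print(f"average recall:    {avg_recall:.3f}")
```

Under these definitions a system can score high on precision and low on recall at the same time, which matches the pattern in the abstract: most returned results are relevant, but only a fraction of all relevant documents is returned.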