An augmented semantic search tool for multilingual news analytics

News feeds generate colossal amount of data consisting of important information hidden in the intricacies. State of the art methods are still at infancy in providing a very generic and publicly available solution to skim through the important information in the news from various sources and an abili...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of intelligent & fuzzy systems 2022-01, Vol.43 (6), p.8315-8327
Hauptverfasser: Harikumar, Sandhya, Sathyajit, Rohit, Karumudi, Gnana Venkata Naga Sai Kalyan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:News feeds generate colossal amount of data consisting of important information hidden in the intricacies. State of the art methods are still at infancy in providing a very generic and publicly available solution to skim through the important information in the news from various sources and an ability to search using specific keywords in different languages. This paper focuses on designing a tool to extract semantic details from news articles published through various internet sources in various languages. The semantic information is stored within DBMS for ease of organizing and retrieving the data. Further, a querying facility to search through entire articles based on the keyword or date-based search is also proposed to view the crisp content. The news articles in English, and two Indian languages - Hindi and Malayalam are considered for experimentation. The proposed strategy consists of two main components namely, Generative model creation and Query engine. Generative model aims to extract important entities and keywords along with their relevance to the article and other similar articles using Latent Dirichlet Allocation(LDA) and Named Entity Recognition(NER). Query engine is to facilitate on the fly retrieval of semantic content from the database, based on user keyword. The search engine, along with database indexing, reduces the access time to the database thereby retrieving the information in less time. Experimental results show that the proposed method is effective in terms of quality of information and time consumed for information retrieval.
ISSN:1064-1246
1875-8967
DOI:10.3233/JIFS-221184