Deep learning-based information retrieval with normalized dominant feature subset and weighted vector model

Bibliographic Details
Published in: PeerJ Computer Science, 2024-01, Vol. 10, p. e1805, Article e1805
Authors: Eswaraiah, Poluru; Syed, Hussain
Format: Article
Language: English
Online access: Full text
Description
Abstract: Multimedia data, which includes textual information, is employed in a variety of practical computer vision applications. More than a million new records are added to social media and news sites every day, and the text content they contain has become increasingly complex. Finding a meaningful text record in an archive can be challenging for computer vision researchers. Most image searches still rely on established language-based techniques built on query text and metadata. Substantial work has been done over the past two decades on content-based text retrieval and analysis, yet it still has limitations. The importance of feature extraction in search engines is often overlooked. Web and product search engines, recommendation systems, and question-answering tasks frequently leverage these features. Extracting high-quality machine learning features from large text volumes is a challenge for many open-source software packages. Creating an effective feature set manually is a time-consuming process, whereas deep learning learns new feature representations directly from the training data. As a novel feature extraction method, deep learning has made great strides in text mining. Automatically training a deep learning model on the most pertinent text attributes requires massive datasets with millions of variables. In this research, a Normalized Dominant Feature Subset with Weighted Vector Model (NDFS-WVM) is proposed for feature extraction and selection in information retrieval from big data using natural language processing models. The suggested model outperforms conventional models in text retrieval, achieving 98.6% accuracy in information retrieval.
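
The abstract does not give implementation details of NDFS-WVM. Purely as an illustration of the general idea it names (a normalized, weighted vector representation of text combined with a reduced, dominant feature subset used for retrieval), the following Python sketch ranks documents with TF-IDF weights and cosine similarity. The toy corpus, the top_k_features cutoff, and the weight-based selection heuristic are assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch only: a generic normalized, weighted vector model for text
# retrieval (L2-normalized TF-IDF features with a simple "dominant feature" cut
# that keeps the highest-weight terms). This is NOT the paper's NDFS-WVM
# implementation; top_k_features and the selection rule are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "deep learning improves text feature extraction for retrieval",
    "social media adds millions of new text records every day",
    "query text and metadata drive most image search engines",
]

# Weighted vector model: TF-IDF term weights, L2-normalized document vectors.
vectorizer = TfidfVectorizer(norm="l2", stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)

# Simple dominant-feature selection: keep the terms with the largest aggregate
# weight across the corpus (a stand-in for a normalized dominant feature subset).
top_k_features = 20
weights = np.asarray(doc_vectors.sum(axis=0)).ravel()
dominant = np.argsort(weights)[::-1][:top_k_features]
reduced_docs = doc_vectors[:, dominant]

def retrieve(query, k=2):
    """Rank documents by cosine similarity in the reduced feature space."""
    query_vec = vectorizer.transform([query])[:, dominant]
    scores = cosine_similarity(query_vec, reduced_docs).ravel()
    ranked = np.argsort(scores)[::-1][:k]
    return [(documents[i], float(scores[i])) for i in ranked]

print(retrieve("deep learning for text retrieval"))
```

In a sketch like this, the feature-selection step trades a small amount of recall for a much smaller index, which is the usual motivation for selecting a dominant subset before retrieval over large collections.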
ISSN: 2376-5992
DOI: 10.7717/peerj-cs.1805