A fine-tuning deep learning with multi-objective-based feature selection approach for the classification of text

Bibliographic details
Published in: Neural computing & applications 2024-03, Vol. 36 (7), p. 3525-3553
Main authors: Dhal, Pradip; Azad, Chandrashekhar
Format: Article
Language: English (eng)
Description
Summary: Document classification is becoming increasingly essential given the vast number of documents available in digital libraries, emails, the Internet, etc. Textual records are high-dimensional and frequently contain non-discriminative (noisy and irrelevant) terms, resulting in higher computing costs and poorer learning performance in Text Classification (TC). Feature selection (FS), which tries to discover discriminative terms or features in the textual data, is one of the most effective ways to address this issue. This paper introduces a novel multi-stage term-weighting scheme-based FS model designed for single-label TC systems to obtain the optimal set of features. We have also developed a hybrid deep learning fine-tuning network based on Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Network (CNN) for the classification stage. The FS approach works in two stages. A filter model is used in the first stage, and a multi-objective wrapper model, an upgraded version of the Whale Optimization Algorithm (WOA) with Particle Swarm Optimization (PSO), is used in the second stage. The objective function of this wrapper model is based on a tri-objective principle and uses the Pareto front technique to discover the optimal set of features. The wrapper model also introduces a novel selection strategy that chooses the whale deliberately instead of using a random whale. The proposed work is evaluated on four popular benchmark text corpora, of which two are binary-class and two are multi-class. The suggested FS technique is compared against classic Machine Learning (ML) and deep learning classifiers. The experimental results show that the proposed FS technique achieves better results than the approaches it is compared against.
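The Pareto front step mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say which three objectives are used or how candidates are encoded, so the sketch assumes each candidate feature subset is already scored by a triple of objective values that are all to be minimized. The names `dominates`, `pareto_front`, and `candidates`, and all numeric values, are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Tuple[float, float, float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical objective triples for three candidate feature subsets.
# The meaning of each objective is an assumption, not taken from the paper.
candidates = [
    (0.10, 0.50, 0.30),
    (0.12, 0.40, 0.30),
    (0.15, 0.60, 0.35),  # worse than the first candidate on all three
]

front = pareto_front(candidates)
print(front)
```

The first two candidates trade off against each other (one is better on the first objective, the other on the second), so both stay on the front, while the third is dominated. A wrapper search such as the one described would typically keep this non-dominated set and pick a final feature subset from it.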
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-023-09225-1