Bi-Directional Feature Fixation-Based Particle Swarm Optimization for Large-Scale Feature Selection

Bibliographic details
Published in: IEEE Transactions on Big Data, 2023-06, Vol. 9 (3), pp. 1004-1017
Main authors: Yang, Jia-Quan; Yang, Qi-Te; Du, Ke-Jing; Chen, Chun-Hua; Wang, Hua; Jeon, Sang-Woon; Zhang, Jun; Zhan, Zhi-Hui
Format: Article
Language: English
Description
Abstract: Feature selection, which aims to improve the classification accuracy and reduce the size of the selected feature subset, is an important but challenging optimization problem in data mining. Particle swarm optimization (PSO) has shown promising performance in tackling feature selection problems, but still faces challenges in dealing with large-scale feature selection in Big Data environments because of the large search space. Hence, this article proposes a bi-directional feature fixation (BDFF) framework for PSO and provides a novel idea to reduce the search space in large-scale feature selection. BDFF uses two opposite search directions to guide particles to adequately search for feature subsets with different sizes. Based on the two different search directions, BDFF can fix the selection states of some features and then focus on the others when updating particles, thus narrowing the large search space. Besides, a self-adaptive strategy is designed to help the swarm concentrate on a more promising direction for search in different stages of evolution and achieve a balance between exploration and exploitation. Experimental results on 12 widely used public datasets show that BDFF can improve the performance of PSO on large-scale feature selection and obtain smaller feature subsets with higher classification accuracy.
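The search-space reduction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the dictionary of fixed states, and the uniform selection probabilities are all hypothetical, and this record does not say how BDFF chooses which features to fix in each search direction. The sketch only shows the general effect of freezing some selection states and resampling the rest.

```python
import random


def update_position(position, select_prob, fixed, rng):
    """Resample only the free bits of a binary feature mask.

    position    : list of 0/1 flags, one per feature
    select_prob : probability of selecting each feature (assumed given)
    fixed       : dict {feature_index: 0 or 1} of frozen selection states
    rng         : random.Random instance
    """
    new_position = []
    for i in range(len(position)):
        if i in fixed:
            # Fixed features keep their frozen state and are not searched.
            new_position.append(fixed[i])
        else:
            # Free features are resampled from their selection probability.
            new_position.append(1 if rng.random() < select_prob[i] else 0)
    return new_position


def free_search_space_size(n_features, fixed):
    """Number of candidate subsets left once some states are fixed."""
    return 2 ** (n_features - len(fixed))


rng = random.Random(0)
position = [1, 0, 1, 0, 1, 0]
select_prob = [0.5] * 6      # hypothetical, uniform for illustration
fixed = {0: 1, 5: 0}         # feature 0 forced in, feature 5 forced out
new_position = update_position(position, select_prob, fixed, rng)
```

With 6 features and 2 of them fixed, the number of candidate subsets drops from 2^6 = 64 to 2^4 = 16, which is the kind of narrowing the abstract attributes to feature fixation.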
ISSN: 2332-7790, 2372-2096
DOI: 10.1109/TBDATA.2022.3232761