A Constrained Competitive Swarm Optimizer With an SVM-Based Surrogate Model for Feature Selection


Bibliographic details
Published in: IEEE Transactions on Evolutionary Computation, 2024-02, Vol. 28 (1), p. 2-16
Main authors: Nguyen, Bach Hoai; Xue, Bing; Zhang, Mengjie
Format: Article
Language: English
Description
Summary: Feature selection (FS) is an important data preprocessing technique that selects a small subset of relevant features to improve learning performance. However, it is also challenging due to its large search space. Recently, a competitive swarm optimizer (CSO) has shown promising results in FS because of its potential global search ability. The main idea of CSO is to select two solutions randomly and then let the loser (worse fitness) learn from the winner (better fitness). Although such a search mechanism provides a high population diversity, it is at risk of generating unqualified solutions since the winner's quality is not guaranteed. In this work, we propose a constrained evolutionary mechanism for CSO, which verifies the quality of all the particles and lets the infeasible (unqualified) solutions learn from the feasible (qualified) ones. We also propose a novel local search and a size-change operator that guide the population to search for smaller feature subsets with similar or better classification performance. A surrogate model, based on support vector machines, is proposed to assist both local search and the size-change operator to explore a massive number of potential feature subsets without requiring excessive computational resources. Results on 24 real-world datasets show that the proposed algorithm can select smaller feature subsets with higher classification performance than state-of-the-art evolutionary computation (EC) and non-EC benchmark algorithms.
ISSN: 1089-778X, 1941-0026
DOI: 10.1109/TEVC.2022.3197427
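
The summary describes the basic CSO idea: pair two solutions at random and let the loser learn from the winner. A minimal sketch of one generation of that generic pairwise competition follows. It is not the constrained variant the paper proposes, and it omits the local search, the size-change operator, and the SVM surrogate. The names `cso_step`, `decode`, and `toy_fitness`, the 0.5 selection threshold, the `phi` default, the clamping to [0, 1], and the toy fitness itself are all illustrative assumptions rather than details taken from the paper.

```python
import random


def decode(position, threshold=0.5):
    """Map a continuous position to selected feature indices (assumed encoding)."""
    return [d for d, x in enumerate(position) if x > threshold]


def toy_fitness(position, relevant=(0, 2)):
    """Hypothetical fitness to minimise: penalise missed relevant features,
    then prefer smaller subsets. A real run would use a classifier's error."""
    selected = decode(position)
    missed = sum(1 for r in relevant if r not in selected)
    return 10 * missed + len(selected)


def cso_step(positions, velocities, fitness, phi=0.1, rng=random):
    """One generation of basic CSO (minimisation); updates the lists in place."""
    n = len(positions)
    dim = len(positions[0])
    # Swarm mean position, used as a weak social attractor scaled by phi.
    mean = [sum(p[d] for p in positions) / n for d in range(dim)]
    order = list(range(n))
    rng.shuffle(order)
    # Random pairing; with an odd swarm size the last particle sits out.
    for a, b in zip(order[0::2], order[1::2]):
        if fitness(positions[a]) <= fitness(positions[b]):
            winner, loser = a, b
        else:
            winner, loser = b, a
        # The winner is left untouched; only the loser moves.
        for d in range(dim):
            r1, r2, r3 = rng.random(), rng.random(), rng.random()
            velocities[loser][d] = (
                r1 * velocities[loser][d]
                + r2 * (positions[winner][d] - positions[loser][d])
                + phi * r3 * (mean[d] - positions[loser][d])
            )
            moved = positions[loser][d] + velocities[loser][d]
            # Clamp to [0, 1] so the threshold decoding stays meaningful.
            positions[loser][d] = min(1.0, max(0.0, moved))
```

Because winners pass through unchanged, at least half of the swarm is preserved each generation and the best fitness in the swarm never gets worse. That property also shows the risk the summary points out: a winner only has to beat one random opponent, so a poor solution can still act as a teacher.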