Tri-staged feature selection in multi-class heterogeneous datasets using memetic algorithm and cuckoo search optimization
Published in: | Expert systems with applications 2022-12, Vol.209, p.118286, Article 118286 |
---|---|
Format: | Article |
Language: | eng |
Summary: | • Proposes Tri-Staged Feature Selection (TFS) for multi-class heterogeneous datasets.
• Initial features are selected using the Kruskal-Wallis test.
• The selected features are refined using a Memetic Algorithm with local beam search.
• The final feature set is refined using the Cuckoo Search algorithm for better classification.
• Experiments on 12 real datasets validate the proposed method.
Classification algorithms and their preprocessing operations usually perform feature selection separately for homogeneous or heterogeneous attributes and for binary or multi-class labels. Very few methods attempt feature selection on multi-class datasets with heterogeneous attributes. To bridge this gap and improve classification performance, the paper proposes a Tri-staged Feature Selection (TFS) methodology that performs (i) feature selection using the Kruskal-Wallis test, (ii) refinement of the selected features using a new Memetic Algorithm that combines local beam search with genetic algorithm operations, and (iii) further refinement of the selected features using the Cuckoo Search algorithm. The proposed method maintains a proper trade-off between exploration and exploitation. Experimental results on 12 datasets show that, for multi-class heterogeneous datasets, the proposed method outperforms state-of-the-art feature selection methods in terms of multi-class accuracy, Hamming loss, ranking loss, normalized coverage and convergence rate. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2022.118286 |
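
The first stage named in the summary, filtering features with the Kruskal-Wallis test, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`average_ranks`, `kruskal_h`, `select_top_k`) are hypothetical, and keeping the `k` features with the largest H statistic is an assumed selection rule, since the record does not state the criterion used in the paper (a p-value threshold would be an equally plausible choice). The sketch uses only the Python standard library and assumes numeric feature values.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_h(values, labels):
    """Tie-corrected Kruskal-Wallis H statistic of one feature across classes."""
    n = len(values)
    ranks = average_ranks(values)
    rank_sums = defaultdict(float)
    counts = defaultdict(int)
    for rank, label in zip(ranks, labels):
        rank_sums[label] += rank
        counts[label] += 1
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[c] ** 2 / counts[c] for c in counts
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_sizes = defaultdict(int)
    for v in values:
        tie_sizes[v] += 1
    correction = 1 - sum(t ** 3 - t for t in tie_sizes.values()) / (n ** 3 - n)
    # A constant feature gives correction == 0; score it as uninformative.
    return h / correction if correction > 0 else 0.0


def select_top_k(X, y, k):
    """Indices of the k features with the largest H (assumed selection rule)."""
    n_features = len(X[0])
    scores = [kruskal_h([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is constant,
    # feature 2 is weakly related to the class label.
    X = [[1, 5, 3], [2, 5, 1], [3, 5, 2], [4, 5, 3], [5, 5, 1], [6, 5, 2]]
    y = [0, 0, 1, 1, 2, 2]
    print(select_top_k(X, y, 2))
```

A larger H means the feature's rank distribution differs more between classes. Because the test is rank-based, it makes no normality assumption about the feature values, which is one reason it suits a first-pass filter over mixed attributes; the later memetic and Cuckoo Search stages described in the summary are not sketched here.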