An effective genetic algorithm-based feature selection method for intrusion detection systems

Bibliographic details
Published in: Computers & security 2021-11, Vol. 110, p. 102448, Article 102448
Main authors: Halim, Zahid, Yousaf, Muhammad Nadeem, Waqas, Muhammad, Sulaiman, Muhammad, Abbas, Ghulam, Hussain, Masroor, Ahmad, Iftekhar, Hanif, Muhammad
Format: Article
Language: English
Description
Summary: The availability of suitable, validated data is a key issue for implementing machine learning methods in many domains, and high data dimensionality degrades a learning algorithm's performance. This work aims to design a method that preserves most of the unique information in the data with a minimum number of features. Addressing the feature selection problem in the domain of network security and intrusion detection, this work contributes an enhanced Genetic Algorithm (GA)-based feature selection method, named GA-based Feature Selection (GbFS), to increase classifiers' accuracy. Securing a network against cyber-attacks is a critical task that needs strengthening. Machine learning, owing to its proven results, is widely used in developing firewalls and Intrusion Detection Systems (IDSs) to identify new kinds of attacks. Using machine learning algorithms, an IDS can detect an intruder by analyzing the network traffic passing through it. This work presents parameter tuning for the GA-based feature selection along with a novel fitness function. The method is tested on three benchmark network traffic datasets, namely CIRA-CIC-DOHBrw-2020, UNSW-NB15, and Bot-IoT, and is compared with standard feature selection methods. Results show that GbFS improves classification accuracy, reaching a maximum of 99.80%.
ISSN: 0167-4048, 1872-6208
DOI: 10.1016/j.cose.2021.102448
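
The summary describes GA-based feature selection driven by a fitness function, but this record does not give the details of GbFS or its novel fitness. The sketch below is therefore only a generic illustration of the idea, not the authors' method. It assumes a bit-mask encoding (one bit per feature), tournament selection, one-point crossover, bit-flip mutation, and a hypothetical fitness equal to held-out accuracy of a simple nearest-centroid classifier minus a penalty on the fraction of features kept. The synthetic data and all names (`make_data`, `centroid_accuracy`, `ga_select`, `run_demo`) and parameter values are assumptions made for the example.

```python
import random


def make_data(rng, n=200, n_features=12, informative=(0, 3, 7)):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only the masked columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, lab in zip(X_test, y_test):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == lab
    return correct / len(y_test)


def ga_select(fitness, n_features, rng, pop_size=20, generations=30,
              cx_rate=0.8, mut_rate=0.05):
    """Generic GA over bit masks; returns the best mask and its fitness history."""
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]

        def tournament():
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        next_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < cx_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # bit-flip mutation
            child = [1 - bit if rng.random() < mut_rate else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history


def run_demo(seed=0, n_features=12, penalty=0.1):
    """Run the sketch on synthetic data with a fixed train/test split."""
    rng = random.Random(seed)
    X, y = make_data(rng, n_features=n_features)
    X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

    def fitness(mask):
        # Hypothetical fitness: accuracy minus a penalty on the share of features kept.
        acc = centroid_accuracy(X_train, y_train, X_test, y_test, mask)
        return acc - penalty * sum(mask) / n_features

    return ga_select(fitness, n_features, rng)


if __name__ == "__main__":
    best, history = run_demo()
    print("selected feature indices:", [j for j, bit in enumerate(best) if bit])
    print("best fitness:", history[-1])
```

The penalty term is one simple way to express the stated goal of keeping the number of features small. Because the best mask is carried into each new generation, the best fitness in `history` never decreases. A real evaluation would replace the synthetic data and the toy classifier with an actual traffic dataset and the classifiers under study.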