An analysis of “A feature reduced intrusion detection system using ANN classifier” by Akashdeep et al. expert systems with applications (2017)

Bibliographic details
Published in:	Expert Systems with Applications, 2019-09, Vol. 130, p. 79-83
Main authors:	Chandak, Trupti; Shukla, Sanyam; Wadhvani, Rajesh
Format:	Article
Language:	English
Subjects:	Algorithms; Bayesian analysis; Classification; Classifiers; Datasets; Expert systems; Feature selection; Intrusion Detection System (IDS); Intrusion detection systems; Oversampling; Performance evaluation; Training; Training-Test dataset composition

Description
Summary:
• Test dataset can never be modified.
• Optimal training dataset composition for detection of minority classes.
• Feature selection without under-sampling performs better.
• Under-sampling Normal class instances is more fruitful than oversampling the U2R and R2L categories of attacks.
• The Naïve Bayes classifier has a good detection rate for the U2R and Probe categories of attacks and can be used for them in a multistage classifier.

This paper analyses the recently published article “A feature reduced intrusion detection system using ANN classifier” by Akashdeep, Ishfaq Manzoor and Neeraj Kumar (Expert Systems with Applications, 2017), which has a limitation in its experimental setup. Akashdeep et al. crafted the test dataset to attain improved accuracy: they used 5 fractional test datasets for performance evaluation. The reduced list of features obtained in their work does not give the asserted performance on the original test dataset. Table 18 of that article compares their work with existing works, which is not appropriate because these works use different test dataset compositions. Another issue with the work of Akashdeep et al. is the use of a partial training dataset to determine the reduced list of features. Their work reduces the training dataset by random undersampling of the majority class instances and random replication of the minority class instances. The reduced list of features obtained by Akashdeep et al. comprises 25 features. This work applies the feature selection algorithm proposed by Akashdeep et al. to the original training dataset, yielding a feature subset of 29 features. Experiments show that the latter subset (29 features) outperforms the former (25 features).

This work uses the classification algorithms C4.5, Naive Bayes and Random Forest to compare the performance of these reduced feature sets. Oversampling one class may deteriorate the performance of another class. This work also evaluates random undersampling/oversampling of a specific class to design an optimal training dataset. The results show that classification models developed using this training dataset have a better detection rate for the minority classes.
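
The random undersampling and oversampling of a specific class mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `resample_class`, the toy rows, the class sizes and the target counts are all hypothetical, and only the class labels "normal" and "u2r" are borrowed from the summary. The sketch resamples the training data only, in keeping with the point that the test dataset is never modified.

```python
import random
from collections import Counter


def resample_class(rows, labels, target_label, target_count, seed=0):
    """Randomly under- or oversample ONE class to target_count instances.

    All other classes are left untouched. Hypothetical helper meant for
    training data only; the test set should never pass through it.
    """
    rng = random.Random(seed)
    chosen = [i for i, y in enumerate(labels) if y == target_label]
    others = [i for i, y in enumerate(labels) if y != target_label]
    if not chosen:
        raise ValueError(f"no instances of class {target_label!r}")

    if target_count <= len(chosen):
        # Undersampling: draw a random subset without replacement.
        picked = rng.sample(chosen, target_count)
    else:
        # Oversampling: keep every original instance, then add random
        # replicas (drawn with replacement) until the target is reached.
        picked = chosen + rng.choices(chosen, k=target_count - len(chosen))

    keep = sorted(others + picked)
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy training set (made-up sizes): 8 "normal" rows and 2 "u2r" rows.
rows = list(range(10))
labels = ["normal"] * 8 + ["u2r"] * 2

# Undersample the majority class only.
under_rows, under_labels = resample_class(rows, labels, "normal", 4)
# Oversample the minority class only.
over_rows, over_labels = resample_class(rows, labels, "u2r", 5)

print(Counter(under_labels))
print(Counter(over_labels))
```

Resampling one class at a time makes it possible to measure how a change in one class's share of the training set affects the detection rate of the others, which is the effect the summary warns about.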
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2019.04.017