Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Cybersecurity has emerged as a critical global concern. Intrusion Detection
Systems (IDS) play a critical role in protecting interconnected networks by
detecting malicious actors and activities. Machine Learning (ML)-based behavior
analysis within the IDS has considerable potential for detecting dynamic cyber
threats, identifying anomalies, and flagging malicious conduct within
the network. However, as the volume of data grows, dimension reduction becomes
an increasingly difficult task when training ML models. To address this, our
paper introduces a novel ML-based network intrusion detection model designed
specifically for large and imbalanced datasets. The model uses Random
Oversampling (RO) to address data imbalance, Stacking Feature Embedding based
on clustering results, and Principal Component Analysis (PCA) for dimension
reduction. Its performance is evaluated using
three cutting-edge benchmark datasets: UNSW-NB15, CIC-IDS-2017, and
CIC-IDS-2018. On the UNSW-NB15 dataset, our trials show that the Random Forest
(RF) and Extra Trees (ET) models achieve accuracy rates of 99.59% and 99.95%,
respectively. Furthermore,
using the CIC-IDS-2017 dataset, Decision Tree (DT), RF, and ET models reach
99.99% accuracy, while DT and RF models obtain 99.94% accuracy on CIC-IDS-2018.
These results consistently outperform the state of the art, indicating significant
progress in the field of network intrusion detection. This achievement
demonstrates the efficacy of the proposed methodology, which can be used in
practice to monitor network traffic and identify intrusions accurately,
thereby helping to block possible threats. |
---|---|
DOI: | 10.48550/arxiv.2401.12262 |
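
The Random Oversampling (RO) step named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_oversample`, the toy feature values, and the fixed seed are all assumptions made for illustration. It shows only the general idea of duplicating randomly chosen minority-class rows until every class matches the majority count. The Stacking Feature Embedding and PCA steps are not covered here.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen minority-class rows
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap to the majority count.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made-up values): 6 "normal" flows and 2 "attack" flows.
X = [[0.1, 1.0], [0.2, 0.9], [0.1, 1.1], [0.3, 1.0], [0.2, 1.2], [0.1, 0.8],
     [5.0, 9.0], [5.2, 8.7]]
y = ["normal"] * 6 + ["attack"] * 2

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

In common practice, oversampling of this kind is applied to the training split only, so that duplicated rows do not leak into the evaluation data. Whether the paper follows that convention is not stated in the summary.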