A comprehensive comparison study of ML models for multistage APT detection: focus on data preprocessing and resampling
Published in: | The Journal of supercomputing 2024, Vol.80 (10), p.14143-14179 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Advanced persistent threats (APTs) present a significant cybersecurity challenge, necessitating innovative detection methods. This study stands out by integrating advanced data preparation with strategies for handling data imbalances, tailored for the SCVIC-APT-2021 dataset. We employ a mix of resampling, cost-sensitive learning, and ensemble methods, alongside machine learning and deep learning models like XGBoost, LightGBM, and ANNs, to enhance APT detection. Our strategy, which draws from the MITRE ATT&CK framework, concentrates on each stage of APT attacks, which significantly increases detection accuracy. Notably, we achieved a Macro F1-score of 95.20% with XGBoost and 96.67% with LightGBM, and significant enhancements in the area under the precision–recall curve for both. Our study’s exploration of the SCVIC-APT-2021 dataset marks a progressive step in APT detection research, with vital implications for future cybersecurity developments. |
---|---|
ISSN: | 0920-8542 1573-0484 |
DOI: | 10.1007/s11227-024-06010-2 |