Ransomware detection based on machine learning using memory features

Bibliographic details
Published in: Egyptian informatics journal 2024-03, Vol.25, p.100445, Article 100445
Main authors: Aljabri, Malak, Alhaidari, Fahd, Albuainain, Aminah, Alrashidi, Samiyah, Alansari, Jana, Alqahtani, Wasmiyah, Alshaya, Jana
Format: Article
Language: English
Description
Summary: Ransomware attacks have escalated recently and are affecting essential infrastructure and enterprises across the globe. Ransomware uses sophisticated encryption techniques to encrypt important files on the targeted machine and then demands payment to decrypt the data. Artificial intelligence techniques, including machine learning, have been increasingly applied in cybersecurity and have contributed greatly to detecting and preventing different kinds of attacks. However, studies that apply machine learning to ransomware detection remain limited, hindered by malware obfuscation, the difficulty of setting up a proper analysis environment, model accuracy, and high false-positive rates. Thus, it is crucial to develop effective ransomware detection based on machine learning techniques. This study aims to build a robust machine learning model that uses memory dumps to recognize unknown samples and detect ransomware with high accuracy and minimal false positives, and it provides an extensive analysis of how memory traces can assist in ransomware detection. To this end, a new dataset was built from recent ransomware group attack samples such as Revil, Lockbit, and BlackCat, together with a number of benign samples, including office applications, Windows applications, and compression applications. All samples were dynamically analyzed within an enhanced Cuckoo sandbox to ensure the most reliable results. A set of machine learning models was then developed, and a comparative performance analysis was conducted. Among the models evaluated, XGBoost performed best, using only 47 of the 58 features. It achieved 97.85% accuracy with a 2% false-positive rate.
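The accuracy and false-positive rate quoted in the summary are standard binary-classification metrics. The following is a minimal sketch of how they are computed from predictions, not the authors' evaluation code. The function name `accuracy_and_fpr`, the label convention (1 = ransomware, 0 = benign), and the toy labels are all assumptions made for illustration and are not taken from the article.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false_positive_rate) for binary labels.

    Assumed convention (hypothetical, for illustration only):
    1 = ransomware (positive class), 0 = benign (negative class).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ransomware flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ransomware missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0

    # FPR is measured over benign samples only: FP / (FP + TN).
    negatives = fp + tn
    fpr = fp / negatives if negatives else 0.0
    return accuracy, fpr


# Toy labels invented for this example; they are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.4f}, false positive rate={fpr:.4f}")
```

Because the false-positive rate is normalized by the number of benign samples rather than by the whole dataset, a model can show high accuracy and still flag a noticeable share of benign applications, which is why the two figures are reported together.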
ISSN: 1110-8665, 2090-4754
DOI: 10.1016/j.eij.2024.100445