Machine Learning Approaches to Malicious PowerShell Scripts Detection and Feature Combination Analysis

With advances in communication technology, modern society relies more than ever on the Internet and various user-friendly digital tools. It provides access to and enables the manipulation of files, trips, and the Windows API. Attackers frequently use various obfuscation techniques PowerShell scripts...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Wangji Wanglu Jishu Xuekan = Journal of Internet Technology 2024-01, Vol.25 (1), p.167-173
Hauptverfasser: Hung, Hsiang-Hua, Chen, Jiann-Liang, Ma, Yi-Wei
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:With advances in communication technology, modern society relies more than ever on the Internet and various user-friendly digital tools. It provides access to and enables the manipulation of files, trips, and the Windows API. Attackers frequently use various obfuscation techniques PowerShell scripts to avoid detection by anti-virus software. Their doing so can significantly reduce the readability of the script. This work statically analyzes PowerShell scripts. Thirty-three features that were based on the script's keywords, format, and string combinations were used herein to determine the behavioral intent of the script. Ones are characteristic-based features that are obtained by calculation; the others are behavior-based features that determine the execution function of behavior using keywords and instructions. Behavior-based features can be divided into positive behavior-based features, neutral behavior-based features, and negative behavior-based features. These three types of features are enhanced by observing samples and adding keywords. The other type of characteristic-based feature is introduced into the formula from other studies in this work. The XGBoost model was used to evaluate the importance of the features that are proposed in this study and to identify the combination of features that contributed most to the detection of PowerShell scripts. The final model with the combined features is found to exhibit the best performance. The model has 99.27% accuracy when applied to the validation dataset. The results clearly indicate that the proposed malicious PowerShell script detection model outperforms previously developed models.
ISSN:1607-9264
1607-9264
2079-4029
DOI:10.53106/160792642024012501014