Malware Control Flow Graphs Dataset

Bibliographic details
Main authors: Liyanage, Kaveen; Pearsall, Reese; Izurieta, Clemente; Whitaker, Bradley
Format: Dataset
Language: English
Description
Summary: Detecting malware in binary files is an important task in cybersecurity and machine learning research. This dataset was curated to assess whether unsupervised clustering techniques can distinguish graph representations of benign software from those of malware. The dataset contains control flow graphs (CFGs) of benign and malicious programs: benign operating system files as well as malware provided by Hoplite Industries (https://www.hopliteindustries.com). Each binary file is first converted into a CFG representation using the CFGEmulated() function of the "angr" Python library. Because sharing malware binaries online poses a security risk, the binary files themselves are not shared; however, the hash values of the binaries are provided for reference. The dataset is permanently archived at https://doi.org/10.5281/zenodo.7630371
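
The conversion step described in the summary could look roughly like the sketch below. It is an illustration only, not the dataset authors' actual pipeline: the helper names `sha256_of_file` and `build_cfg` are hypothetical, SHA-256 is an assumed hash algorithm (the record does not say which one was used), and `auto_load_libs=False` is an assumed loader setting. Only `angr.Project`, the `CFGEmulated` analysis, and the resulting `graph` attribute come from the angr library itself.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    SHA-256 is an assumption; the record does not name the hash algorithm.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_cfg(path):
    """Recover a control flow graph with angr's CFGEmulated analysis.

    angr is imported lazily so the hashing helper works without it installed.
    Returns the CFG as a networkx.DiGraph.
    """
    import angr  # third-party dependency: pip install angr

    # Assumption: skip shared libraries so the graph covers only the binary itself
    project = angr.Project(str(path), auto_load_libs=False)
    cfg = project.analyses.CFGEmulated()
    return cfg.graph
```

Calling `build_cfg` on a binary yields a directed graph whose `number_of_nodes()` and `number_of_edges()` give a first impression of its size, while `sha256_of_file` produces a digest that could be compared against a published hash list, provided the same algorithm was used.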
DOI: 10.21227/31pa-7837