Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection

A well-known curse of computer security research is that it often produces systems that, while technically sound, fail operationally. To overcome this curse, the community generally seeks to assess proposed systems under a variety of settings in order to make explicit every potential bias. In this r...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Empirical software engineering : an international journal 2021-07, Vol.26 (4), Article 74
Hauptverfasser: Daoudi, Nadia, Allix, Kevin, Bissyandé, Tegawendé F., Klein, Jacques
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:A well-known curse of computer security research is that it often produces systems that, while technically sound, fail operationally. To overcome this curse, the community generally seeks to assess proposed systems under a variety of settings in order to make explicit every potential bias. In this respect, recently, research achievements on machine learning based malware detection are being considered for thorough evaluation by the community. Such an effort of comprehensive evaluation supposes first and foremost the possibility to perform an independent reproduction study in order to sharpen evaluations presented by approaches’ authors. The question Can published approaches actually be reproduced? thus becomes paramount despite the little interest such mundane and practical aspects seem to attract in the malware detection field. In this paper, we attempt a complete reproduction of five Android Malware Detectors from the literature and discuss to what extent they are “reproducible”. Notably, we provide insights on the implications around the guesswork that may be required to finalise a working implementation. Finally, we discuss how barriers to reproduction could be lifted, and how the malware detection field would benefit from stronger reproducibility standards—like many various fields already have.
ISSN:1382-3256
1573-7616
DOI:10.1007/s10664-021-09955-7