Better together: Comparing vulnerability prediction models

Bibliographic details
Published in: Information and Software Technology, 2020-03, Vol. 119, p. 106204, Article 106204
Main authors: Theisen, Christopher; Williams, Laurie
Format: Article
Language: English
Description
Summary: Vulnerability Prediction Models (VPMs) are an approach for prioritizing security inspection and testing to find and fix vulnerabilities. VPMs have been created based on a variety of metrics and approaches, yet they have not been widely adopted in practice. Knowing which VPMs predict well and which have low data requirements and resource usage would help practitioners match VPMs to their project's needs. The low density of vulnerabilities compared to defects is a further obstacle to practical VPMs. The goal of the paper is to help security practitioners and researchers choose appropriate features for vulnerability prediction through a comparison of Vulnerability Prediction Models. We performed replications of VPMs on Mozilla Firefox with 28,750 source code files featuring 271 vulnerabilities, using software metrics, text mining, and crash data. We then combined features from each VPM and reran our classifiers. We improved the F-score of the best VPM from 0.20 to 0.28 by combining features from three types of VPMs and using Naive Bayes as the classifier. The strongest features in the combined model were the number of times a file was involved in a crash, the number of outgoing calls from a file, and the string “nullptr”. Our results indicate that further work is needed to develop new features for input into classifiers. In addition, new analytic approaches are needed for VPMs to be useful in practical situations, due to the low density of vulnerabilities in software (less than 1% for our dataset).
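To make the reported F-scores easier to interpret, the following is a minimal sketch of how an F-score is computed from a confusion matrix and why a low vulnerability density depresses it. It is not the authors' evaluation code. The function name `f_score` is illustrative, the sketch assumes the F-score is the balanced F1 measure, and it assumes each of the 271 vulnerabilities corresponds to exactly one vulnerable file, which the summary does not state. The counts for the second classifier are invented purely for illustration.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, where precision
    and recall are zero or undefined.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


TOTAL_FILES = 28_750  # file count from the summary
POSITIVES = 271       # ASSUMPTION: one vulnerable file per vulnerability

# Trivial baseline: flag every file as vulnerable.
# Recall is perfect, but precision equals the base rate (under 1%).
flag_all = f_score(tp=POSITIVES, fp=TOTAL_FILES - POSITIVES, fn=0)

# HYPOTHETICAL classifier (made-up counts, not from the paper):
# finds 100 of the vulnerable files while raising 400 false alarms.
hypothetical = f_score(tp=100, fp=400, fn=POSITIVES - 100)

print(f"flag-everything baseline F1: {flag_all:.3f}")
print(f"hypothetical classifier F1:  {hypothetical:.3f}")
```

Under these assumptions, even a classifier that recovers a substantial share of the vulnerable files scores low, because false alarms drawn from the much larger pool of non-vulnerable files dominate precision.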
ISSN: 0950-5849, 1873-6025
DOI: 10.1016/j.infsof.2019.106204