A vulnerability severity prediction method based on bimodal data and multi-task learning


Bibliographic details
Published in:	The Journal of Systems and Software, 2024-07, Vol. 213, p. 112039, Article 112039
Main authors:	Du, Xiaozhi; Zhang, Shiming; Zhou, Yanrong; Du, Hongyuan
Format:	Article
Language:	English
Subjects:	Bimodal data; GraphCodeBert; Multi-task learning; Vulnerability severity prediction
Online access:	Full text

Description
Summary:
• A new vulnerability severity prediction method is proposed to improve the F1 score.
• GraphCodeBert is used to provide comprehensive information for prediction.
• Multi-task learning is used to enhance the generalization ability of our model.
• Our method outperforms the state-of-the-art method with an F1 score of 93.83%.

As the number of software vulnerabilities grows, automatic vulnerability analysis has become an important task in software security. However, existing severity prediction methods rely mainly on vulnerability descriptions and ignore the features of the vulnerable code. They therefore use only unimodal information, which results in low prediction accuracy. This paper proposes a vulnerability severity prediction method based on bimodal data and multi-task learning. First, the bimodal data, which consist of the description and the source code of each vulnerability, are preprocessed. Next, GraphCodeBert is used as the word embedding module to extract different vulnerability features from the bimodal data. Then a Bi-GRU with an attention mechanism is adopted for further extraction of severity-related features. Because vulnerability severity prediction and exploitability prediction are strongly correlated, the paper adopts a multi-task learning approach in which a hard parameter sharing strategy lets the model learn the connections and shared information between the two tasks, yielding more accurate and reliable severity predictions. Experimental results show that the proposed method outperforms state-of-the-art methods and achieves an average F1 score of 93.83% on the public vulnerability dataset.
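
The hard parameter sharing strategy mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the class `HardSharingModel`, the function `joint_loss`, the layer sizes, the four severity classes, the two exploitability classes, and the auxiliary loss weight of 0.5 are all assumptions chosen for illustration. The input vector stands in for features that, in the paper, would come from GraphCodeBert and the Bi-GRU with attention. Only the forward pass and the joint loss are shown, with no training loop.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class HardSharingModel:
    """One shared encoder feeding two task-specific heads."""

    def __init__(self, n_features, n_hidden, n_severity, n_exploit, seed=0):
        rng = random.Random(seed)
        # Shared parameters: both tasks read from, and would update, this layer.
        self.shared = init_layer(rng, n_hidden, n_features)
        # Task-specific parameters: one output head per task.
        self.severity_head = init_layer(rng, n_severity, n_hidden)
        self.exploit_head = init_layer(rng, n_exploit, n_hidden)

    def forward(self, features):
        hidden = [math.tanh(h) for h in linear(features, *self.shared)]
        severity = softmax(linear(hidden, *self.severity_head))
        exploit = softmax(linear(hidden, *self.exploit_head))
        return severity, exploit


def joint_loss(model, features, severity_label, exploit_label, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss (weight is assumed)."""
    severity, exploit = model.forward(features)
    return (cross_entropy(severity, severity_label)
            + aux_weight * cross_entropy(exploit, exploit_label))


# Placeholder feature vector; real inputs would be learned embeddings.
model = HardSharingModel(n_features=8, n_hidden=4, n_severity=4, n_exploit=2)
features = [0.1 * i for i in range(8)]
severity_probs, exploit_probs = model.forward(features)
loss = joint_loss(model, features, severity_label=2, exploit_label=1)
```

Because the gradient of the joint loss would flow into the shared layer from both heads, the exploitability task acts as a regularizer for the severity task, which is the usual motivation for hard parameter sharing.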
ISSN:	0164-1212 1873-1228
DOI:	10.1016/j.jss.2024.112039