Cross-site scripting detection with two-channel feature fusion embedded in self-attention mechanism

Bibliographic details
Published in:	Computers & security 2023-01, Vol.124, p.102990, Article 102990
Main authors:	Hu, Tianle; Xu, Chonghai; Zhang, Shenwen; Tao, Shuangshuang; Li, Luqun
Format:	Article
Language:	English
Subjects:	Bidirectional long-short term memory; Cross-site scripting; Feature fusion; Self-attention mechanism; Word2Vec

Description
Abstract:	In the era of big data, stealing users’ private data has become one of the main goals of network attackers, and in recent years cross-site scripting (XSS) has been one of the main web attack methods used to obtain such data. Traditional antivirus software cannot identify these attacks. To identify cross-site scripting attacks quickly and accurately, we propose a cross-site scripting detection model (C-BLA) with two-channel multi-scale feature fusion embedded in a self-attention mechanism. The model first preprocesses cross-site scripting payloads and maps them into vectors using Word2Vec. A two-channel network then performs feature extraction on the data. Channel I extracts local features of the payloads at different scales using parallel one-dimensional convolutional layers with different kernel sizes. Channel II extracts semantic information from the payloads in both the forward and backward directions using a bidirectional long short-term memory (BiLSTM) network, and then applies a self-attention mechanism to strengthen the semantic features. Experiments show that the proposed model achieves a precision of 99.8% and a recall of 99.1% for cross-site scripting detection, an improvement in detection rate over single deep learning models and traditional machine learning methods. The two-channel feature fusion in this model addresses the cross-site scripting detection problem more effectively.
ISSN:	0167-4048
DOI:	10.1016/j.cose.2022.102990
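
Illustrative sketch
The two-channel idea described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' C-BLA implementation: the weights are random rather than trained, the Word2Vec embeddings are replaced by random toy vectors, the BiLSTM in Channel II is omitted so that self-attention is applied directly to the embeddings, and all function names are hypothetical. It only shows how multi-scale convolution features and attention-weighted features might be concatenated into one vector.

```python
import math
import random


def conv1d_global_max(seq, kernel):
    """Slide one kernel (k x d weights) over seq (n x d), apply ReLU, global max pool."""
    k = len(kernel)
    best = 0.0  # ReLU floor: the pooled activation is never below zero
    for start in range(len(seq) - k + 1):
        s = sum(kernel[i][j] * seq[start + i][j]
                for i in range(k) for j in range(len(seq[0])))
        best = max(best, s)
    return best


def multi_scale_channel(seq, kernels):
    """Channel I (sketch): one pooled feature per kernel; kernels differ in width."""
    return [conv1d_global_max(seq, kernel) for kernel in kernels]


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(states):
    """Scaled dot-product self-attention with identity projections (Q = K = V)."""
    d = len(states[0])
    out = []
    for q in states:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, states)) for j in range(d)])
    return out


def attention_channel(states):
    """Channel II stand-in (assumption: no BiLSTM): attend, then mean-pool over time."""
    attended = self_attention(states)
    n = len(attended)
    return [sum(row[j] for row in attended) / n for j in range(len(attended[0]))]


def fuse(seq, kernels):
    """Feature fusion by concatenating the outputs of both channels."""
    return multi_scale_channel(seq, kernels) + attention_channel(seq)


rng = random.Random(0)
dim, length = 4, 6
# Toy stand-in for Word2Vec embeddings of a tokenized payload.
tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(length)]
# Three random kernels of widths 2, 3 and 4 (the "different scales").
kernels = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for width in (2, 3, 4)]
features = fuse(tokens, kernels)
print(len(features))  # 3 convolution features + 4 attention features
```

In a real detector, the fused vector would feed a trained classification layer, and the kernel sizes, embedding dimension, and pooling strategy would be chosen experimentally. None of those values are stated in this record.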