A study on the application of the T5 large language model in encrypted traffic classification

Bibliographic details
Published in:	Peer-to-peer networking and applications 2025-02, Vol.18 (1), p.1-13
Main authors:	Luo, Jian; Chen, Zechao; Chen, Wenxiong; Lu, Huali; Lyu, Feng
Format:	Article
Language:	English (eng)
Subjects:	Access control; Accuracy; Classification; Communication; Communications Engineering; Computer Communication Networks; Effectiveness; Engineering; Information Systems and Communication Service; Labeling; Language; Large language models; Machine learning; Methods; Network security; Networks; Peer to peer computing; Privacy; Security management; Signal, Image and Speech Processing; Virtual private networks
Online access:	Full text

Description
Summary:	In the era of mobile Internet, the widespread use of VPNs increases the demand for data security and privacy but also poses challenges for ISPs in terms of quality of service and traffic monitoring. This paper focuses on how to classify encrypted traffic accurately. Traditional methods usually require manual labeling of features, which is costly and yields unstable accuracy. Because of the special characteristics of encrypted traffic, traditional labeling methods do not adapt well, so new solutions are urgently needed. This paper adopts a generative learning method based on large-scale language models, which fuses encrypted traffic features into the T5 language model. The fine-tuned T5 model performs transfer learning with a small amount of data and achieves good classification accuracy. Compared with traditional methods, the model classifies more effectively. It can classify encrypted traffic even with a small number of samples and can distinguish between VPN and non-VPN traffic. Test results on the ISCX VPN-nonVPN dataset show that the new generative classifier improves the F1 score to 98.5%, a 5.5% improvement over the previous classifier. The experiments show that the method is effective and efficient.
ISSN:	1936-6442 1936-6450
DOI:	10.1007/s12083-024-01817-5
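
Illustrative note: the summary describes casting traffic classification as a text-to-text task for T5 and reporting an F1 score. The record does not say how traffic features are encoded, so the sketch below is only an assumption for illustration: the hex-token rendering, the "classify traffic:" prompt prefix, the label strings, and all function names are hypothetical and not taken from the paper. It shows how a (source, target) text pair might be built and how a binary F1 score is computed from predictions.

```python
from typing import List, Tuple


def packet_to_text(payload: bytes, max_bytes: int = 64) -> str:
    """Render the first max_bytes of a payload as space-separated hex tokens.

    Assumption: hex tokens are one plausible text encoding; the paper's
    actual feature representation is not given in this record.
    """
    return " ".join(f"{b:02x}" for b in payload[:max_bytes])


def make_example(payload: bytes, label: str) -> Tuple[str, str]:
    """Build a (source, target) pair in text-to-text form.

    The prompt prefix is hypothetical; a generative classifier would be
    trained to emit the label string as its output text.
    """
    return ("classify traffic: " + packet_to_text(payload), label)


def f1_score(y_true: List[str], y_pred: List[str], positive: str) -> float:
    """Binary F1 for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    source, target = make_example(bytes([0x16, 0x03, 0x01]), "vpn")
    print(source, "->", target)
    truth = ["vpn", "vpn", "nonvpn", "nonvpn"]
    preds = ["vpn", "nonvpn", "nonvpn", "vpn"]
    print(f1_score(truth, preds, positive="vpn"))
```

In practice the text pairs would be fed to a pretrained T5 checkpoint for fine-tuning, and the generated label strings compared against the ground truth to obtain the F1 score.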