Gambling Domain Name Recognition via Certificate and Textual Analysis

Bibliographic details
Published in: The Computer Journal, 2023-08, Vol. 66 (8), pp. 1829-1839
Main authors: Sun, GuoYing; Ye, Feng; Chai, Tingting; Zhang, Zhaoxin; Tong, Xiaojun; Prasad, Shitala
Format: Article
Language: English
Description
Summary: Online gambling is a key illegal activity targeted by public security departments in most countries because of its potential threat to cyberspace security and social stability. Research on gambling domain name (GDN) classification is therefore important and in great demand in both academia and industry, yet very little work has been done on the topic so far. Most GDN training datasets in previous work were drawn from GDN blacklists provided by publicly available data sources; the authors did not verify the authenticity and accuracy of these datasets, and the classification results were not particularly satisfactory. This paper proposes CT-GDNC, a classification method based on certificate and textual analysis, which yields a GDN training dataset with an accuracy of 0.9776 and significantly improves GDN classification results. Exhaustive comparative experiments on 10K GDNs obtained via a fine-tuned BERT model and 10K benign domains collected from the Alexa Top 1 Million list show that the proposed method sets a new baseline for GDN classification, with accuracy 0.9936, precision 0.9936, F1 0.9936 and recall 0.9939.
ISSN: 0010-4620; 1460-2067
DOI: 10.1093/comjnl/bxac043
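
The summary reports accuracy, precision, recall and F1 for a two-class task (gambling versus benign domains). As a reading aid, the sketch below shows how these four figures are conventionally derived from confusion-matrix counts. It is not the authors' evaluation code: the `ConfusionCounts` type, the `binary_metrics` function and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container; 'positive' means a gambling domain."""
    tp: int  # gambling domains classified as gambling
    fp: int  # benign domains classified as gambling
    tn: int  # benign domains classified as benign
    fn: int  # gambling domains classified as benign


def binary_metrics(c: ConfusionCounts) -> dict:
    """Return the standard binary classification metrics as floats."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (not results from the paper).
example = binary_metrics(ConfusionCounts(tp=90, fp=10, tn=80, fn=20))
```

With these illustrative counts, accuracy is 170/200 = 0.85, precision is 90/100 = 0.9, recall is 90/110 and F1 is 180/210. Because F1 is a harmonic mean, it always lies between precision and recall.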