Gambling Domain Name Recognition via Certificate and Textual Analysis
Published in: | Computer journal 2023-08, Vol.66 (8), p.1829-1839 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | Abstract
Online gambling is a major illegal activity targeted by public security departments in most countries because of its potential threat to cyberspace security and social stability. Hence, research on gambling domain name (GDN) classification is important and in great demand in both academia and industry. So far, there has been very little research on this topic. Most GDN training datasets in previous work were drawn from GDN blacklists provided by publicly available data sources; the authors did not verify the authenticity and accuracy of these datasets, and the classification results were not particularly satisfactory. In this paper, a certificate and textual analysis-based classification method, CT-GDNC, is proposed to obtain a GDN training dataset with an accuracy of 0.9776 and to significantly improve GDN classification results. Exhaustive comparative experiments on 10K GDNs obtained via a fine-tuned BERT model and 10K benign domain names collected from the Alexa Top 1 Million list show that the proposed method sets a new baseline for GDN classification, with accuracy 0.9936, precision 0.9936, F1 0.9936 and recall 0.9939. |
---|---|
ISSN: | 0010-4620, 1460-2067 |
DOI: | 10.1093/comjnl/bxac043 |