BotFinder: a novel framework for social bots detection in online social networks based on graph embedding and community detection
With the widespread popularity of online social networks (OSNs), the number of users has also increased exponentially in recent years. At the same time, Social bots, i.e. accounts that controlled by program, are also on the rise. Service providers of OSNs often use them to keep social networks activ...
Gespeichert in:
Veröffentlicht in: | World wide web (Bussum) 2023-07, Vol.26 (4), p.1793-1809 |
---|---|
Hauptverfasser: | , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | With the widespread popularity of online social networks (OSNs), the number of users has also increased exponentially in recent years. At the same time, Social bots, i.e. accounts that controlled by program, are also on the rise. Service providers of OSNs often use them to keep social networks active. Meanwhile, some social bots are also registered for malicious purposes. It is necessary to detect these malicious social bots to present a real public opinion environment. We propose BotFinder, a framework to detect malicious social bots in OSNs. Specifically, it combines machine learning and graph methods so that the potential features of social bots can be effectively extracted. Regarding the feature engineering, we generate second order features and use coding methods to encode variables that have high cardinality. These features make full use of both labelled and unlabeled samples. With respect to the graphs, we firstly generate node vectors through embedding method, following which the similarity between vectors of humans and bots can be further calculated; Then, we use an unsupervised method to diffuse labels and thus the performance can be improved again. To valid the performance of the proposed method, we conduct extensive experiments on the dataset provided by an artificial intelligence contest which is composed of over eight million records of users. Results show that our approach reaches a F1-score of 0.8850, which is much better compared to the state of the art. |
---|---|
ISSN: | 1386-145X 1573-1413 |
DOI: | 10.1007/s11280-022-01114-2 |