BotFinder: a novel framework for social bots detection in online social networks based on graph embedding and community detection

Bibliographic details
Published in:	World wide web (Bussum) 2023-07, Vol.26 (4), p.1793-1809
Main authors:	Li, Shudong; Zhao, Chuanyu; Li, Qing; Huang, Jiuming; Zhao, Dawei; Zhu, Peican
Format:	Article
Language:	eng
Subjects:	Algorithms; Artificial intelligence; Computer Science; Database Management; Datasets; Deep learning; Embedding; Engineering; Information Systems Applications (incl. Internet); Machine learning; Operating Systems; Social networks; Software agents; User behavior; World Wide Web
Online access:	Full text

Description
Summary:	With the widespread popularity of online social networks (OSNs), the number of users has increased exponentially in recent years. At the same time, social bots, i.e. accounts controlled by programs, are also on the rise. OSN service providers often use them to keep social networks active, but some social bots are registered for malicious purposes. Detecting these malicious social bots is necessary to present an authentic picture of public opinion. We propose BotFinder, a framework for detecting malicious social bots in OSNs. It combines machine learning and graph methods so that the potential features of social bots can be extracted effectively. For feature engineering, we generate second-order features and use encoding methods for variables with high cardinality; these features make full use of both labelled and unlabelled samples. For the graph component, we first generate node vectors with an embedding method and compute the similarity between the vectors of humans and bots; we then use an unsupervised method to diffuse labels, which improves performance further. To validate the proposed method, we conduct extensive experiments on a dataset provided by an artificial intelligence contest, comprising over eight million user records. Results show that our approach reaches an F1-score of 0.8850, which is much better than the state of the art.
ISSN:	1386-145X 1573-1413
DOI:	10.1007/s11280-022-01114-2
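
Illustrative note: the summary mentions diffusing labels over the user graph but does not name the method. The sketch below is not the authors' implementation. It assumes a simple neighbour-averaging label propagation, in which labelled seed accounts keep their label and every other account repeatedly takes the mean score of its neighbours. The names `diffuse_labels`, `edges`, `seeds`, and `iterations`, and the neutral starting score of 0.5, are hypothetical choices made for this example.

```python
from collections import defaultdict


def diffuse_labels(edges, seeds, iterations=20):
    """Spread bot scores from labelled seed nodes across an undirected graph.

    edges: iterable of (u, v) pairs (hypothetical input format).
    seeds: dict mapping node -> 1.0 (known bot) or 0.0 (known human).
    Returns a dict mapping every node to a score in [0, 1].
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Assumption: unlabelled nodes start at 0.5, i.e. no evidence either way.
    scores = {node: seeds.get(node, 0.5) for node in neighbours}
    for node, label in seeds.items():
        scores.setdefault(node, label)  # keep seeds that have no edges

    for _ in range(iterations):
        updated = {}
        for node in scores:
            if node in seeds:
                # Seed labels are clamped and never overwritten.
                updated[node] = seeds[node]
            elif neighbours[node]:
                # Unlabelled node: mean of its neighbours' current scores.
                linked = neighbours[node]
                updated[node] = sum(scores[n] for n in linked) / len(linked)
            else:
                updated[node] = scores[node]
        scores = updated
    return scores


# Two small groups: one contains a known bot, the other a known human.
example_edges = [("a", "b"), ("b", "c"), ("x", "y"), ("y", "z")]
example_seeds = {"a": 1.0, "z": 0.0}
example_scores = diffuse_labels(example_edges, example_seeds)
```

In this toy graph, the unlabelled accounts connected to the known bot drift toward a score of 1 and those connected to the known human drift toward 0. A real pipeline would more likely weight edges, for instance by the embedding similarity the summary describes, and choose a decision threshold on held-out data.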