Characterization of topic-based online communities by combining network data and user generated content

This study proposes a model for characterizing online communities by combining two types of data: network data and user-generated-content (UGC). The existing models for detecting the community structure of a network employ only network information. However, not all people connected in a network shar...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Statistics and computing 2020-09, Vol.30 (5), p.1309-1324
Hauptverfasser: Igarashi, Mirai, Terui, Nobuhiko
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This study proposes a model for characterizing online communities by combining two types of data: network data and user-generated-content (UGC). The existing models for detecting the community structure of a network employ only network information. However, not all people connected in a network share the same interests. For instance, even if students belong to the same community of “school,” they may have various hobbies such as music, books, or sports. Hence, it is more realistic and beneficial for companies to identify communities according to their interests uncovered by their communications on social media. In addition, people may belong to multiple communities such as family, work, and online friends. Our model explores multiple overlapping communities according to their topics identified using two types of data jointly. By way of validating the main features of the proposed model, our simulation study shows that the model correctly identifies the community structure that could not be found without considering both network data and UGC. Furthermore, an empirical analysis using Twitter data clarifies that our model can find realistic and meaningful community structures from large online networks and has a good predictive performance.
ISSN:0960-3174
1573-1375
DOI:10.1007/s11222-020-09947-5