ECoHeN: A Hypothesis Testing Framework for Extracting Communities from Heterogeneous Networks


Bibliographic Details
Main Authors: Gibbs, Connor P; Fosdick, Bailey K; Wilson, James D
Format: Article
Language: English
Description
Summary: Community discovery is the general process of identifying assortative communities in a network: collections of nodes that are densely connected within yet sparsely connected to the rest of the network. While community discovery has been well studied, few such techniques exist for heterogeneous networks, which contain different types of nodes and possibly different connectivity patterns between the node types. In this paper, we introduce a framework called ECoHeN, which extracts communities from a heterogeneous network (hence the name: E-Co-He-N) in a statistically meaningful way. Using a heterogeneous configuration model as a reference distribution, ECoHeN identifies communities that are significantly more densely connected than expected given the node types and connectivity of their members. Specifically, the ECoHeN algorithm extracts communities one at a time through a dynamic set of iterative updating rules, is guaranteed to converge, and imposes no constraints on the type composition of extracted communities. To our knowledge, this is the first discovery method that distinguishes and identifies both homogeneous and heterogeneous, possibly overlapping, community structure in a network. We demonstrate the performance of ECoHeN through simulation and in application to a political blogs network, where it identifies collections of blogs that reference one another more than expected given the ideology of their members.
DOI:10.48550/arxiv.2212.10513
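
Illustrative note: the summary's central idea, comparing a candidate community's internal connectivity with what a type-aware configuration model would predict, can be sketched in a few lines of Python. This is a simplified illustration and not the ECoHeN algorithm or its test statistic. It assumes an undirected network and a null model in which each node keeps its number of neighbours of each type, so the expected edge count between u (type a) and v (type b) is approximated by deg[u][b] * deg[v][a] divided by the number of stubs available between the two types. The function names, the toy network, and the type labels "L" and "R" are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def typed_degrees(edges, node_type):
    """Return deg where deg[u][t] is the number of neighbours of u with type t."""
    deg = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        deg[u][node_type[v]] += 1
        deg[v][node_type[u]] += 1
    return deg


def type_pair_edge_counts(edges, node_type):
    """Return m where m[(a, b)] counts edges joining types a and b (key sorted)."""
    m = defaultdict(int)
    for u, v in edges:
        m[tuple(sorted((node_type[u], node_type[v])))] += 1
    return m


def observed_internal_edges(candidate, edges):
    """Count edges with both endpoints inside the candidate set."""
    members = set(candidate)
    return sum(1 for u, v in edges if u in members and v in members)


def expected_internal_edges(candidate, edges, node_type):
    """Approximate expected internal edges under a type-aware configuration null.

    Assumption (not taken from the paper): the expected number of edges between
    u of type a and v of type b is deg[u][b] * deg[v][a] / stubs, where stubs is
    2 * m[(a, a)] for same-type pairs and m[(a, b)] for mixed-type pairs.
    """
    deg = typed_degrees(edges, node_type)
    m = type_pair_edge_counts(edges, node_type)
    total = 0.0
    for u, v in combinations(sorted(candidate), 2):
        a, b = node_type[u], node_type[v]
        m_ab = m[tuple(sorted((a, b)))]
        if m_ab == 0:
            continue  # no edges between these types, so none are expected
        stubs = 2 * m_ab if a == b else m_ab
        total += deg[u][b] * deg[v][a] / stubs
    return total


# Hypothetical toy network: three "L" nodes forming a triangle, two "R" nodes.
node_type = {"a": "L", "b": "L", "c": "L", "x": "R", "y": "R"}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "x"), ("x", "y")]
candidate = {"a", "b", "c"}

observed = observed_internal_edges(candidate, edges)
expected = expected_internal_edges(candidate, edges, node_type)
print(f"observed={observed}, expected={expected:.2f}")
```

On this toy network the candidate set contains 3 observed internal edges against an expected count of about 2 under the assumed null. A framework of the kind the summary describes would go further and assess whether such an excess is statistically significant; that step is omitted here.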