Analyzing modularity maximization in approximation, heuristic, and graph neural network algorithms for community detection

Community detection, which involves partitioning nodes within a network, has widespread applications across computational sciences. Modularity-based algorithms identify communities by attempting to maximize the modularity function across network node partitions. Our study assesses the performance of...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Journal of computational science 2024-06, Vol.78, p.102283, Article 102283
Hauptverfasser:	Aref, Samin, Mostajabdaveh, Mahdi
Format:	Artikel
Sprache:	eng
Schlagworte:	Approximation Graph neural network Graph optimization Integer programming Modularity maximization Network science
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Community detection, which involves partitioning nodes within a network, has widespread applications across computational sciences. Modularity-based algorithms identify communities by attempting to maximize the modularity function across network node partitions. Our study assesses the performance of various modularity-based algorithms in obtaining optimal partitions. Our analysis utilizes 104 networks, including both real-world instances from diverse contexts and modular graphs from two families of synthetic benchmarks. We analyze ten inexact modularity-based algorithms against the exact integer programming baseline that globally optimizes modularity. Our comparative analysis includes eight heuristics, two variants of a graph neural network algorithm, and nine variations of the Bayan approximation algorithm. Our findings reveal that the average modularity-based heuristic yields optimal partitions in only 43.9% of the 104 networks analyzed. Graph neural networks and approximate Bayan, on average, achieve optimality on 68.7% and 82.3% of the networks respectively. Additionally, our analysis of three partition similarity metrics exposes substantial dissimilarities between high-modularity sub-optimal partitions and any optimal partition of the networks. We observe that near-optimal partitions are often disproportionately dissimilar to any optimal partition. Taken together, our analysis points to a crucial limitation of the commonly used modularity-based methods: they rarely produce an optimal partition or a partition resembling an optimal partition even on networks with modular structures. If modularity is to be used for detecting communities, we recommend approximate optimization algorithms for a methodologically sound usage of modularity within its applicability limits. This article is an extended version of an ICCS 2023 conference paper (Aref et al., 2023).
ISSN:	1877-7503
DOI:	10.1016/j.jocs.2024.102283