Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | The stochastic block model is a canonical random graph model for clustering
and community detection on network-structured data. Decades of extensive study
on the problem have established many profound results, among which the phase
transition at the Kesten-Stigum threshold is particularly interesting from both
a mathematical and an applied standpoint. It states that no estimator based on
the network topology can perform substantially better than chance on sparse
graphs if the model parameter is below a certain threshold. Nevertheless, if we
slightly extend the horizon to the ubiquitous semi-supervised setting, such a
fundamental limitation will disappear completely. We prove that with an
arbitrary fraction of the labels revealed, the detection problem is feasible
throughout the parameter domain. Moreover, we introduce two efficient
algorithms, one combinatorial and one based on optimization, to integrate label
information with graph structures. Our work brings a new perspective to the
stochastic model of networks and semidefinite program research. |
---|---|
DOI: | 10.48550/arxiv.2205.11677 |
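
To make the setting in the abstract concrete, the following is a minimal sketch, not the paper's algorithms. It assumes the common symmetric two-community parametrization, in which vertices in the same community are linked with probability a/n and vertices in different communities with probability b/n; under that assumption the Kesten-Stigum condition reads (a - b)^2 > 2(a + b). The abstract does not state a parametrization, so the function names and the parameters `a`, `b`, and `fraction` are all illustrative.

```python
import random


def kesten_stigum_snr(a, b):
    """Signal-to-noise ratio (a - b)^2 / (2 (a + b)).

    Assumes the symmetric two-community sparse SBM; the
    Kesten-Stigum threshold corresponds to a ratio of 1.
    """
    return (a - b) ** 2 / (2.0 * (a + b))


def sample_sbm(n, a, b, seed=0):
    """Sample labels in {-1, +1} and an edge list from the assumed model."""
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Within-community pairs connect with probability a/n,
            # cross-community pairs with probability b/n.
            p = a / n if labels[i] == labels[j] else b / n
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges


def reveal_labels(labels, fraction, seed=0):
    """Reveal each label independently with probability `fraction`.

    This models the semi-supervised side information; the independent
    reveal rule is an assumption made for illustration.
    """
    rng = random.Random(seed)
    return {i: lab for i, lab in enumerate(labels) if rng.random() < fraction}


if __name__ == "__main__":
    a, b = 3.0, 2.0  # hypothetical parameters below the threshold
    labels, edges = sample_sbm(500, a, b, seed=1)
    revealed = reveal_labels(labels, fraction=0.1, seed=2)
    print("SNR:", kesten_stigum_snr(a, b))
    print("edges:", len(edges), "revealed labels:", len(revealed))
```

With a = 3 and b = 2 the ratio is well below 1, which is the regime where, per the abstract, topology alone cannot beat chance and revealed labels become essential.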