On the Minimax Misclassification Ratio of Hypergraph Community Detection

Bibliographic Details
Published in:	IEEE Transactions on Information Theory, 2019-12, Vol. 65 (12), p. 8095-8118
Main authors:	Chien, I Eli; Lin, Chung-Yi; Wang, I-Hsiang
Format:	Article
Language:	English
Subjects:	Algorithms; Asymptotic properties; asymptotic theory; Bayes methods; Clustering; Clustering algorithms; Community detection; Community relations; Computational modeling; Consistency; Decay rate; Divergence; Graph theory; Graphs; hypergraph clustering; hypergraph Laplacian; Image edge detection; Lower bounds; Maximum likelihood estimation; Minimax technique; Parameter estimation; Polynomials; Probabilistic logic; Probabilistic models; Probability theory; Risk; Statistical analysis; stochastic block model (SBM); Stochastic processes; two-step algorithm; Upper bounds

Description
Abstract:	Community detection in hypergraphs is explored. Under a generative hypergraph model called the "$d$-wise hypergraph stochastic block model" ($d$-$\mathtt{hSBM}$), which naturally extends the stochastic block model ($\mathtt{SBM}$) from graphs to $d$-uniform hypergraphs, the fundamental limit of the misclassification ratio (the loss function in the community detection problem) is studied. For the converse part, a lower bound on the minimax risk, that is, the minimax expected misclassification ratio, is derived. Asymptotically, it decays exponentially fast to zero as the number of nodes tends to infinity, and the rate function is a weighted combination of several divergence terms, each of which is the Rényi divergence of order 1/2 between two Bernoulli distributions. The Bernoulli distributions involved in the characterization of the rate function are those governing the random instantiation of hyperedges in $d$-$\mathtt{hSBM}$. For the achievability part, we propose a two-step polynomial-time algorithm which, with high probability, has a misclassification ratio with a decaying exponent that is asymptotically greater than or equal to that of our proposed lower bound. The first step of the algorithm is a hypergraph spectral clustering method, which achieves partial recovery to a certain precision level. The second step is a local refinement method, which leverages the underlying probabilistic model along with parameter estimation from the outcome of the first step. To characterize the asymptotic performance of the proposed algorithm, we first derive a sufficient condition for attaining weak consistency in the hypergraph spectral clustering step. Then, under the guarantee of weak consistency in the first step, we upper bound the loss (with high probability) attained in the local refinement step by an exponentially decaying function of the size of the hypergraph and characterize the decay rate. Compared to existing works in …
ISSN:	0018-9448, 1557-9654
DOI:	10.1109/TIT.2019.2928301
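
Note on the divergence term: the abstract describes the rate function as a weighted combination of Rényi divergences of order 1/2 between pairs of Bernoulli distributions. As a minimal sketch of one such term only, the standard closed form is D_{1/2}(Ber(p) || Ber(q)) = -2 log( sqrt(pq) + sqrt((1-p)(1-q)) ). The function name below is hypothetical, and the weights of the combination are not given in this record, so they are not modeled here.

```python
import math


def renyi_half_bernoulli(p: float, q: float) -> float:
    """Renyi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q), in nats.

    Hypothetical helper for illustration; it is not taken from the article.
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("p and q must lie in [0, 1]")
    # Bhattacharyya coefficient: sum over both outcomes of sqrt(P(x) * Q(x)).
    bc = math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    if bc == 0.0:
        # Disjoint supports (e.g. p = 1, q = 0): the divergence is infinite.
        return math.inf
    # Clamp to 1 to absorb floating-point rounding, so the result is never negative.
    return max(0.0, -2.0 * math.log(min(bc, 1.0)))


if __name__ == "__main__":
    # Illustrative hyperedge probabilities, chosen arbitrarily.
    print(renyi_half_bernoulli(0.5, 0.1))
```

At order 1/2 the divergence is symmetric in its two arguments, so the order in which the two Bernoulli parameters are passed does not matter.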