Exact Recovery in the General Hypergraph Stochastic Block Model
This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph stochastic block model (d-HSBM), wherein n nodes are partitioned into k disjoint communities with relative sizes (p_{1}, \ldots, p_{k}). Each subset of nodes with cardinality d is generated i...
Published in: | IEEE transactions on information theory 2023-01, Vol.69 (1), p.453-471 |
---|---|
Main authors: | Zhang, Qiaosheng ; Tan, Vincent Y. F. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 471 |
---|---|
container_issue | 1 |
container_start_page | 453 |
container_title | IEEE transactions on information theory |
container_volume | 69 |
creator | Zhang, Qiaosheng ; Tan, Vincent Y. F. |
description | This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph stochastic block model (d-HSBM), wherein n nodes are partitioned into k disjoint communities with relative sizes (p_{1}, \ldots, p_{k}). Each subset of nodes with cardinality d is generated independently as an order-d hyperedge with a certain probability that depends on the ground-truth communities that the d nodes belong to. The goal is to exactly recover the k hidden communities based on the observed hypergraph. We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below the threshold (apart from a small regime of parameters that will be specified precisely). This threshold is represented in terms of a quantity which we term the generalized Chernoff-Hellinger divergence between communities. Our result for this general model recovers prior results for the standard SBM and d-HSBM with two symmetric communities as special cases. En route to proving our achievability results, we develop a polynomial-time two-stage algorithm that meets the threshold. The first stage adopts a certain hypergraph spectral clustering method to obtain a coarse estimate of communities, and the second stage refines each node individually via local refinement steps to ensure exact recovery. |
doi_str_mv | 10.1109/TIT.2022.3205959 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2757181657</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9887955</ieee_id><sourcerecordid>2757181657</sourcerecordid><originalsourceid>FETCH-LOGICAL-c338t-b02538e6419b99e7ea786606e471e799770ef924fc2bd143db070a183cc2b3cb3</originalsourceid><addsrcrecordid>eNo9kE1LAzEQhoMoWKt3wUvA89Z8bpKTaKltoSJoPYdsOmu3rs2abMX-e7ds8TS8w_POwIPQNSUjSom5W86XI0YYG3FGpJHmBA2olCozuRSnaEAI1ZkRQp-ji5Q2XRSSsgG6n_w63-JX8OEH4h5XW9yuAU9hC9HVeLZvIH5E16zxWxv82qW28vixDv4TP4cV1JforHR1gqvjHKL3p8lyPMsWL9P5-GGRec51mxWESa4hF9QUxoACp3SekxyEoqCMUYpAaZgoPStWVPBVQRRxVHPfLbgv-BDd9nebGL53kFq7Cbu47V5apqSimuZSdRTpKR9DShFK28Tqy8W9pcQeNNlOkz1oskdNXeWmr1QA8I8brZWRkv8Bl3hh8A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2757181657</pqid></control><display><type>article</type><title>Exact Recovery in the General Hypergraph Stochastic Block Model</title><source>IEEE Electronic Library (IEL)</source><creator>Zhang, Qiaosheng ; Tan, Vincent Y. F.</creator><creatorcontrib>Zhang, Qiaosheng ; Tan, Vincent Y. F.</creatorcontrib><description><![CDATA[This paper investigates fundamental limits of exact recovery in the general <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-uniform hypergraph stochastic block model (<inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-HSBM), wherein <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> nodes are partitioned into <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> disjoint communities with relative sizes <inline-formula> <tex-math notation="LaTeX">(p_{1},\ldots , p_{k}) </tex-math></inline-formula>. 
Each subset of nodes with cardinality <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> is generated independently as an order-<inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> hyperedge with a certain probability that depends on the ground-truth communities that the <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> nodes belong to. The goal is to exactly recover the <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> hidden communities based on the observed hypergraph. We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below the threshold (apart from a small regime of parameters that will be specified precisely). This threshold is represented in terms of a quantity which we term as the generalized Chernoff-Hellinger divergence between communities. Our result for this general model recovers prior results for the standard SBM and <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-HSBM with two symmetric communities as special cases. En route to proving our achievability results, we develop a polynomial-time two-stage algorithm that meets the threshold. 
The first stage adopts a certain hypergraph spectral clustering method to obtain a coarse estimate of communities, and the second stage refines each node individually via local refinement steps to ensure exact recovery.]]></description><identifier>ISSN: 0018-9448</identifier><identifier>EISSN: 1557-9654</identifier><identifier>DOI: 10.1109/TIT.2022.3205959</identifier><identifier>CODEN: IETTAW</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Clustering ; Clustering algorithms ; Clustering methods ; Community detection ; Electronic mail ; exact recovery ; Graph theory ; Graphs ; hypergraph spectral clustering methods ; hypergraph stochastic block model (HSBM) ; Nodes ; Partitioning algorithms ; Polynomials ; Random variables ; Recovery ; Stochastic processes ; Tensors</subject><ispartof>IEEE transactions on information theory, 2023-01, Vol.69 (1), p.453-471</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c338t-b02538e6419b99e7ea786606e471e799770ef924fc2bd143db070a183cc2b3cb3</citedby><cites>FETCH-LOGICAL-c338t-b02538e6419b99e7ea786606e471e799770ef924fc2bd143db070a183cc2b3cb3</cites><orcidid>0000-0001-6114-8453 ; 0000-0002-5008-4527</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9887955$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9887955$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhang, Qiaosheng</creatorcontrib><creatorcontrib>Tan, Vincent Y. 
F.</creatorcontrib><title>Exact Recovery in the General Hypergraph Stochastic Block Model</title><title>IEEE transactions on information theory</title><addtitle>TIT</addtitle><description><![CDATA[This paper investigates fundamental limits of exact recovery in the general <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-uniform hypergraph stochastic block model (<inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-HSBM), wherein <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> nodes are partitioned into <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> disjoint communities with relative sizes <inline-formula> <tex-math notation="LaTeX">(p_{1},\ldots , p_{k}) </tex-math></inline-formula>. Each subset of nodes with cardinality <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> is generated independently as an order-<inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> hyperedge with a certain probability that depends on the ground-truth communities that the <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> nodes belong to. The goal is to exactly recover the <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> hidden communities based on the observed hypergraph. We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below the threshold (apart from a small regime of parameters that will be specified precisely). This threshold is represented in terms of a quantity which we term as the generalized Chernoff-Hellinger divergence between communities. Our result for this general model recovers prior results for the standard SBM and <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-HSBM with two symmetric communities as special cases. 
En route to proving our achievability results, we develop a polynomial-time two-stage algorithm that meets the threshold. The first stage adopts a certain hypergraph spectral clustering method to obtain a coarse estimate of communities, and the second stage refines each node individually via local refinement steps to ensure exact recovery.]]></description><subject>Algorithms</subject><subject>Clustering</subject><subject>Clustering algorithms</subject><subject>Clustering methods</subject><subject>Community detection</subject><subject>Electronic mail</subject><subject>exact recovery</subject><subject>Graph theory</subject><subject>Graphs</subject><subject>hypergraph spectral clustering methods</subject><subject>hypergraph stochastic block model (HSBM)</subject><subject>Nodes</subject><subject>Partitioning algorithms</subject><subject>Polynomials</subject><subject>Random variables</subject><subject>Recovery</subject><subject>Stochastic processes</subject><subject>Tensors</subject><issn>0018-9448</issn><issn>1557-9654</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LAzEQhoMoWKt3wUvA89Z8bpKTaKltoSJoPYdsOmu3rs2abMX-e7ds8TS8w_POwIPQNSUjSom5W86XI0YYG3FGpJHmBA2olCozuRSnaEAI1ZkRQp-ji5Q2XRSSsgG6n_w63-JX8OEH4h5XW9yuAU9hC9HVeLZvIH5E16zxWxv82qW28vixDv4TP4cV1JforHR1gqvjHKL3p8lyPMsWL9P5-GGRec51mxWESa4hF9QUxoACp3SekxyEoqCMUYpAaZgoPStWVPBVQRRxVHPfLbgv-BDd9nebGL53kFq7Cbu47V5apqSimuZSdRTpKR9DShFK28Tqy8W9pcQeNNlOkz1oskdNXeWmr1QA8I8brZWRkv8Bl3hh8A</recordid><startdate>202301</startdate><enddate>202301</enddate><creator>Zhang, Qiaosheng</creator><creator>Tan, Vincent Y. F.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-6114-8453</orcidid><orcidid>https://orcid.org/0000-0002-5008-4527</orcidid></search><sort><creationdate>202301</creationdate><title>Exact Recovery in the General Hypergraph Stochastic Block Model</title><author>Zhang, Qiaosheng ; Tan, Vincent Y. F.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c338t-b02538e6419b99e7ea786606e471e799770ef924fc2bd143db070a183cc2b3cb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Clustering</topic><topic>Clustering algorithms</topic><topic>Clustering methods</topic><topic>Community detection</topic><topic>Electronic mail</topic><topic>exact recovery</topic><topic>Graph theory</topic><topic>Graphs</topic><topic>hypergraph spectral clustering methods</topic><topic>hypergraph stochastic block model (HSBM)</topic><topic>Nodes</topic><topic>Partitioning algorithms</topic><topic>Polynomials</topic><topic>Random variables</topic><topic>Recovery</topic><topic>Stochastic processes</topic><topic>Tensors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Qiaosheng</creatorcontrib><creatorcontrib>Tan, Vincent Y. 
F.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhang, Qiaosheng</au><au>Tan, Vincent Y. F.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exact Recovery in the General Hypergraph Stochastic Block Model</atitle><jtitle>IEEE transactions on information theory</jtitle><stitle>TIT</stitle><date>2023-01</date><risdate>2023</risdate><volume>69</volume><issue>1</issue><spage>453</spage><epage>471</epage><pages>453-471</pages><issn>0018-9448</issn><eissn>1557-9654</eissn><coden>IETTAW</coden><abstract><![CDATA[This paper investigates fundamental limits of exact recovery in the general <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-uniform hypergraph stochastic block model (<inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-HSBM), wherein <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> nodes are partitioned into <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> disjoint communities with relative sizes <inline-formula> <tex-math notation="LaTeX">(p_{1},\ldots , p_{k}) 
</tex-math></inline-formula>. Each subset of nodes with cardinality <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> is generated independently as an order-<inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> hyperedge with a certain probability that depends on the ground-truth communities that the <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> nodes belong to. The goal is to exactly recover the <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> hidden communities based on the observed hypergraph. We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below the threshold (apart from a small regime of parameters that will be specified precisely). This threshold is represented in terms of a quantity which we term as the generalized Chernoff-Hellinger divergence between communities. Our result for this general model recovers prior results for the standard SBM and <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula>-HSBM with two symmetric communities as special cases. En route to proving our achievability results, we develop a polynomial-time two-stage algorithm that meets the threshold. The first stage adopts a certain hypergraph spectral clustering method to obtain a coarse estimate of communities, and the second stage refines each node individually via local refinement steps to ensure exact recovery.]]></abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TIT.2022.3205959</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0001-6114-8453</orcidid><orcidid>https://orcid.org/0000-0002-5008-4527</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9448 |
ispartof | IEEE transactions on information theory, 2023-01, Vol.69 (1), p.453-471 |
issn | 0018-9448 ; 1557-9654 |
language | eng |
recordid | cdi_proquest_journals_2757181657 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Clustering ; Clustering algorithms ; Clustering methods ; Community detection ; Electronic mail ; exact recovery ; Graph theory ; Graphs ; hypergraph spectral clustering methods ; hypergraph stochastic block model (HSBM) ; Nodes ; Partitioning algorithms ; Polynomials ; Random variables ; Recovery ; Stochastic processes ; Tensors |
title | Exact Recovery in the General Hypergraph Stochastic Block Model |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T15%3A47%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exact%20Recovery%20in%20the%20General%20Hypergraph%20Stochastic%20Block%20Model&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Zhang,%20Qiaosheng&rft.date=2023-01&rft.volume=69&rft.issue=1&rft.spage=453&rft.epage=471&rft.pages=453-471&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/TIT.2022.3205959&rft_dat=%3Cproquest_RIE%3E2757181657%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2757181657&rft_id=info:pmid/&rft_ieee_id=9887955&rfr_iscdi=true |
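
The generative model summarized in the description field can be illustrated with a small sampler. The following is only a sketch under stated assumptions, not the authors' code: the function names `assign_labels` and `sample_hsbm`, the block-wise label assignment, and the `edge_prob` callback are all hypothetical, and the brute-force loop over every d-subset is practical only for very small n.

```python
import itertools
import random


def assign_labels(n, sizes):
    """Assign community labels in contiguous blocks proportional to `sizes`.

    Assumption: a deterministic block assignment is used for illustration;
    the record does not say how the partition is drawn.
    """
    labels = []
    for community, p in enumerate(sizes):
        labels.extend([community] * round(p * n))
    # Pad or trim so that rounding never changes the node count.
    return (labels + [len(sizes) - 1] * n)[:n]


def sample_hsbm(n, d, sizes, edge_prob, seed=0):
    """Sample a d-uniform hypergraph on n nodes.

    sizes:     relative community sizes (p_1, ..., p_k)
    edge_prob: hypothetical callback mapping the sorted tuple of the d
               community labels of a subset to a hyperedge probability
    """
    rng = random.Random(seed)
    labels = assign_labels(n, sizes)
    hyperedges = []
    # Each d-subset becomes a hyperedge independently of all others.
    for subset in itertools.combinations(range(n), d):
        profile = tuple(sorted(labels[v] for v in subset))
        if rng.random() < edge_prob(profile):
            hyperedges.append(subset)
    return labels, hyperedges


if __name__ == "__main__":
    # Illustrative probabilities: denser inside a community than across.
    def prob(profile):
        return 0.6 if len(set(profile)) == 1 else 0.05

    labels, edges = sample_hsbm(n=12, d=3, sizes=(0.5, 0.5), edge_prob=prob)
    print(len(labels), "nodes,", len(edges), "hyperedges")
```

Passing the community profile as a sorted tuple keeps the probability dependent only on which communities the d nodes belong to, which matches the description above; the specific probabilities in the usage example are made up.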