Covariate-assisted spectral clustering

Bibliographic details
Published in: Biometrika 2017-06, Vol.104 (2), p.361-377
Main authors: BINKIEWICZ, N., VOGELSTEIN, J. T., ROHE, K.
Format: Article
Language: eng
Online access: Full text
container_end_page 377
container_issue 2
container_start_page 361
container_title Biometrika
container_volume 104
creator BINKIEWICZ, N.
VOGELSTEIN, J. T.
ROHE, K.
description Biological and social systems consist of myriad interacting units. The interactions can be represented in the form of a graph or network. Measurements of these graphs can reveal the underlying structure of these interactions, which provides insight into the systems that generated the graphs. Moreover, in applications such as connectomics, social networks, and genomics, graph data are accompanied by contextualizing measures on each node. We utilize these node covariates to help uncover latent communities in a graph, using a modification of spectral clustering. Statistical guarantees are provided under a joint mixture model that we call the node-contextualized stochastic blockmodel, including a bound on the misclustering rate. The bound is used to derive conditions for achieving perfect clustering. For most simulated cases, covariate-assisted spectral clustering yields results superior both to regularized spectral clustering without node covariates and to an adaptation of canonical correlation analysis. We apply our clustering method to large brain graphs derived from diffusion MRI data, using the node locations or neurological region membership as covariates. In both cases, covariate-assisted spectral clustering yields clusters that are easier to interpret neurologically.
doi_str_mv 10.1093/biomet/asx008
format Article
eissn 1464-3510
pmid 29430032
jstor_id 26363727
pqid 2001405350
publisher England: Biometrika Trust
rights 2017 Biometrika Trust
sourcetype Open Access Repository
recordtype article
creationdate 2017
date 2017-06-01
tpages 17
lds50 peer_reviewed
oa free_for_read
collection PubMed; CrossRef; MEDLINE - Academic; PubMed Central (Full Participant titles)
linktopdf https://www.jstor.org/stable/pdf/26363727
linktohtml https://www.jstor.org/stable/26363727
backlink https://www.ncbi.nlm.nih.gov/pubmed/29430032
fulltext fulltext
identifier ISSN: 0006-3444
ispartof Biometrika, 2017-06, Vol.104 (2), p.361-377
issn 0006-3444
1464-3510
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5793492
source Jstor Complete Legacy; Oxford University Press Journals All Titles (1996-Current); JSTOR Mathematics & Statistics
title Covariate-assisted spectral clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T21%3A27%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Covariate-assisted%20spectral%20clustering&rft.jtitle=Biometrika&rft.au=BINKIEWICZ,%20N.&rft.date=2017-06-01&rft.volume=104&rft.issue=2&rft.spage=361&rft.epage=377&rft.pages=361-377&rft.issn=0006-3444&rft.eissn=1464-3510&rft_id=info:doi/10.1093/biomet/asx008&rft_dat=%3Cjstor_pubme%3E26363727%3C/jstor_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2001405350&rft_id=info:pmid/29430032&rft_jstor_id=26363727&rfr_iscdi=true