Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data

In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure....

Detailed description

Bibliographic details
Published in: Journal of the American Statistical Association 2024-07, Vol.119 (547), p.1985-1999
Main authors: Niu, Yabo, Ni, Yang, Pati, Debdeep, Mallick, Bani K.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1999
container_issue 547
container_start_page 1985
container_title Journal of the American Statistical Association
container_volume 119
creator Niu, Yabo
Ni, Yang
Pati, Debdeep
Mallick, Bani K.
description In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to model undirected graphs underlying heterogeneous multivariate observations with additional assistance from covariates. Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates so that observations whose covariates are similar share a similar undirected graph. To efficiently embed Gaussian graphical models into our proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. For Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior, and for the pseudo-likelihood, a product of Gaussian-conditionals is used. Moreover, the proposed model has large prior support and is flexible to approximate any ν-Hölder conditional variance-covariance matrices with ν ∈ (0, 1]. We further show that based on the theory of fractional likelihood, the rate of posterior contraction is minimax optimal assuming the true density to be a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset assisted by mRNA gene expression as covariates. Supplementary materials for this article are available online.
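The "product of Gaussian-conditionals" mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes zero-mean data and a fixed, symmetric precision matrix, and the function name `gaussian_log_pseudo_likelihood` is hypothetical. For each node j, the conditional of x_j given the other nodes is Gaussian with mean -sum_{k != j} (omega_jk / omega_jj) x_k and variance 1 / omega_jj, and the log pseudo-likelihood sums these conditional log-densities over nodes and observations.

```python
import numpy as np

def gaussian_log_pseudo_likelihood(X, omega):
    """Log pseudo-likelihood of zero-mean data X (n x p) under precision matrix omega (p x p).

    Hypothetical helper for illustration: sums log p(x_j | x_{-j}) over all
    observations and nodes, assuming omega is symmetric with a positive diagonal.
    """
    X = np.asarray(X, dtype=float)
    omega = np.asarray(omega, dtype=float)
    diag = np.diag(omega)                 # omega_jj: conditional precision of node j
    off = omega - np.diag(diag)           # off-diagonal part; zeros encode missing edges
    # Conditional mean of x_j given the rest: -sum_{k != j} omega_jk * x_k / omega_jj
    cond_mean = -(X @ off) / diag
    resid = X - cond_mean
    # Gaussian log-density with variance 1 / omega_jj, evaluated entrywise
    log_dens = 0.5 * np.log(diag / (2.0 * np.pi)) - 0.5 * diag * resid ** 2
    return float(log_dens.sum())

# Usage: with an identity precision matrix the graph has no edges, so the
# pseudo-likelihood coincides with the ordinary independent-normal likelihood.
value = gaussian_log_pseudo_likelihood(np.zeros((1, 2)), np.eye(2))
```

A zero entry omega_jk removes x_k from the conditional of x_j, which is how the graph structure enters this kind of pseudo-likelihood.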
doi_str_mv 10.1080/01621459.2023.2233744
format Article
publisher United States: Taylor & Francis
pmid 39507103
eissn 1537-274X
date 2024-07-02
tpages 15
rights 2023 American Statistical Association
orcidid https://orcid.org/0000-0001-8087-6747
https://orcid.org/0000-0003-0636-2363
fulltext fulltext
identifier ISSN: 0162-1459
ispartof Journal of the American Statistical Association, 2024-07, Vol.119 (547), p.1985-1999
issn 0162-1459
1537-274X
language eng
recordid cdi_pubmed_primary_39507103
source Taylor & Francis:Master (3349 titles)
subjects Bayesian analysis
Bayesian theory
Breast cancer
breast neoplasms
Covariance matrix
data collection
Datasets
Dependency
Efficacy
G-Wishart prior
Gaussian graphical model
Gene expression
Genomics
Graphical models
Graphs
Homogeneity
Matrices
Normal distribution
Partition
Posterior contraction rate
Product partition model
Pseudo-likelihood
Simulation
title Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T04%3A08%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Covariate-Assisted%20Bayesian%20Graph%20Learning%20for%20Heterogeneous%20Data&rft.jtitle=Journal%20of%20the%20American%20Statistical%20Association&rft.au=Niu,%20Yabo&rft.date=2024-07-02&rft.volume=119&rft.issue=547&rft.spage=1985&rft.epage=1999&rft.pages=1985-1999&rft.issn=0162-1459&rft.eissn=1537-274X&rft_id=info:doi/10.1080/01621459.2023.2233744&rft_dat=%3Cproquest_pubme%3E3105586533%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3105586533&rft_id=info:pmid/39507103&rfr_iscdi=true