Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data
In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure....
Published in: | Journal of the American Statistical Association 2024-07, Vol.119 (547), p.1985-1999 |
---|---|
Main authors: | Niu, Yabo ; Ni, Yang ; Pati, Debdeep ; Mallick, Bani K. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1999 |
---|---|
container_issue | 547 |
container_start_page | 1985 |
container_title | Journal of the American Statistical Association |
container_volume | 119 |
creator | Niu, Yabo ; Ni, Yang ; Pati, Debdeep ; Mallick, Bani K. |
description | In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to model undirected graphs underlying heterogeneous multivariate observations with additional assistance from covariates. Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates so that observations whose covariates are similar share a similar undirected graph. To efficiently embed Gaussian graphical models into our proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. For Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior, and for the pseudo-likelihood, a product of Gaussian-conditionals is used. Moreover, the proposed model has large prior support and is flexible to approximate any ν-Hölder conditional variance-covariance matrices with ν ∈ (0, 1]. We further show that based on the theory of fractional likelihood, the rate of posterior contraction is minimax optimal assuming the true density to be a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset assisted by mRNA gene expression as covariates. Supplementary materials for this article are available online. |
doi_str_mv | 10.1080/01621459.2023.2233744 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmed_primary_39507103</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3105586533</sourcerecordid><originalsourceid>FETCH-LOGICAL-c483t-a8b2eb6f7c597e039c643541344dd32d0111f4b2d8e9d616fd816e895030c1733</originalsourceid><addsrcrecordid>eNqNkc1uEzEUhS0EoqHwCKCR2LCZ4Ou_8awgDdAiRWIDEjvLGd9JXU3sYE-K8vZ4mrQCFghvvPB3j889h5CXQOdANX1LQTEQsp0zyvicMc4bIR6RGUje1KwR3x-T2cTUE3RGnuV8Q8tptH5KzngraQOUz8hiGW9t8nbEepGzzyO66sIeMHsbqstkd9fVCm0KPmyqPqbqCkdMcYMB4z5XH-xon5MnvR0yvjjd5-Tbp49fl1f16svl5-ViVXdC87G2es1wrfqmk22DlLedElwK4EI4x5mjANCLNXMaW6dA9U6DQl2MctpBw_k5eXfU3e3XW3QdhjHZweyS39p0MNF68-dL8NdmE28NlEgUa1lReHNSSPHHHvNotj53OAz2bhvDp-yUoAz-A2VStEqAKOjrv9CbuE-hRFEoKqVWkk_u5ZHqUsw5Yf9gHKiZCjX3hZqpUHMqtMy9-n3rh6n7Bgvw_gj4UPrZ2p8xDc6M9jDE1CcbOn_n419__AJKxq2i</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3105586533</pqid></control><display><type>article</type><title>Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data</title><source>Taylor & Francis:Master (3349 titles)</source><creator>Niu, Yabo ; Ni, Yang ; Pati, Debdeep ; Mallick, Bani K.</creator><creatorcontrib>Niu, Yabo ; Ni, Yang ; Pati, Debdeep ; Mallick, Bani K.</creatorcontrib><description>In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to model undirected graphs underlying heterogeneous multivariate observations with additional assistance from covariates. 
Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates so that observations whose covariates are similar share a similar undirected graph. To efficiently embed Gaussian graphical models into our proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. For Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior, and for the pseudo-likelihood, a product of Gaussian-conditionals is used. Moreover, the proposed model has large prior support and is flexible to approximate any ν-Hölder conditional variance-covariance matrices with ν ∈ (0, 1]. We further show that based on the theory of fractional likelihood, the rate of posterior contraction is minimax optimal assuming the true density to be a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset assisted by mRNA gene expression as covariates. Supplementary materials
for this article are available online.</description><identifier>ISSN: 0162-1459</identifier><identifier>ISSN: 1537-274X</identifier><identifier>EISSN: 1537-274X</identifier><identifier>DOI: 10.1080/01621459.2023.2233744</identifier><identifier>PMID: 39507103</identifier><language>eng</language><publisher>United States: Taylor & Francis</publisher><subject>Bayesian analysis ; Bayesian theory ; Breast cancer ; breast neoplasms ; Covariance matrix ; data collection ; Datasets ; Dependency ; Efficacy ; G-Wishart prior ; Gaussian graphical model ; Gene expression ; Genomics ; Graphical models ; Graphs ; Homogeneity ; Matrices ; Normal distribution ; Partition ; Posterior contraction rate ; Product partition model ; Pseudo-likelihood ; Simulation</subject><ispartof>Journal of the American Statistical Association, 2024-07, Vol.119 (547), p.1985-1999</ispartof><rights>2023 American Statistical Association 2023</rights><rights>2023 American Statistical Association</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c483t-a8b2eb6f7c597e039c643541344dd32d0111f4b2d8e9d616fd816e895030c1733</citedby><cites>FETCH-LOGICAL-c483t-a8b2eb6f7c597e039c643541344dd32d0111f4b2d8e9d616fd816e895030c1733</cites><orcidid>0000-0001-8087-6747 ; 0000-0003-0636-2363</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.tandfonline.com/doi/pdf/10.1080/01621459.2023.2233744$$EPDF$$P50$$Ginformaworld$$H</linktopdf><linktohtml>$$Uhttps://www.tandfonline.com/doi/full/10.1080/01621459.2023.2233744$$EHTML$$P50$$Ginformaworld$$H</linktohtml><link.rule.ids>230,314,776,780,881,27903,27904,59623,60412</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39507103$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Niu, 
Yabo</creatorcontrib><creatorcontrib>Ni, Yang</creatorcontrib><creatorcontrib>Pati, Debdeep</creatorcontrib><creatorcontrib>Mallick, Bani K.</creatorcontrib><title>Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data</title><title>Journal of the American Statistical Association</title><addtitle>J Am Stat Assoc</addtitle><description>In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to model undirected graphs underlying heterogeneous multivariate observations with additional assistance from covariates. Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates so that observations whose covariates are similar share a similar undirected graph. To efficiently embed Gaussian graphical models into our proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. For Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior, and for the pseudo-likelihood, a product of Gaussian-conditionals is used. Moreover, the proposed model has large prior support and is flexible to approximate any ν-Hölder conditional variance-covariance matrices with
ν ∈ (0, 1]. We further show that based on the theory of fractional likelihood, the rate of posterior contraction is minimax optimal assuming the true density to be a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset assisted by mRNA gene expression as covariates. Supplementary materials
for this article are available online.</description><subject>Bayesian analysis</subject><subject>Bayesian theory</subject><subject>Breast cancer</subject><subject>breast neoplasms</subject><subject>Covariance matrix</subject><subject>data collection</subject><subject>Datasets</subject><subject>Dependency</subject><subject>Efficacy</subject><subject>G-Wishart prior</subject><subject>Gaussian graphical model</subject><subject>Gene expression</subject><subject>Genomics</subject><subject>Graphical models</subject><subject>Graphs</subject><subject>Homogeneity</subject><subject>Matrices</subject><subject>Normal distribution</subject><subject>Partition</subject><subject>Posterior contraction rate</subject><subject>Product partition model</subject><subject>Pseudo-likelihood</subject><subject>Simulation</subject><issn>0162-1459</issn><issn>1537-274X</issn><issn>1537-274X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNqNkc1uEzEUhS0EoqHwCKCR2LCZ4Ou_8awgDdAiRWIDEjvLGd9JXU3sYE-K8vZ4mrQCFghvvPB3j889h5CXQOdANX1LQTEQsp0zyvicMc4bIR6RGUje1KwR3x-T2cTUE3RGnuV8Q8tptH5KzngraQOUz8hiGW9t8nbEepGzzyO66sIeMHsbqstkd9fVCm0KPmyqPqbqCkdMcYMB4z5XH-xon5MnvR0yvjjd5-Tbp49fl1f16svl5-ViVXdC87G2es1wrfqmk22DlLedElwK4EI4x5mjANCLNXMaW6dA9U6DQl2MctpBw_k5eXfU3e3XW3QdhjHZweyS39p0MNF68-dL8NdmE28NlEgUa1lReHNSSPHHHvNotj53OAz2bhvDp-yUoAz-A2VStEqAKOjrv9CbuE-hRFEoKqVWkk_u5ZHqUsw5Yf9gHKiZCjX3hZqpUHMqtMy9-n3rh6n7Bgvw_gj4UPrZ2p8xDc6M9jDE1CcbOn_n419__AJKxq2i</recordid><startdate>20240702</startdate><enddate>20240702</enddate><creator>Niu, Yabo</creator><creator>Ni, Yang</creator><creator>Pati, Debdeep</creator><creator>Mallick, Bani K.</creator><general>Taylor & Francis</general><general>Taylor & Francis 
Ltd</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8BJ</scope><scope>FQK</scope><scope>JBE</scope><scope>K9.</scope><scope>7X8</scope><scope>7S9</scope><scope>L.6</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-8087-6747</orcidid><orcidid>https://orcid.org/0000-0003-0636-2363</orcidid></search><sort><creationdate>20240702</creationdate><title>Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data</title><author>Niu, Yabo ; Ni, Yang ; Pati, Debdeep ; Mallick, Bani K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c483t-a8b2eb6f7c597e039c643541344dd32d0111f4b2d8e9d616fd816e895030c1733</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Bayesian analysis</topic><topic>Bayesian theory</topic><topic>Breast cancer</topic><topic>breast neoplasms</topic><topic>Covariance matrix</topic><topic>data collection</topic><topic>Datasets</topic><topic>Dependency</topic><topic>Efficacy</topic><topic>G-Wishart prior</topic><topic>Gaussian graphical model</topic><topic>Gene expression</topic><topic>Genomics</topic><topic>Graphical models</topic><topic>Graphs</topic><topic>Homogeneity</topic><topic>Matrices</topic><topic>Normal distribution</topic><topic>Partition</topic><topic>Posterior contraction rate</topic><topic>Product partition model</topic><topic>Pseudo-likelihood</topic><topic>Simulation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Niu, Yabo</creatorcontrib><creatorcontrib>Ni, Yang</creatorcontrib><creatorcontrib>Pati, Debdeep</creatorcontrib><creatorcontrib>Mallick, Bani K.</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social 
Sciences</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>MEDLINE - Academic</collection><collection>AGRICOLA</collection><collection>AGRICOLA - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of the American Statistical Association</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Niu, Yabo</au><au>Ni, Yang</au><au>Pati, Debdeep</au><au>Mallick, Bani K.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data</atitle><jtitle>Journal of the American Statistical Association</jtitle><addtitle>J Am Stat Assoc</addtitle><date>2024-07-02</date><risdate>2024</risdate><volume>119</volume><issue>547</issue><spage>1985</spage><epage>1999</epage><pages>1985-1999</pages><issn>0162-1459</issn><issn>1537-274X</issn><eissn>1537-274X</eissn><abstract>In a traditional Gaussian graphical model, data homogeneity is routinely assumed with no extra variables affecting the conditional independence. In modern genomic datasets, there is an abundance of auxiliary information, which often gets under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to model undirected graphs underlying heterogeneous multivariate observations with additional assistance from covariates. Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates so that observations whose covariates are similar share a similar undirected graph. To efficiently embed Gaussian graphical models into our proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. 
For Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior, and for the pseudo-likelihood, a product of Gaussian-conditionals is used. Moreover, the proposed model has large prior support and is flexible to approximate any ν-Hölder conditional variance-covariance matrices with ν ∈ (0, 1]. We further show that based on the theory of fractional likelihood, the rate of posterior contraction is minimax optimal assuming the true density to be a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset assisted by mRNA gene expression as covariates. Supplementary materials
for this article are available online.</abstract><cop>United States</cop><pub>Taylor & Francis</pub><pmid>39507103</pmid><doi>10.1080/01621459.2023.2233744</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0001-8087-6747</orcidid><orcidid>https://orcid.org/0000-0003-0636-2363</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0162-1459 |
ispartof | Journal of the American Statistical Association, 2024-07, Vol.119 (547), p.1985-1999 |
issn | 0162-1459 1537-274X 1537-274X |
language | eng |
recordid | cdi_pubmed_primary_39507103 |
source | Taylor & Francis:Master (3349 titles) |
subjects | Bayesian analysis ; Bayesian theory ; Breast cancer ; breast neoplasms ; Covariance matrix ; data collection ; Datasets ; Dependency ; Efficacy ; G-Wishart prior ; Gaussian graphical model ; Gene expression ; Genomics ; Graphical models ; Graphs ; Homogeneity ; Matrices ; Normal distribution ; Partition ; Posterior contraction rate ; Product partition model ; Pseudo-likelihood ; Simulation |
title | Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T04%3A08%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Covariate-Assisted%20Bayesian%20Graph%20Learning%20for%20Heterogeneous%20Data&rft.jtitle=Journal%20of%20the%20American%20Statistical%20Association&rft.au=Niu,%20Yabo&rft.date=2024-07-02&rft.volume=119&rft.issue=547&rft.spage=1985&rft.epage=1999&rft.pages=1985-1999&rft.issn=0162-1459&rft.eissn=1537-274X&rft_id=info:doi/10.1080/01621459.2023.2233744&rft_dat=%3Cproquest_pubme%3E3105586533%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3105586533&rft_id=info:pmid/39507103&rfr_iscdi=true |
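
Illustrative note on the abstract: the record's description mentions a pseudo-likelihood built as "a product of Gaussian-conditionals". The sketch below shows what such a product looks like for a single, fixed, zero-mean Gaussian graphical model with precision matrix Ω, using the standard fact that x_j given the remaining variables is Gaussian with mean −Σ_{k≠j} (ω_jk/ω_jj) x_k and variance 1/ω_jj. It is not the authors' implementation: it ignores covariates, the product partition model, and all priors, and the function names `log_pseudo_likelihood` and `edges` are hypothetical, chosen only for this example.

```python
import math


def log_pseudo_likelihood(rows, omega):
    """Log of the product of Gaussian conditionals p(x_j | x_-j).

    Assumes zero-mean data and a symmetric precision matrix `omega`
    with positive diagonal; both are given as nested lists.
    """
    p = len(omega)
    total = 0.0
    for x in rows:
        for j in range(p):
            w_jj = omega[j][j]
            # Conditional mean of x_j given all other coordinates.
            cond_mean = -sum(omega[j][k] * x[k] for k in range(p) if k != j) / w_jj
            # Log density of N(cond_mean, 1 / w_jj) evaluated at x[j].
            total += 0.5 * math.log(w_jj / (2 * math.pi)) - 0.5 * w_jj * (x[j] - cond_mean) ** 2
    return total


def edges(omega, tol=1e-12):
    """Undirected edges implied by nonzero off-diagonal precision entries."""
    p = len(omega)
    return [(j, k) for j in range(p) for k in range(j + 1, p) if abs(omega[j][k]) > tol]


# A three-node chain 0 - 1 - 2: variables 0 and 2 are conditionally
# independent given variable 1 because omega[0][2] is zero.
chain = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
print(edges(chain))  # [(0, 1), (1, 2)]
print(log_pseudo_likelihood([[0.5, -0.2, 0.1]], chain))
```

When Ω is diagonal, the conditionals reduce to the marginals, so the pseudo-likelihood coincides with the exact Gaussian likelihood; with off-diagonal entries the two generally differ.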