POST: A framework for set-based association analysis in high-dimensional data

• Projection onto orthogonal tests is proposed as a new gene-set testing procedure.
• POST is suitable for testing the association of gene-sets with many phenotypes.
• POST uses principal components to reduce dimensionality of gene-set data.
• Orthogonality of components simplifies derivation of a meanin...

Detailed description

Bibliographic details
Published in: Methods (San Diego, Calif.), 2018-08, Vol.145, p.76-81
Main authors: Cao, Xueyuan, George, E. Olusegun, Wang, Mingjuan, Armstrong, Dale B., Cheng, Cheng, Raimondi, Susana, Rubnitz, Jeffrey E., Downing, James R., Kundu, Mondira, Pounds, Stanley B.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 81
container_issue
container_start_page 76
container_title Methods (San Diego, Calif.)
container_volume 145
creator Cao, Xueyuan
George, E. Olusegun
Wang, Mingjuan
Armstrong, Dale B.
Cheng, Cheng
Raimondi, Susana
Rubnitz, Jeffrey E.
Downing, James R.
Kundu, Mondira
Pounds, Stanley B.
description
• Projection onto orthogonal tests is proposed as a new gene-set testing procedure.
• POST is suitable for testing the association of gene-sets with many phenotypes.
• POST uses principal components to reduce dimensionality of gene-set data.
• Orthogonality of components simplifies derivation of a meaningful test statistic.
Evaluating the differential expression of a set of genes belonging to a common biological process or ontology has proven to be a very useful tool for biological discovery. However, existing gene-set association methods are limited to applications that evaluate differential expression across k⩾2 treatment groups or biological categories. This limitation precludes researchers from most effectively evaluating the association with other phenotypes that may be more clinically meaningful, such as quantitative variables or censored survival time variables. Projection onto the Orthogonal Space Testing (POST) is proposed as a general procedure that can robustly evaluate the association of a gene-set with several different types of phenotypic data (categorical, ordinal, continuous, or censored). For each gene-set, POST transforms the gene profiles into a set of eigenvectors and then uses statistical modeling to compute a set of z-statistics that measure the association of each eigenvector with the phenotype. The overall gene-set statistic is the sum of squared z-statistics weighted by the corresponding eigenvalues. Finally, bootstrapping is used to compute a p-value. POST may evaluate associations with or without adjustment for covariates. In simulation studies, it is shown that the performance of POST in evaluating the association with a categorical phenotype is similar to or exceeds that of existing methods. In evaluating the association of 875 biological processes with the time to relapse of pediatric acute myeloid leukemia, POST identified the well-known oncogenic WNT signaling pathway as its top hit. These results indicate that POST can be a very useful tool for evaluating the association of a gene-set with a variety of different phenotypes. We have developed an R package named POST which is freely available in Bioconductor.
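The procedure summarized in the abstract (principal components of the gene-set, one z-statistic per component, an eigenvalue-weighted sum of squared z-statistics, then a resampling p-value) can be illustrated with a minimal Python sketch. This is not the POST R package and not the authors' implementation. Several choices are assumptions made only for illustration: the function names `post_like_statistic` and `permutation_p_value` are hypothetical, the phenotype is taken to be continuous, the z-statistic is a Fisher-transformed Pearson correlation rather than a fitted statistical model, and the p-value comes from a permutation null that is swapped in for the bootstrap the abstract mentions.

```python
import numpy as np


def post_like_statistic(expr, phenotype):
    """Eigenvalue-weighted sum of squared z-statistics for one gene set.

    expr      : (n_samples, n_genes) expression matrix (assumed layout)
    phenotype : (n_samples,) continuous phenotype (assumption of this sketch)
    """
    n = expr.shape[0]
    centered = expr - expr.mean(axis=0)
    # SVD of the centered data yields the principal components; the columns
    # of u are unit-norm, mean-zero sample scores for each component.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s**2 / (n - 1)
    keep = eigenvalues > 1e-10 * eigenvalues.max()  # drop null directions
    u, eigenvalues = u[:, keep], eigenvalues[keep]

    # Pearson correlation of each component with the centered phenotype.
    y = phenotype - phenotype.mean()
    r = (u.T @ y) / np.linalg.norm(y)
    r = np.clip(r, -0.999999, 0.999999)  # keep arctanh finite

    # Fisher z-transform as a stand-in z-statistic (assumed choice).
    z = np.arctanh(r) * np.sqrt(n - 3)
    return float(np.sum(eigenvalues * z**2))


def permutation_p_value(expr, phenotype, n_perm=200, seed=0):
    """Permutation p-value (a substitute for the bootstrap in the abstract)."""
    rng = np.random.default_rng(seed)
    observed = post_like_statistic(expr, phenotype)
    null = np.array([
        post_like_statistic(expr, rng.permutation(phenotype))
        for _ in range(n_perm)
    ])
    # Add-one correction so the estimated p-value is never exactly zero.
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p)


# Simulated data: 10 genes driven by one latent factor that also drives
# the phenotype, so the gene set is truly associated with the phenotype.
rng = np.random.default_rng(1)
latent = rng.normal(size=60)
expr = latent[:, None] * rng.uniform(0.5, 1.5, size=10) + rng.normal(size=(60, 10))
phenotype = latent + 0.3 * rng.normal(size=60)

stat, p_value = permutation_p_value(expr, phenotype)
print(f"statistic = {stat:.2f}, permutation p-value = {p_value:.4f}")
```

Because the components are orthogonal, each z-statistic can be computed separately and combined by a simple weighted sum, which is the simplification the highlights point to. Handling categorical, ordinal, or censored phenotypes and covariate adjustment would require replacing the correlation step with an appropriate regression model, which this sketch does not attempt.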
doi_str_mv 10.1016/j.ymeth.2018.05.011
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2042230322</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1046202317304978</els_id><sourcerecordid>2042230322</sourcerecordid><originalsourceid>FETCH-LOGICAL-c359t-2047be98b6c32c2a54e749fd5aa9ef3170864476abae45dd153d790e83cac0443</originalsourceid><addsrcrecordid>eNp9kE1vFDEMhqMKRNuFX1CpypHLDM7XfFTiUFVQkIqKRDlHnsTTzXY-SjIL2n9PtttyJAfHkl_7tR_GzgSUAkT1YVPuRlrWpQTRlGBKEOKInQhoTdEKBa_2ua4KCVIds9OUNgAgZN28YceyrfMzcMK-fb_9cXfBL3kfcaQ_c3zg_Rx5oqXoMJHnmNLsAi5hnjhOOOxSSDxMfB3u14UPI00pl3DgHhd8y173OCR69_yv2M_Pn-6uvhQ3t9dfry5vCqdMu-SVdN1R23SVU9JJNJpq3fbeILbUK1FDU2ldV9ghaeO9MMrXLVCjHDrQWq3Y-8Pcxzj_2lJa7BiSo2HAieZtstlASgUqhxVTB6mLc0qRevsYw4hxZwXYPUe7sU8c7Z6jBWMzx9x1_myw7Uby_3pewGXBx4OA8pm_A0WbXKDJkQ-R3GL9HP5r8BeuWYRY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2042230322</pqid></control><display><type>article</type><title>POST: A framework for set-based association analysis in high-dimensional data</title><source>MEDLINE</source><source>ScienceDirect Journals (5 years ago - present)</source><creator>Cao, Xueyuan ; George, E. Olusegun ; Wang, Mingjuan ; Armstrong, Dale B. ; Cheng, Cheng ; Raimondi, Susana ; Rubnitz, Jeffrey E. ; Downing, James R. ; Kundu, Mondira ; Pounds, Stanley B.</creator><creatorcontrib>Cao, Xueyuan ; George, E. Olusegun ; Wang, Mingjuan ; Armstrong, Dale B. ; Cheng, Cheng ; Raimondi, Susana ; Rubnitz, Jeffrey E. ; Downing, James R. ; Kundu, Mondira ; Pounds, Stanley B.</creatorcontrib><description>•Projection onto orthogonal tests is proposed as a new gene-set testing procedure.•POST is a suitable for testing the association of gene-sets with many phenotypes.•POST uses principal components to reduce dimensionality of gene-set data.•Orthogonality of components simplifies derivation of a meaningful test statistic. 
Evaluating the differential expression of a set of genes belonging to a common biological process or ontology has proven to be a very useful tool for biological discovery. However, existing gene-set association methods are limited to applications that evaluate differential expression across k⩾2 treatment groups or biological categories. This limitation precludes researchers from most effectively evaluating the association with other phenotypes that may be more clinically meaningful, such as quantitative variables or censored survival time variables. Projection onto the Orthogonal Space Testing (POST) is proposed as a general procedure that can robustly evaluate the association of a gene-set with several different types of phenotypic data (categorical, ordinal, continuous, or censored). For each gene-set, POST transforms the gene profiles into a set of eigenvectors and then uses statistical modeling to compute a set of z-statistics that measure the association of each eigenvector with the phenotype. The overall gene-set statistic is the sum of squared z-statistics weighted by the corresponding eigenvalues. Finally, bootstrapping is used to compute a p-value. POST may evaluate associations with or without adjustment for covariates. In simulation studies, it is shown that the performance of POST in evaluating the association with a categorical phenotype is similar to or exceeds that of existing methods. In evaluating the association of 875 biological processes with the time to relapse of pediatric acute myeloid leukemia, POST identified the well-known oncogenic WNT signaling pathway as its top hit. These results indicate that POST can be a very useful tool for evaluating the association of a gene-set with a variety of different phenotypes. 
We have developed an R package named POST which is freely available in Bioconductor.</description><identifier>ISSN: 1046-2023</identifier><identifier>EISSN: 1095-9130</identifier><identifier>DOI: 10.1016/j.ymeth.2018.05.011</identifier><identifier>PMID: 29777750</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Child ; Data integration ; Gene Expression Profiling - methods ; Gene Expression Regulation, Neoplastic ; Gene network ; Gene profiling ; Humans ; Leukemia, Myeloid, Acute - genetics ; Models, Statistical ; Orthogonal projection ; Software</subject><ispartof>Methods (San Diego, Calif.), 2018-08, Vol.145, p.76-81</ispartof><rights>2018 Elsevier Inc.</rights><rights>Copyright © 2018 Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c359t-2047be98b6c32c2a54e749fd5aa9ef3170864476abae45dd153d790e83cac0443</citedby><cites>FETCH-LOGICAL-c359t-2047be98b6c32c2a54e749fd5aa9ef3170864476abae45dd153d790e83cac0443</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.ymeth.2018.05.011$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3548,27923,27924,45994</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29777750$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Cao, Xueyuan</creatorcontrib><creatorcontrib>George, E. 
Olusegun</creatorcontrib><creatorcontrib>Wang, Mingjuan</creatorcontrib><creatorcontrib>Armstrong, Dale B.</creatorcontrib><creatorcontrib>Cheng, Cheng</creatorcontrib><creatorcontrib>Raimondi, Susana</creatorcontrib><creatorcontrib>Rubnitz, Jeffrey E.</creatorcontrib><creatorcontrib>Downing, James R.</creatorcontrib><creatorcontrib>Kundu, Mondira</creatorcontrib><creatorcontrib>Pounds, Stanley B.</creatorcontrib><title>POST: A framework for set-based association analysis in high-dimensional data</title><title>Methods (San Diego, Calif.)</title><addtitle>Methods</addtitle><description>•Projection onto orthogonal tests is proposed as a new gene-set testing procedure.•POST is a suitable for testing the association of gene-sets with many phenotypes.•POST uses principal components to reduce dimensionality of gene-set data.•Orthogonality of components simplifies derivation of a meaningful test statistic. Evaluating the differential expression of a set of genes belonging to a common biological process or ontology has proven to be a very useful tool for biological discovery. However, existing gene-set association methods are limited to applications that evaluate differential expression across k⩾2 treatment groups or biological categories. This limitation precludes researchers from most effectively evaluating the association with other phenotypes that may be more clinically meaningful, such as quantitative variables or censored survival time variables. Projection onto the Orthogonal Space Testing (POST) is proposed as a general procedure that can robustly evaluate the association of a gene-set with several different types of phenotypic data (categorical, ordinal, continuous, or censored). For each gene-set, POST transforms the gene profiles into a set of eigenvectors and then uses statistical modeling to compute a set of z-statistics that measure the association of each eigenvector with the phenotype. 
The overall gene-set statistic is the sum of squared z-statistics weighted by the corresponding eigenvalues. Finally, bootstrapping is used to compute a p-value. POST may evaluate associations with or without adjustment for covariates. In simulation studies, it is shown that the performance of POST in evaluating the association with a categorical phenotype is similar to or exceeds that of existing methods. In evaluating the association of 875 biological processes with the time to relapse of pediatric acute myeloid leukemia, POST identified the well-known oncogenic WNT signaling pathway as its top hit. These results indicate that POST can be a very useful tool for evaluating the association of a gene-set with a variety of different phenotypes. We have developed an R package named POST which is freely available in Bioconductor.</description><subject>Child</subject><subject>Data integration</subject><subject>Gene Expression Profiling - methods</subject><subject>Gene Expression Regulation, Neoplastic</subject><subject>Gene network</subject><subject>Gene profiling</subject><subject>Humans</subject><subject>Leukemia, Myeloid, Acute - genetics</subject><subject>Models, Statistical</subject><subject>Orthogonal projection</subject><subject>Software</subject><issn>1046-2023</issn><issn>1095-9130</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kE1vFDEMhqMKRNuFX1CpypHLDM7XfFTiUFVQkIqKRDlHnsTTzXY-SjIL2n9PtttyJAfHkl_7tR_GzgSUAkT1YVPuRlrWpQTRlGBKEOKInQhoTdEKBa_2ua4KCVIds9OUNgAgZN28YceyrfMzcMK-fb_9cXfBL3kfcaQ_c3zg_Rx5oqXoMJHnmNLsAi5hnjhOOOxSSDxMfB3u14UPI00pl3DgHhd8y173OCR69_yv2M_Pn-6uvhQ3t9dfry5vCqdMu-SVdN1R23SVU9JJNJpq3fbeILbUK1FDU2ldV9ghaeO9MMrXLVCjHDrQWq3Y-8Pcxzj_2lJa7BiSo2HAieZtstlASgUqhxVTB6mLc0qRevsYw4hxZwXYPUe7sU8c7Z6jBWMzx9x1_myw7Uby_3pewGXBx4OA8pm_A0WbXKDJkQ-R3GL9HP5r8BeuWYRY</recordid><startdate>20180801</startdate><enddate>20180801</enddate><creator>Cao, 
Xueyuan</creator><creator>George, E. Olusegun</creator><creator>Wang, Mingjuan</creator><creator>Armstrong, Dale B.</creator><creator>Cheng, Cheng</creator><creator>Raimondi, Susana</creator><creator>Rubnitz, Jeffrey E.</creator><creator>Downing, James R.</creator><creator>Kundu, Mondira</creator><creator>Pounds, Stanley B.</creator><general>Elsevier Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20180801</creationdate><title>POST: A framework for set-based association analysis in high-dimensional data</title><author>Cao, Xueyuan ; George, E. Olusegun ; Wang, Mingjuan ; Armstrong, Dale B. ; Cheng, Cheng ; Raimondi, Susana ; Rubnitz, Jeffrey E. ; Downing, James R. ; Kundu, Mondira ; Pounds, Stanley B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c359t-2047be98b6c32c2a54e749fd5aa9ef3170864476abae45dd153d790e83cac0443</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Child</topic><topic>Data integration</topic><topic>Gene Expression Profiling - methods</topic><topic>Gene Expression Regulation, Neoplastic</topic><topic>Gene network</topic><topic>Gene profiling</topic><topic>Humans</topic><topic>Leukemia, Myeloid, Acute - genetics</topic><topic>Models, Statistical</topic><topic>Orthogonal projection</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cao, Xueyuan</creatorcontrib><creatorcontrib>George, E. 
Olusegun</creatorcontrib><creatorcontrib>Wang, Mingjuan</creatorcontrib><creatorcontrib>Armstrong, Dale B.</creatorcontrib><creatorcontrib>Cheng, Cheng</creatorcontrib><creatorcontrib>Raimondi, Susana</creatorcontrib><creatorcontrib>Rubnitz, Jeffrey E.</creatorcontrib><creatorcontrib>Downing, James R.</creatorcontrib><creatorcontrib>Kundu, Mondira</creatorcontrib><creatorcontrib>Pounds, Stanley B.</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Methods (San Diego, Calif.)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cao, Xueyuan</au><au>George, E. Olusegun</au><au>Wang, Mingjuan</au><au>Armstrong, Dale B.</au><au>Cheng, Cheng</au><au>Raimondi, Susana</au><au>Rubnitz, Jeffrey E.</au><au>Downing, James R.</au><au>Kundu, Mondira</au><au>Pounds, Stanley B.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>POST: A framework for set-based association analysis in high-dimensional data</atitle><jtitle>Methods (San Diego, Calif.)</jtitle><addtitle>Methods</addtitle><date>2018-08-01</date><risdate>2018</risdate><volume>145</volume><spage>76</spage><epage>81</epage><pages>76-81</pages><issn>1046-2023</issn><eissn>1095-9130</eissn><abstract>•Projection onto orthogonal tests is proposed as a new gene-set testing procedure.•POST is a suitable for testing the association of gene-sets with many phenotypes.•POST uses principal components to reduce dimensionality of gene-set data.•Orthogonality of components simplifies derivation of a meaningful test statistic. 
Evaluating the differential expression of a set of genes belonging to a common biological process or ontology has proven to be a very useful tool for biological discovery. However, existing gene-set association methods are limited to applications that evaluate differential expression across k⩾2 treatment groups or biological categories. This limitation precludes researchers from most effectively evaluating the association with other phenotypes that may be more clinically meaningful, such as quantitative variables or censored survival time variables. Projection onto the Orthogonal Space Testing (POST) is proposed as a general procedure that can robustly evaluate the association of a gene-set with several different types of phenotypic data (categorical, ordinal, continuous, or censored). For each gene-set, POST transforms the gene profiles into a set of eigenvectors and then uses statistical modeling to compute a set of z-statistics that measure the association of each eigenvector with the phenotype. The overall gene-set statistic is the sum of squared z-statistics weighted by the corresponding eigenvalues. Finally, bootstrapping is used to compute a p-value. POST may evaluate associations with or without adjustment for covariates. In simulation studies, it is shown that the performance of POST in evaluating the association with a categorical phenotype is similar to or exceeds that of existing methods. In evaluating the association of 875 biological processes with the time to relapse of pediatric acute myeloid leukemia, POST identified the well-known oncogenic WNT signaling pathway as its top hit. These results indicate that POST can be a very useful tool for evaluating the association of a gene-set with a variety of different phenotypes. 
We have developed an R package named POST which is freely available in Bioconductor.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>29777750</pmid><doi>10.1016/j.ymeth.2018.05.011</doi><tpages>6</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1046-2023
ispartof Methods (San Diego, Calif.), 2018-08, Vol.145, p.76-81
issn 1046-2023
1095-9130
language eng
recordid cdi_proquest_miscellaneous_2042230322
source MEDLINE; ScienceDirect Journals (5 years ago - present)
subjects Child
Data integration
Gene Expression Profiling - methods
Gene Expression Regulation, Neoplastic
Gene network
Gene profiling
Humans
Leukemia, Myeloid, Acute - genetics
Models, Statistical
Orthogonal projection
Software
title POST: A framework for set-based association analysis in high-dimensional data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T00%3A54%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=POST:%20A%20framework%20for%20set-based%20association%20analysis%20in%20high-dimensional%20data&rft.jtitle=Methods%20(San%20Diego,%20Calif.)&rft.au=Cao,%20Xueyuan&rft.date=2018-08-01&rft.volume=145&rft.spage=76&rft.epage=81&rft.pages=76-81&rft.issn=1046-2023&rft.eissn=1095-9130&rft_id=info:doi/10.1016/j.ymeth.2018.05.011&rft_dat=%3Cproquest_cross%3E2042230322%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2042230322&rft_id=info:pmid/29777750&rft_els_id=S1046202317304978&rfr_iscdi=true