Multi-omics facilitated variable selection in Cox-regression model for cancer prognosis prediction

Bibliographic details
Published in: Methods (San Diego, Calif.), 2017-07, Vol.124, p.100-107
Main authors: Liu, Cong; Wang, Xujun; Genchev, Georgi Z.; Lu, Hui
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 107
container_issue
container_start_page 100
container_title Methods (San Diego, Calif.)
container_volume 124
creator Liu, Cong
Wang, Xujun
Genchev, Georgi Z.
Lu, Hui
description • A conventional LASSO-Cox model can predict cancer patients’ survival by concatenating multi-omics data.
• SKI-Cox borrows the information generated by additional types of omics data to guide variable selection.
• wLASSO-Cox uses a penalty factor to take the information derived from other omics data into account.
• Both methods use only mRNA-expression data for prediction.
New developments in high-throughput genomic technologies have enabled the measurement of diverse types of omics biomarkers in a cost-efficient and clinically feasible manner. Developing computational methods and tools for analyzing such genomic data and translating it into clinically relevant information is an active area of investigation. For example, several studies have used an unsupervised learning framework to cluster patients by integrating omics data. Despite these advances, predicting cancer prognosis from integrated omics biomarkers remains a challenge, and there is a shortage of computational tools that do so with supervised learning methods. The current standard approach is to fit a Cox regression model to the different types of omics data concatenated in a linear manner, optionally adding a penalty for feature selection. A more powerful approach, however, would be to incorporate the data by considering the relationships among omics data types. Here we develop two methods, SKI-Cox and wLASSO-Cox, that incorporate the association among different types of omics data. Both methods fit the Cox proportional hazards model and predict a risk score based on mRNA expression profiles. SKI-Cox borrows the information generated by the additional types of omics data to guide variable selection, while wLASSO-Cox incorporates this information as a penalty factor during model fitting. We show that SKI-Cox and wLASSO-Cox models select more true variables than a LASSO-Cox model in simulation studies. We assess the performance of SKI-Cox and wLASSO-Cox using TCGA glioblastoma multiforme and lung adenocarcinoma data. In each case, mRNA expression, methylation, and copy number variation data are integrated to predict the overall survival time of cancer patients. Our methods achieve better performance in predicting patients’ survival in glioblastoma and lung adenocarcinoma.
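The idea of a per-variable penalty factor in a LASSO-penalized Cox model, as the description attributes to wLASSO-Cox, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the penalty factors are derived from methylation or copy number data, so here `penalty_factor` is simply an input array, and the function names, the proximal-gradient solver, and the default values of `lam`, `step`, and `n_iter` are all assumptions made for illustration. The sketch also assumes there are no tied survival times.

```python
import numpy as np


def cox_gradient(beta, X, time, event):
    """Gradient of the average negative log partial likelihood.

    Assumes no tied survival times (illustrative simplification).
    """
    order = np.argsort(-time)            # descending time: risk set of row i is rows 0..i
    Xs, es = X[order], event[order]
    eta = Xs @ beta
    w = np.exp(eta - eta.max())          # shift for numerical stability; ratios are unchanged
    cum_w = np.cumsum(w)                 # sum of exp(eta) over each risk set
    cum_wx = np.cumsum(w[:, None] * Xs, axis=0)
    risk_mean = cum_wx / cum_w[:, None]  # weighted mean covariate vector in each risk set
    return -((Xs - risk_mean) * es[:, None]).sum(axis=0) / len(time)


def soft_threshold(z, thr):
    """Proximal operator of the (weighted) L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)


def weighted_lasso_cox(X, time, event, penalty_factor, lam=0.1, step=0.1, n_iter=500):
    """Hypothetical weighted-LASSO Cox fit by proximal gradient descent.

    penalty_factor[j] scales the L1 penalty on coefficient j: a small value
    makes variable j easier to select, a large value makes it harder.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = cox_gradient(beta, X, time, event)
        beta = soft_threshold(beta - step * grad, step * lam * penalty_factor)
    return beta
```

In this sketch, a gene whose expression is supported by other omics evidence would be given a smaller entry in `penalty_factor` and would therefore be shrunk less. The risk score for a patient is then the linear predictor `X @ beta` computed from mRNA expression alone, in line with the statement that both methods use only mRNA-expression data for prediction.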
doi_str_mv 10.1016/j.ymeth.2017.06.010
format Article
identifier EISSN: 1095-9130
DOI: 10.1016/j.ymeth.2017.06.010
PMID: 28627406
publisher United States: Elsevier Inc
rights Copyright © 2017. Published by Elsevier Inc.
creationdate 2017-07-15
tpages 8
toplevel peer_reviewed
online_resources
oa free_for_read
linktohtml https://www.sciencedirect.com/science/article/pii/S104620231730066X
backlink https://www.ncbi.nlm.nih.gov/pubmed/28627406
fulltext fulltext
identifier ISSN: 1046-2023
ispartof Methods (San Diego, Calif.), 2017-07, Vol.124, p.100-107
issn 1046-2023
1095-9130
language eng
recordid cdi_proquest_miscellaneous_1911207650
source MEDLINE; Elsevier ScienceDirect Journals
subjects Adenocarcinoma - diagnosis
Adenocarcinoma - genetics
Adenocarcinoma - mortality
Adenocarcinoma - pathology
Adenocarcinoma of Lung
Algorithms
Breast Neoplasms - diagnosis
Breast Neoplasms - genetics
Breast Neoplasms - mortality
Breast Neoplasms - pathology
Cancer prognosis prediction
Cox regression
DNA Copy Number Variations
Female
Gene Expression Profiling
Gene Expression Regulation, Neoplastic
Genomics - methods
Genomics - statistics & numerical data
Glioblastoma - diagnosis
Glioblastoma - genetics
Glioblastoma - mortality
Glioblastoma - pathology
Humans
Lung Neoplasms - diagnosis
Lung Neoplasms - genetics
Lung Neoplasms - mortality
Lung Neoplasms - pathology
Multi-omics
Prognosis
Proportional Hazards Models
RNA, Messenger - genetics
RNA, Messenger - metabolism
Variable selection
title Multi-omics facilitated variable selection in Cox-regression model for cancer prognosis prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T15%3A47%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-omics%20facilitated%20variable%20selection%20in%20Cox-regression%20model%20for%20cancer%20prognosis%20prediction&rft.jtitle=Methods%20(San%20Diego,%20Calif.)&rft.au=Liu,%20Cong&rft.date=2017-07-15&rft.volume=124&rft.spage=100&rft.epage=107&rft.pages=100-107&rft.issn=1046-2023&rft.eissn=1095-9130&rft_id=info:doi/10.1016/j.ymeth.2017.06.010&rft_dat=%3Cproquest_cross%3E1911207650%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1911207650&rft_id=info:pmid/28627406&rft_els_id=S104620231730066X&rfr_iscdi=true