Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma

RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, a prognostic 9-gene pair signature (GPS) was applied to 197 patients...
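
The abstract does not spell out how a gene pair signature assigns risk. As a minimal sketch, assume that each gene pair casts a high-risk vote when the first gene is expressed above the second within the same sample, and that a patient is labelled high-risk once a vote threshold is met. Both the voting rule and the threshold are assumptions for illustration, not details taken from the paper, and all gene names, patient IDs and values below are hypothetical. Under those assumptions, the share of patients whose label differs between two data versions could be computed like this:

```python
from typing import Dict, List, Tuple

GenePair = Tuple[str, str]
Expression = Dict[str, float]  # gene name -> expression estimate for one patient


def classify(expr: Expression, pairs: List[GenePair], min_votes: int) -> str:
    """Label one patient by within-sample gene-pair voting (assumed rule)."""
    # A pair votes "high risk" when its first gene exceeds its second gene.
    votes = sum(1 for gene_a, gene_b in pairs if expr[gene_a] > expr[gene_b])
    return "high" if votes >= min_votes else "low"


def discordance_rate(
    old: Dict[str, Expression],
    new: Dict[str, Expression],
    pairs: List[GenePair],
    min_votes: int,
) -> float:
    """Fraction of shared patients whose risk label differs between versions."""
    shared = sorted(set(old) & set(new))
    if not shared:
        raise ValueError("the two versions have no patients in common")
    flipped = sum(
        1
        for pid in shared
        if classify(old[pid], pairs, min_votes) != classify(new[pid], pairs, min_votes)
    )
    return flipped / len(shared)


# Hypothetical toy data: three gene pairs, two patients, two data versions.
PAIRS: List[GenePair] = [("G1", "G2"), ("G3", "G4"), ("G5", "G6")]
OLD = {
    "P1": {"G1": 5.0, "G2": 1.0, "G3": 4.0, "G4": 2.0, "G5": 1.0, "G6": 3.0},
    "P2": {"G1": 1.0, "G2": 2.0, "G3": 1.0, "G4": 2.0, "G5": 1.0, "G6": 2.0},
}
NEW = {
    # Only G3 changes for P1, as if its annotation had been revised.
    "P1": {"G1": 5.0, "G2": 1.0, "G3": 1.5, "G4": 2.0, "G5": 1.0, "G6": 3.0},
    "P2": {"G1": 1.0, "G2": 2.0, "G3": 1.0, "G4": 2.0, "G5": 1.0, "G6": 2.0},
}

if __name__ == "__main__":
    print(discordance_rate(OLD, NEW, PAIRS, min_votes=2))
```

In this toy case a single revised estimate flips one pair for patient P1 and moves that patient from high to low risk. That is the kind of version-dependent label change the abstract reports, though the real signatures and thresholds may differ.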

Bibliographic Details
Published in: Briefings in bioinformatics 2021-05, Vol.22 (3)
Main Authors: Zhang, Zheyang, Zhang, Sainan, Li, Xin, Zhao, Zhangxiang, Chen, Changjing, Zhang, Juxuan, Li, Mengyue, Wei, Zixin, Jiang, Wenbin, Pan, Bo, Li, Ying, Liu, Yixin, Cao, Yingyue, Zhao, Wenyuan, Gu, Yunyan, Yu, Yan, Meng, Qingwei, Qi, Lishuang
Format: Article
Language: eng
Online Access: Full text
container_end_page
container_issue 3
container_start_page
container_title Briefings in bioinformatics
container_volume 22
creator Zhang, Zheyang
Zhang, Sainan
Li, Xin
Zhao, Zhangxiang
Chen, Changjing
Zhang, Juxuan
Li, Mengyue
Wei, Zixin
Jiang, Wenbin
Pan, Bo
Li, Ying
Liu, Yixin
Cao, Yingyue
Zhao, Wenyuan
Gu, Yunyan
Yu, Yan
Meng, Qingwei
Qi, Lishuang
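description RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, a prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P &lt; 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.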
doi_str_mv 10.1093/bib/bbaa081
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2400548192</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2400548192</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</originalsourceid><addsrcrecordid>eNo9UU1r3DAQFaEl36fci46FsI1kyZacWwlpGggUSnM2I2lsFLzSVpIh-4_yMyOTbQ_DzDzePGbmEXLF2TfOenFjvLkxBoBpfkROuVRqI1krP611pzat7MQJOcv5hbGGKc2PyYlohBZStqfk7TeOmDBYpBOGuEUKwdUIsUDxMdBl56BgpjOCoyVSG0NJ4LwtMe3pLsUpxFy8rSWuaJ3J1IdVDSm-VjTnVSf7KUBZantLgVrISHNZ3J7GkVYQbUFXEZiQPtJ5CRMFVxeykKyve8EF-TzCnPHykM_J84_7P3c_N0-_Hh7vvj9tbKP7Uq_tjWqN48r2zLJOtqMChgqbsRNdJ7VuoJVMCyc6hUoJga0G3kvTMYnCiHPy9UO3nvZ3wVyGrc8W5xkCxiUPjWT1uZr3TaVef1BtijknHIdd8ltI-4GzYbVmqNYMB2sq-8tBeDFbdP-5_7wQ7xqljmI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2400548192</pqid></control><display><type>article</type><title>Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Business Source Complete</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Zhang, Zheyang ; Zhang, Sainan ; Li, Xin ; Zhao, Zhangxiang ; Chen, Changjing ; Zhang, Juxuan ; Li, Mengyue ; Wei, Zixin ; Jiang, Wenbin ; Pan, Bo ; Li, Ying ; Liu, Yixin ; Cao, Yingyue ; Zhao, Wenyuan ; Gu, Yunyan ; Yu, Yan ; Meng, Qingwei ; Qi, Lishuang</creator><creatorcontrib>Zhang, Zheyang ; Zhang, Sainan ; Li, Xin ; Zhao, Zhangxiang ; Chen, Changjing ; Zhang, Juxuan ; Li, Mengyue ; Wei, Zixin ; Jiang, Wenbin ; Pan, Bo ; Li, Ying ; Liu, Yixin ; Cao, Yingyue ; Zhao, Wenyuan ; Gu, Yunyan ; Yu, Yan ; Meng, Qingwei ; Qi, Lishuang</creatorcontrib><description>RNA-sequencing 
enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P &lt; 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbaa081</identifier><identifier>PMID: 32383445</identifier><language>eng</language><publisher>England</publisher><ispartof>Briefings in bioinformatics, 2021-05, Vol.22 (3)</ispartof><rights>The Author(s) 2020. Published by Oxford University Press. All rights reserved. 
For Permissions, please email: journals.permissions@oup.com.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</citedby><cites>FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</cites><orcidid>0000-0002-2991-6544 ; 0000-0001-5693-4126</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27922,27923</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32383445$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Zheyang</creatorcontrib><creatorcontrib>Zhang, Sainan</creatorcontrib><creatorcontrib>Li, Xin</creatorcontrib><creatorcontrib>Zhao, Zhangxiang</creatorcontrib><creatorcontrib>Chen, Changjing</creatorcontrib><creatorcontrib>Zhang, Juxuan</creatorcontrib><creatorcontrib>Li, Mengyue</creatorcontrib><creatorcontrib>Wei, Zixin</creatorcontrib><creatorcontrib>Jiang, Wenbin</creatorcontrib><creatorcontrib>Pan, Bo</creatorcontrib><creatorcontrib>Li, Ying</creatorcontrib><creatorcontrib>Liu, Yixin</creatorcontrib><creatorcontrib>Cao, Yingyue</creatorcontrib><creatorcontrib>Zhao, Wenyuan</creatorcontrib><creatorcontrib>Gu, Yunyan</creatorcontrib><creatorcontrib>Yu, Yan</creatorcontrib><creatorcontrib>Meng, Qingwei</creatorcontrib><creatorcontrib>Qi, Lishuang</creatorcontrib><title>Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>RNA-sequencing enables accurate and low-cost transcriptome-wide detection. 
However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P &lt; 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 
40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.</description><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNo9UU1r3DAQFaEl36fci46FsI1kyZacWwlpGggUSnM2I2lsFLzSVpIh-4_yMyOTbQ_DzDzePGbmEXLF2TfOenFjvLkxBoBpfkROuVRqI1krP611pzat7MQJOcv5hbGGKc2PyYlohBZStqfk7TeOmDBYpBOGuEUKwdUIsUDxMdBl56BgpjOCoyVSG0NJ4LwtMe3pLsUpxFy8rSWuaJ3J1IdVDSm-VjTnVSf7KUBZantLgVrISHNZ3J7GkVYQbUFXEZiQPtJ5CRMFVxeykKyve8EF-TzCnPHykM_J84_7P3c_N0-_Hh7vvj9tbKP7Uq_tjWqN48r2zLJOtqMChgqbsRNdJ7VuoJVMCyc6hUoJga0G3kvTMYnCiHPy9UO3nvZ3wVyGrc8W5xkCxiUPjWT1uZr3TaVef1BtijknHIdd8ltI-4GzYbVmqNYMB2sq-8tBeDFbdP-5_7wQ7xqljmI</recordid><startdate>20210520</startdate><enddate>20210520</enddate><creator>Zhang, Zheyang</creator><creator>Zhang, Sainan</creator><creator>Li, Xin</creator><creator>Zhao, Zhangxiang</creator><creator>Chen, Changjing</creator><creator>Zhang, Juxuan</creator><creator>Li, Mengyue</creator><creator>Wei, Zixin</creator><creator>Jiang, Wenbin</creator><creator>Pan, Bo</creator><creator>Li, Ying</creator><creator>Liu, Yixin</creator><creator>Cao, Yingyue</creator><creator>Zhao, Wenyuan</creator><creator>Gu, Yunyan</creator><creator>Yu, Yan</creator><creator>Meng, Qingwei</creator><creator>Qi, Lishuang</creator><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-2991-6544</orcidid><orcidid>https://orcid.org/0000-0001-5693-4126</orcidid></search><sort><creationdate>20210520</creationdate><title>Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</title><author>Zhang, Zheyang ; Zhang, Sainan ; Li, Xin ; Zhao, Zhangxiang ; Chen, Changjing ; Zhang, Juxuan ; Li, Mengyue ; Wei, Zixin ; Jiang, Wenbin ; Pan, Bo ; Li, Ying ; Liu, Yixin ; Cao, 
Yingyue ; Zhao, Wenyuan ; Gu, Yunyan ; Yu, Yan ; Meng, Qingwei ; Qi, Lishuang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Zheyang</creatorcontrib><creatorcontrib>Zhang, Sainan</creatorcontrib><creatorcontrib>Li, Xin</creatorcontrib><creatorcontrib>Zhao, Zhangxiang</creatorcontrib><creatorcontrib>Chen, Changjing</creatorcontrib><creatorcontrib>Zhang, Juxuan</creatorcontrib><creatorcontrib>Li, Mengyue</creatorcontrib><creatorcontrib>Wei, Zixin</creatorcontrib><creatorcontrib>Jiang, Wenbin</creatorcontrib><creatorcontrib>Pan, Bo</creatorcontrib><creatorcontrib>Li, Ying</creatorcontrib><creatorcontrib>Liu, Yixin</creatorcontrib><creatorcontrib>Cao, Yingyue</creatorcontrib><creatorcontrib>Zhao, Wenyuan</creatorcontrib><creatorcontrib>Gu, Yunyan</creatorcontrib><creatorcontrib>Yu, Yan</creatorcontrib><creatorcontrib>Meng, Qingwei</creatorcontrib><creatorcontrib>Qi, Lishuang</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Zheyang</au><au>Zhang, Sainan</au><au>Li, Xin</au><au>Zhao, Zhangxiang</au><au>Chen, Changjing</au><au>Zhang, Juxuan</au><au>Li, Mengyue</au><au>Wei, Zixin</au><au>Jiang, Wenbin</au><au>Pan, Bo</au><au>Li, Ying</au><au>Liu, Yixin</au><au>Cao, Yingyue</au><au>Zhao, Wenyuan</au><au>Gu, Yunyan</au><au>Yu, Yan</au><au>Meng, Qingwei</au><au>Qi, Lishuang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Reference genome and annotation updates lead to contradictory 
prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2021-05-20</date><risdate>2021</risdate><volume>22</volume><issue>3</issue><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P &lt; 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 
40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.</abstract><cop>England</cop><pmid>32383445</pmid><doi>10.1093/bib/bbaa081</doi><orcidid>https://orcid.org/0000-0002-2991-6544</orcidid><orcidid>https://orcid.org/0000-0001-5693-4126</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1467-5463
ispartof Briefings in bioinformatics, 2021-05, Vol.22 (3)
issn 1467-5463
1477-4054
language eng
recordid cdi_proquest_miscellaneous_2400548192
source Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Business Source Complete; Oxford Journals Open Access Collection; PubMed Central
title Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T22%3A31%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reference%20genome%20and%20annotation%20updates%20lead%20to%20contradictory%20prognostic%20predictions%20in%20gene%20expression%20signatures:%20a%20case%20study%20of%20resected%20stage%20I%20lung%20adenocarcinoma&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Zhang,%20Zheyang&rft.date=2021-05-20&rft.volume=22&rft.issue=3&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbaa081&rft_dat=%3Cproquest_cross%3E2400548192%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2400548192&rft_id=info:pmid/32383445&rfr_iscdi=true