Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma
RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients...
Published in: | Briefings in bioinformatics 2021-05, Vol.22 (3) |
---|---|
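Main authors: | Zhang, Zheyang; Zhang, Sainan; Li, Xin; Zhao, Zhangxiang; Chen, Changjing; Zhang, Juxuan; Li, Mengyue; Wei, Zixin; Jiang, Wenbin; Pan, Bo; Li, Ying; Liu, Yixin; Cao, Yingyue; Zhao, Wenyuan; Gu, Yunyan; Yu, Yan; Meng, Qingwei; Qi, Lishuang |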
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 3 |
container_start_page | |
container_title | Briefings in bioinformatics |
container_volume | 22 |
creator | Zhang, Zheyang; Zhang, Sainan; Li, Xin; Zhao, Zhangxiang; Chen, Changjing; Zhang, Juxuan; Li, Mengyue; Wei, Zixin; Jiang, Wenbin; Pan, Bo; Li, Ying; Liu, Yixin; Cao, Yingyue; Zhao, Wenyuan; Gu, Yunyan; Yu, Yan; Meng, Qingwei; Qi, Lishuang |
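description | RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P < 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability. |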
doi_str_mv | 10.1093/bib/bbaa081 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2400548192</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2400548192</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</originalsourceid><addsrcrecordid>eNo9UU1r3DAQFaEl36fci46FsI1kyZacWwlpGggUSnM2I2lsFLzSVpIh-4_yMyOTbQ_DzDzePGbmEXLF2TfOenFjvLkxBoBpfkROuVRqI1krP611pzat7MQJOcv5hbGGKc2PyYlohBZStqfk7TeOmDBYpBOGuEUKwdUIsUDxMdBl56BgpjOCoyVSG0NJ4LwtMe3pLsUpxFy8rSWuaJ3J1IdVDSm-VjTnVSf7KUBZantLgVrISHNZ3J7GkVYQbUFXEZiQPtJ5CRMFVxeykKyve8EF-TzCnPHykM_J84_7P3c_N0-_Hh7vvj9tbKP7Uq_tjWqN48r2zLJOtqMChgqbsRNdJ7VuoJVMCyc6hUoJga0G3kvTMYnCiHPy9UO3nvZ3wVyGrc8W5xkCxiUPjWT1uZr3TaVef1BtijknHIdd8ltI-4GzYbVmqNYMB2sq-8tBeDFbdP-5_7wQ7xqljmI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2400548192</pqid></control><display><type>article</type><title>Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Business Source Complete</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Zhang, Zheyang ; Zhang, Sainan ; Li, Xin ; Zhao, Zhangxiang ; Chen, Changjing ; Zhang, Juxuan ; Li, Mengyue ; Wei, Zixin ; Jiang, Wenbin ; Pan, Bo ; Li, Ying ; Liu, Yixin ; Cao, Yingyue ; Zhao, Wenyuan ; Gu, Yunyan ; Yu, Yan ; Meng, Qingwei ; Qi, Lishuang</creator><creatorcontrib>Zhang, Zheyang ; Zhang, Sainan ; Li, Xin ; Zhao, Zhangxiang ; Chen, Changjing ; Zhang, Juxuan ; Li, Mengyue ; Wei, Zixin ; Jiang, Wenbin ; Pan, Bo ; Li, Ying ; Liu, Yixin ; Cao, Yingyue ; Zhao, Wenyuan ; Gu, Yunyan ; Yu, Yan ; Meng, Qingwei ; Qi, Lishuang</creatorcontrib><description>RNA-sequencing 
enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P < 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbaa081</identifier><identifier>PMID: 32383445</identifier><language>eng</language><publisher>England</publisher><ispartof>Briefings in bioinformatics, 2021-05, Vol.22 (3)</ispartof><rights>The Author(s) 2020. Published by Oxford University Press. All rights reserved. 
For Permissions, please email: journals.permissions@oup.com.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</citedby><cites>FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</cites><orcidid>0000-0002-2991-6544 ; 0000-0001-5693-4126</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27922,27923</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32383445$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Zheyang</creatorcontrib><creatorcontrib>Zhang, Sainan</creatorcontrib><creatorcontrib>Li, Xin</creatorcontrib><creatorcontrib>Zhao, Zhangxiang</creatorcontrib><creatorcontrib>Chen, Changjing</creatorcontrib><creatorcontrib>Zhang, Juxuan</creatorcontrib><creatorcontrib>Li, Mengyue</creatorcontrib><creatorcontrib>Wei, Zixin</creatorcontrib><creatorcontrib>Jiang, Wenbin</creatorcontrib><creatorcontrib>Pan, Bo</creatorcontrib><creatorcontrib>Li, Ying</creatorcontrib><creatorcontrib>Liu, Yixin</creatorcontrib><creatorcontrib>Cao, Yingyue</creatorcontrib><creatorcontrib>Zhao, Wenyuan</creatorcontrib><creatorcontrib>Gu, Yunyan</creatorcontrib><creatorcontrib>Yu, Yan</creatorcontrib><creatorcontrib>Meng, Qingwei</creatorcontrib><creatorcontrib>Qi, Lishuang</creatorcontrib><title>Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>RNA-sequencing enables accurate and low-cost transcriptome-wide detection. 
However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P < 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 
40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.</description><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNo9UU1r3DAQFaEl36fci46FsI1kyZacWwlpGggUSnM2I2lsFLzSVpIh-4_yMyOTbQ_DzDzePGbmEXLF2TfOenFjvLkxBoBpfkROuVRqI1krP611pzat7MQJOcv5hbGGKc2PyYlohBZStqfk7TeOmDBYpBOGuEUKwdUIsUDxMdBl56BgpjOCoyVSG0NJ4LwtMe3pLsUpxFy8rSWuaJ3J1IdVDSm-VjTnVSf7KUBZantLgVrISHNZ3J7GkVYQbUFXEZiQPtJ5CRMFVxeykKyve8EF-TzCnPHykM_J84_7P3c_N0-_Hh7vvj9tbKP7Uq_tjWqN48r2zLJOtqMChgqbsRNdJ7VuoJVMCyc6hUoJga0G3kvTMYnCiHPy9UO3nvZ3wVyGrc8W5xkCxiUPjWT1uZr3TaVef1BtijknHIdd8ltI-4GzYbVmqNYMB2sq-8tBeDFbdP-5_7wQ7xqljmI</recordid><startdate>20210520</startdate><enddate>20210520</enddate><creator>Zhang, Zheyang</creator><creator>Zhang, Sainan</creator><creator>Li, Xin</creator><creator>Zhao, Zhangxiang</creator><creator>Chen, Changjing</creator><creator>Zhang, Juxuan</creator><creator>Li, Mengyue</creator><creator>Wei, Zixin</creator><creator>Jiang, Wenbin</creator><creator>Pan, Bo</creator><creator>Li, Ying</creator><creator>Liu, Yixin</creator><creator>Cao, Yingyue</creator><creator>Zhao, Wenyuan</creator><creator>Gu, Yunyan</creator><creator>Yu, Yan</creator><creator>Meng, Qingwei</creator><creator>Qi, Lishuang</creator><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-2991-6544</orcidid><orcidid>https://orcid.org/0000-0001-5693-4126</orcidid></search><sort><creationdate>20210520</creationdate><title>Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</title><author>Zhang, Zheyang ; Zhang, Sainan ; Li, Xin ; Zhao, Zhangxiang ; Chen, Changjing ; Zhang, Juxuan ; Li, Mengyue ; Wei, Zixin ; Jiang, Wenbin ; Pan, Bo ; Li, Ying ; Liu, Yixin ; Cao, 
Yingyue ; Zhao, Wenyuan ; Gu, Yunyan ; Yu, Yan ; Meng, Qingwei ; Qi, Lishuang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-549b75bd17c90c0645f7a0e7e2f63664882a54083d367e7733e58a194b604e3b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Zheyang</creatorcontrib><creatorcontrib>Zhang, Sainan</creatorcontrib><creatorcontrib>Li, Xin</creatorcontrib><creatorcontrib>Zhao, Zhangxiang</creatorcontrib><creatorcontrib>Chen, Changjing</creatorcontrib><creatorcontrib>Zhang, Juxuan</creatorcontrib><creatorcontrib>Li, Mengyue</creatorcontrib><creatorcontrib>Wei, Zixin</creatorcontrib><creatorcontrib>Jiang, Wenbin</creatorcontrib><creatorcontrib>Pan, Bo</creatorcontrib><creatorcontrib>Li, Ying</creatorcontrib><creatorcontrib>Liu, Yixin</creatorcontrib><creatorcontrib>Cao, Yingyue</creatorcontrib><creatorcontrib>Zhao, Wenyuan</creatorcontrib><creatorcontrib>Gu, Yunyan</creatorcontrib><creatorcontrib>Yu, Yan</creatorcontrib><creatorcontrib>Meng, Qingwei</creatorcontrib><creatorcontrib>Qi, Lishuang</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Zheyang</au><au>Zhang, Sainan</au><au>Li, Xin</au><au>Zhao, Zhangxiang</au><au>Chen, Changjing</au><au>Zhang, Juxuan</au><au>Li, Mengyue</au><au>Wei, Zixin</au><au>Jiang, Wenbin</au><au>Pan, Bo</au><au>Li, Ying</au><au>Liu, Yixin</au><au>Cao, Yingyue</au><au>Zhao, Wenyuan</au><au>Gu, Yunyan</au><au>Yu, Yan</au><au>Meng, Qingwei</au><au>Qi, Lishuang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Reference genome and annotation updates lead to contradictory 
prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2021-05-20</date><risdate>2021</risdate><volume>22</volume><issue>3</issue><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P < 0.0010). Risk classification was still stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures due to reference genome and annotation updates. 
40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.</abstract><cop>England</cop><pmid>32383445</pmid><doi>10.1093/bib/bbaa081</doi><orcidid>https://orcid.org/0000-0002-2991-6544</orcidid><orcidid>https://orcid.org/0000-0001-5693-4126</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1467-5463 |
ispartof | Briefings in bioinformatics, 2021-05, Vol.22 (3) |
issn | 1467-5463 1477-4054 |
language | eng |
recordid | cdi_proquest_miscellaneous_2400548192 |
source | Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Business Source Complete; Oxford Journals Open Access Collection; PubMed Central |
title | Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T22%3A31%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reference%20genome%20and%20annotation%20updates%20lead%20to%20contradictory%20prognostic%20predictions%20in%20gene%20expression%20signatures:%20a%20case%20study%20of%20resected%20stage%20I%20lung%20adenocarcinoma&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Zhang,%20Zheyang&rft.date=2021-05-20&rft.volume=22&rft.issue=3&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbaa081&rft_dat=%3Cproquest_cross%3E2400548192%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2400548192&rft_id=info:pmid/32383445&rfr_iscdi=true |