Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma

RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Briefings in bioinformatics 2021-05, Vol.22 (3)
Hauptverfasser: Zhang, Zheyang, Zhang, Sainan, Li, Xin, Zhao, Zhangxiang, Chen, Changjing, Zhang, Juxuan, Li, Mengyue, Wei, Zixin, Jiang, Wenbin, Pan, Bo, Li, Ying, Liu, Yixin, Cao, Yingyue, Zhao, Wenyuan, Gu, Yunyan, Yu, Yan, Meng, Qingwei, Qi, Lishuang
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, prognostic 9-gene pair signature (GPS) was applied to 197 patients with stage I lung adenocarcinoma derived from previous and latest data from The Cancer Genome Atlas (TCGA) processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of their discordant risk classification. Therefore, we constructed a prognostic 40-GPS based on stable genes across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P 
ISSN:1467-5463
1477-4054
DOI:10.1093/bib/bbaa081