Gene Ontology based housekeeping gene selection for RNA-seq normalization

Bibliographic details
Published in: Methods (San Diego, Calif.), 2014-06, Vol. 67 (3), p. 354-363
Main authors: Chen, Chien-Ming; Lu, Yu-Lun; Sio, Chi-Pong; Wu, Guan-Chung; Tzou, Wen-Shyong; Pai, Tun-Wen
Format: Article
Language: English
Description
Abstract:
• A novel transcriptomic data normalization method based on housekeeping genes.
• The housekeeping genes are selected by GO distance and stability analysis.
• Normalization results showed that the proposed method outperformed traditional approaches.
• A web-based online system is available for 12 model species.

RNA-seq analysis is a powerful tool for revealing relationships between gene expression levels and the biological functions of proteins. To identify differentially expressed genes across RNA-seq datasets obtained from different experimental designs, the first challenge is choosing an appropriate normalization method for calibrating the datasets against one another. We propose a novel method that helps biologists select a set of suitable housekeeping genes for inter-sample normalization. The approach combines user-defined, experiment-related keywords, GO annotations, GO term distance matrices, orthologous housekeeping gene candidates, and a stability ranking of housekeeping genes. By identifying the GO terms most distant from the query keywords and selecting housekeeping gene candidates with low coefficients of variation across different spatio-temporal datasets, the proposed method automatically enumerates a set of functionally irrelevant housekeeping genes for practical normalization. Novel and benchmark RNA-seq datasets were used to demonstrate that the choice of housekeeping genes strongly affects differential gene expression analysis, and the comparison showed that the proposed method outperformed traditional approaches in both sensitivity and specificity. The proposed mechanism for selecting appropriate housekeeping genes for inter-dataset normalization is robust and accurate for differential expression analyses.
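The selection idea in the abstract (keep candidates whose GO terms are far from the query keywords, then prefer those with a low coefficient of variation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy expression values, the per-gene GO distances, the distance threshold, and the geometric-mean scaling step are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

def select_housekeeping(expr, go_distance, min_distance, top_n):
    """Keep candidates far from the query GO terms, then rank by CV (lowest first)."""
    far = [g for g in expr if go_distance.get(g, 0) >= min_distance]
    return sorted(far, key=lambda g: coefficient_of_variation(expr[g]))[:top_n]

def scaling_factors(expr, genes):
    """Per-sample geometric mean of the selected genes, centred so the factors
    have a geometric mean of 1 (an assumed scaling scheme)."""
    n_samples = len(expr[genes[0]])
    log_means = [
        statistics.fmean(math.log(expr[g][s]) for g in genes)
        for s in range(n_samples)
    ]
    centre = statistics.fmean(log_means)
    return [math.exp(m - centre) for m in log_means]

def normalize(expr, factors):
    """Divide every gene's value in each sample by that sample's factor."""
    return {g: [v / f for v, f in zip(vals, factors)] for g, vals in expr.items()}

# Hypothetical toy data: three samples per gene.
expr = {
    "HK1": [100.0, 200.0, 50.0],
    "HK2": [40.0, 80.0, 20.0],
    "HK3": [10.0, 90.0, 5.0],     # far from the query, but unstable
    "HK4": [60.0, 120.0, 30.0],   # stable, but functionally close to the query
    "TARGET": [30.0, 240.0, 15.0],
}
# Hypothetical distances between each candidate's GO terms and the query keywords.
go_distance = {"HK1": 8, "HK2": 7, "HK3": 9, "HK4": 2}

candidates = {g: v for g, v in expr.items() if g in go_distance}
selected = select_housekeeping(candidates, go_distance, min_distance=5, top_n=2)
factors = scaling_factors(expr, selected)
normalized = normalize(expr, factors)
```

In this toy setup, HK4 is excluded by the distance threshold and HK3 by its high variability, so the scaling factors are derived from HK1 and HK2 only. A real analysis would take the distances from a GO term distance matrix and compute stability across many datasets, as the abstract describes.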
ISSN: 1046-2023, 1095-9130
DOI: 10.1016/j.ymeth.2014.01.019