Assessing the consistency of public human tissue RNA-seq data sets

Bibliographic details
Published in:	Briefings in bioinformatics 2015-11, Vol.16 (6), p.941-949
Main authors:	Danielsson, Frida; James, Tojo; Gomez-Cabrero, David; Huss, Mikael
Format:	Article
Language:	English
Subjects:	Bioinformatics; Brain - metabolism; clustering; Comparative analysis; Databases, Genetic; Gene expression; Gene Expression Profiling; Humans; Kidney - metabolism; Medical and health sciences; meta-analysis; Myocardium - metabolism; public data; Ribonucleic acid; RNA; RNA - genetics; RNA-seq; Sequence Analysis, RNA; Tissues
Online access:	Full text

Description
Abstract:	Sequencing-based gene expression methods like RNA-sequencing (RNA-seq) have become increasingly common, but it is often claimed that results obtained in different studies are not comparable owing to the influence of laboratory batch effects, differences in RNA extraction and sequencing library preparation methods and bioinformatics processing pipelines. It would be unfortunate if different experiments were in fact incomparable, as there is great promise in data fusion and meta-analysis applied to sequencing data sets. We therefore compared reported gene expression measurements for ostensibly similar samples (specifically, human brain, heart and kidney samples) in several different RNA-seq studies to assess their overall consistency and to examine the factors contributing most to systematic differences. The same comparisons were also performed after preprocessing all data in a consistent way, eliminating potential bias from bioinformatics pipelines. We conclude that published human tissue RNA-seq expression measurements appear relatively consistent in the sense that samples cluster by tissue rather than laboratory of origin given simple preprocessing transformations. The article is supplemented by a detailed walkthrough with embedded R code and figures.
ISSN:	1467-5463, 1477-4054
DOI:	10.1093/bib/bbv017
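
Illustrative note: the abstract's central claim is that, after simple preprocessing transformations, samples group by tissue rather than by laboratory of origin. The sketch below shows one way such a check could look. It is not the authors' R walkthrough: it assumes a log2(x + 1) transform and uses a Pearson-correlation nearest-neighbour test as a simplified stand-in for clustering. The expression values, the five unnamed genes, the lab names (labA, labB) and all function names are made up for illustration.

```python
import math


def log_transform(values, pseudocount=1.0):
    """Apply log2(x + pseudocount), an assumed 'simple preprocessing' step."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def nearest_neighbours(samples):
    """Map each sample to the other sample it correlates with most strongly."""
    logged = {name: log_transform(vals) for name, vals in samples.items()}
    result = {}
    for name, vec in logged.items():
        scored = [(pearson(vec, logged[other]), other)
                  for other in logged if other != name]
        result[name] = max(scored)[1]
    return result


# Toy data: keys are (tissue, lab); values are invented expression levels
# for five hypothetical genes. These are NOT measurements from the article.
samples = {
    ("brain", "labA"): [900, 850, 5, 10, 40],
    ("brain", "labB"): [1200, 700, 8, 20, 60],
    ("heart", "labA"): [10, 5, 800, 950, 30],
    ("heart", "labB"): [20, 15, 1100, 700, 50],
}

nn = nearest_neighbours(samples)
same_tissue = sum(1 for s, n in nn.items() if s[0] == n[0])
same_lab = sum(1 for s, n in nn.items() if s[1] == n[1])
print(f"nearest neighbour shares tissue: {same_tissue}/{len(nn)}")
print(f"nearest neighbour shares lab:    {same_lab}/{len(nn)}")
```

With real data, a full hierarchical clustering or principal component analysis would be more informative than a nearest-neighbour count; the count is used here only because it makes the "tissue versus laboratory" question easy to read off.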