Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data

Bibliographic details
Published in:	Genome Biology 2015-07, Vol.16 (1), p.150-150, Article 150
Main authors:	Kanitz, Alexander; Gypas, Foivos; Gruber, Andreas J; Gruber, Andreas R; Martin, Georges; Zavolan, Mihaela
Format:	Article
Language:	English
Subjects:	Abundance; Accuracy; Algorithms; Alternative splicing; Analysis; Animals; Archives & records; Bioinformatics; Comparative analysis; Computer applications; Gene expression; Gene Expression Profiling - methods; Genes; Genetic aspects; Genetic transcription; Genomes; Genomics; High-Throughput Nucleotide Sequencing - methods; Humans; Isoforms; Jurkat Cells; Mammals; Methods; Mice; NIH 3T3 Cells; Polyadenylation; Proteins; Ribonucleic acid; RNA; RNA Isoforms - analysis; RNA sequencing; Sequence Analysis, RNA - methods; Software; Transcription
Online access:	Full text
Description
Abstract:	Understanding the regulation of gene expression, including transcription start site usage, alternative splicing, and polyadenylation, requires accurate quantification of expression levels down to the level of individual transcript isoforms. To comparatively evaluate the accuracy of the many methods that have been proposed for estimating transcript isoform abundance from RNA sequencing data, we have used both synthetic data and an independent experimental method for quantifying the abundance of transcript ends at the genome-wide level. We found that many tools have good accuracy and yield better estimates of gene-level expression compared to commonly used count-based approaches, but they vary widely in memory and runtime requirements. Nucleotide composition and intron/exon structure have comparatively little influence on the accuracy of expression estimates, which correlates most strongly with transcript/gene expression levels. To facilitate the reproduction and further extension of our study, we provide datasets, source code, and an online analysis tool on a companion website, where developers can upload expression estimates obtained with their own tool to compare them to those inferred by the methods assessed here. As many methods for quantifying isoform abundance with comparable accuracy are available, a user's choice will likely be determined by factors such as the memory and runtime requirements, as well as the availability of methods for downstream analyses. Sequencing-based methods to quantify the abundance of specific transcript regions could complement validation schemes based on synthetic data and quantitative PCR in future or ongoing assessments of RNA-seq analysis methods.
ISSN:	1465-6906, 1474-7596, 1474-760X, 1465-6914
DOI:	10.1186/s13059-015-0702-5