Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

Bibliographic details
Published in:	Nature biotechnology 2010-05, Vol.28 (5), p.511-515
Main authors:	Pachter, Lior, Trapnell, Cole, Williams, Brian A, Pertea, Geo, Mortazavi, Ali, Kwan, Gordon, van Baren, Marijke J, Salzberg, Steven L, Wold, Barbara J
Format:	Article
Language:	English
Subjects:	631/114/794; 631/136/142; 631/61/212/2019; 631/61/514/2254; Agriculture; Algorithms; Animals; Bioinformatics; Biological and medical sciences; Biomedical and Life Sciences; Biomedical Engineering/Biotechnology; Biomedicine; Biotechnology; Cell differentiation; Cell Differentiation - genetics; Cell Line; Cellular biology; Diverse techniques; Fundamental and applied biological sciences. Psychology; Gene expression; Gene Expression Profiling - methods; Genetic aspects; Genetic transcription; Genome; letter; Life Sciences; Messenger RNA; Mice; Molecular and cellular biology; Oligonucleotide Array Sequence Analysis - methods; Open source software; Properties; Protein Isoforms - genetics; Protein Isoforms - metabolism; Proto-Oncogene Proteins c-myc - genetics; Proto-Oncogene Proteins c-myc - metabolism; Ribonucleic acid; RNA; RNA, Messenger - analysis; RNA, Messenger - genetics; RNA, Messenger - metabolism; Sequence Analysis, RNA - methods; Software; Time series
Online access:	Full text

Description
Abstract:	High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
ISSN:	1087-0156 1546-1696
DOI:	10.1038/nbt.1621