Identification of Arabidopsis genic and non‐genic promoters by paired‐end sequencing of TSS tags


Bibliographic details
Published in: The Plant Journal: for cell and molecular biology, 2017-05, Vol. 90 (3), p. 587-605
Main authors: Tokizawa, Mutsutomo; Kusunoki, Kazutaka; Koyama, Hiroyuki; Kurotani, Atsushi; Sakurai, Tetsuya; Suzuki, Yutaka; Sakamoto, Tomoaki; Kurata, Tetsuya; Yamamoto, Yoshiharu Y.
Format: Article
Language: English
Description
Abstract:

Summary: Information about transcription start sites (TSSs) provides baseline data for the analysis of promoter architecture. In this paper, we used paired‐ and single‐end deep sequencing to analyze Arabidopsis TSS tags from several libraries prepared from roots, shoots, flowers and etiolated seedlings. The clustering of approximately 33 million mapped TSS tags led to the identification of 324 461 promoters that covered 79.7% (21 672/27 206) of protein‐coding genes in the Arabidopsis genome. In addition, we identified intragenic, antisense and orphan promoters that were not associated with any gene models. Of these, intragenic promoters exhibited unique characteristics regarding dinucleotide sequences at TSSs and core promoter element composition, suggesting that these promoters use different mechanisms of transcriptional initiation. An analysis of base composition with regard to promoter position revealed a low GC content throughout the promoter region and several local strand biases that were evident for TATA‐type promoters, but not for Coreless‐type promoters. Most observed strand biases coincided with strand biases of the single nucleotide polymorphism rate. Our analysis also revealed that transcription of a gene is supported by an average of 2.7 genic promoters, among which one specific promoter, designated as a top promoter, substantially determines the expression level of the gene.

Significance Statement: All genes require a promoter, a region that determines the position, direction, frequency, and timing of transcription. Here we describe the use of transcription start site tags from various Arabidopsis tissues to identify promoters for approximately 80% of annotated genes, as well as many intragenic, antisense, and orphan promoters.
ISSN: 0960-7412, 1365-313X
DOI: 10.1111/tpj.13511