Comprehensive genomic analyses with 115 plastomes from algae to seed plants: structure, gene contents, GC contents, and introns
Background Chloroplasts are a common character in plants. The chloroplasts in each plant lineage have shaped their own genomes, plastomes, by structural changes and transferring many genes to nuclear genomes during plant evolution. Some plastid genes have introns that are mostly group II introns. Ob...
Gespeichert in:
Veröffentlicht in: | Genes & genomics 2020, 42(5), , pp.553-570 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Background
Chloroplasts are a common character in plants. The chloroplasts in each plant lineage have shaped their own genomes, plastomes, by structural changes and transferring many genes to nuclear genomes during plant evolution. Some plastid genes have introns that are mostly group II introns.
Objective
This study aimed to get genomic and evolutionary insights on the plastomes from green algae to flowering plants.
Methods
Plastomes of 115 species from green algae, bryophytes, pteridophytes (spore bearing vascular plants), gymnosperms, and angiosperms were mined from NCBI organelle genome database. Plastome structure, gene contents and GC contents were analyzed by the in-house developed Phyton code. Intronic features including presence/absence, length, intron phases were analyzed by manually in the annotated information in NCBI.
Results
The canonical quadripartite structures were retained in most plastomes except of a few plastomes that had lost an invert repeat (IR). Expansion or reduction or deletion of IRs resulted in the length variation of the plastomes. The number of protein coding genes ranged from 40 to 92 with an average 79.43 ± 5.84 per plastome and gene losses were apparent in specific lineages. The number of
trn
genes ranged from 13 to 33 with an average 21.19 ± 2.42 per plastome. Ribosomal RNA genes,
rrn
, were located in the IRs so that they were present in a duplicate except of the species that had lost one of the IR. GC contents were variable from 24.9 to 51.0% with an average 38.21 ± 3.27%, indicating bias to high AT contents. Plastid introns were present in 18 protein coding genes, six
trn
genes, and one
rrn
gene. Intron losses occurred among the orthologous genes in different plant lineages. The plastid introns were long compared with the nuclear introns, which might be related with the spliceosome nuclear introns and self-splicing group II plastid introns. The
trnK-UUU
intron contained the maturase encoding
matK
gene except in the chlorophyte algae and monilophyte ferns in which the
trnK-UUU
was lost, but
matK
retained. There were many annotation artefacts in the intron positions in the NCBI database. In the analysis of intron phases, phase 0 introns were more frequent than those of phase 2 and 3 introns. Phase polymorphism was observed in the introns of
clpP
which was derived from nucleotide insertion. Plastid
trn
introns were long compared to the archaeal or eukaryotic nuclear tRNA introns. Of the six plastid
trn
introns, one was at |
---|---|
ISSN: | 1976-9571 2092-9293 |
DOI: | 10.1007/s13258-020-00923-x |