The C- and G-value paradox with polyploidy, repeatomes, introns, phenomes and cell economy

Bibliographic details
Published in: Genes & Genomics 2020, 42(7), pp. 699-714
Main authors: Choi, Ik-Young; Kwon, Eun-Chae; Kim, Nam-Soo
Format: Article
Language: English
Description
Summary:
Background: The apparent disconnection between biological complexity and both genome size (C-value) and gene number (G-value) is one of the long-standing biological puzzles. Gene-dense genomic sequences in prokaryotes or simple eukaryotes are highly constrained during selection, whereas gene-sparse genomic sequences in higher eukaryotes have low selection constraints. This review discusses the correlations of the C-value and G-value with genome architecture, polyploidy, repeatomes, introns, cell economy and phenomes.
Discussion: Eukaryotic chromosomes carry an assortment of various repeated DNA sequences (repeatomes). Expansion of copies of repeatomes together with polyploidization or whole-genome duplication (WGD) are major players in genome size (C-value) bloating, but genomes are equipped with counterbalancing systems such as diploidization, illegitimate recombination, and nonhomologous end joining (NHEJ) after double-strand breaks (DSBs). The lack of these efficient purging systems allowed the accumulation of repeat DNA, which resulted in extremely large genomes in several species. However, the correlation between chromosome number and genome size is not clear due to inconsistent results with different sets of species. Positive correlations between genome size and intron size and density were reported in early studies, but these proposals were refuted by the results with increased numbers of species, in which genome-wide features of introns (size, density, gene contents, repeats) were weakly associated with genome size. The assumption of the correlations between C-value and gene number (G-value) and organismal complexity is acceptable in general, but this assumption is often violated in specific lineages or species, suggesting C- and G-value paradoxes. The C-value paradox is partly explained by noncoding repeatomes.
The G-value paradox can also be explained by several genomic features: (1) one gene can produce many mature mRNAs by alternative splicing, and eukaryotic gene expression is highly regulated at both the transcriptional and translational levels; (2) many proteins exert multiple functions during development; (3) gene expansion/contraction are frequent events in the gene family among evolutionarily close species; and (4) sets of homeotic genes regulate development such that organismal complexity is sometimes not clear among organisms. A large genome must be burdensome in terms of cell economy, such that a large genome constraint results in the di
ISSN: 1976-9571, 2092-9293
DOI: 10.1007/s13258-020-00941-9