tRNA signatures reveal polyphyletic origins of streamlined SAR11 genomes among the alphaproteobacteria
Published in: | arXiv.org 2013-05 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Phylogenomic analyses are subject to bias from compositional convergence and noise from horizontal gene transfer (HGT). Compositional convergence is a likely cause of controversy regarding the phylogeny of the SAR11 group of Alphaproteobacteria, which have extremely streamlined, A+T-biased genomes. While careful modeling can reduce artifacts caused by convergence, the most consistent and robust phylogenetic signal in genomes may lie distributed among encoded functional features that govern macromolecular interactions. Here we develop a novel phyloclassification method based on signatures derived from bioinformatically defined tRNA Class-Informative Features (CIFs). tRNA CIFs are enriched for features that underlie tRNA-protein interactions. Using a simple tRNA-CIF-based phyloclassifier, we obtained results consistent with those of bias-corrected whole-proteome phylogenomic studies, rejecting monophyly of SAR11 and affiliating most strains with Rhizobiales with strong statistical support. Yet SAR11 and Rickettsiales tRNA genes share distinct patterns of A+T-richness, as expected from their elevated genomic A+T compositions. Using conventional supermatrix methods on total tRNA sequence data, we could recover the artifactual result of a monophyletic SAR11 grouping with Rickettsiales. Thus tRNA-CIF-based phyloclassification is more robust to base-content convergence than supermatrix phylogenomics on whole tRNA sequences. Also, given the notoriously promiscuous HGT of aminoacyl-tRNA synthetases, tRNA-CIF-based phyloclassification may be relatively robust to HGT of network components. We describe how unique features of tRNA-protein interaction networks facilitate the mining of traits governing macromolecular interactions from genomic data, and discuss why interaction-governing traits may be especially useful for solving difficult problems in microbial classification and phylogeny. |
---|---|
ISSN: | 2331-8422 |
DOI: | 10.48550/arxiv.1305.7256 |