Marker genes as predictors of shared genomic function

Although high-throughput marker gene studies provide valuable insight into the diversity and relative abundance of taxa in microbial communities, they do not provide direct measures of their functional capacity. Recently, scientists have shown a general desire to predict functional profiles of micro...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:BMC genomics 2019-04, Vol.20 (1), p.268-268, Article 268
Hauptverfasser: Sevigny, Joseph L, Rothenheber, Derek, Diaz, Krystalle Sharlyn, Zhang, Ying, Agustsson, Kristin, Bergeron, R Daniel, Thomas, W Kelley
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Although high-throughput marker gene studies provide valuable insight into the diversity and relative abundance of taxa in microbial communities, they do not provide direct measures of their functional capacity. Recently, scientists have shown a general desire to predict functional profiles of microbial communities based on phylogenetic identification inferred from marker genes, and recent tools have been developed to link the two. However, to date, no large-scale examination has quantified the correlation between the marker gene based taxonomic identity and protein coding gene conservation. Here we utilize 4872 representative prokaryotic genomes from NCBI to investigate the relationship between marker gene identity and shared protein coding gene content. Even at 99-100% marker gene identity, genomes share on average less than 75% of their protein coding gene content. This occurs regardless of the marker gene(s) used: V4 region of the 16S rRNA, complete 16S rRNA, or single copy orthologs through a multi-locus sequence analysis. An important aspect related to this observation is the intra-organism variation of 16S copies from a single genome. Although the majority of 16S copies were found to have high sequence similarity (> 99%), several genomes contained copies that were highly diverged (
ISSN:1471-2164
1471-2164
DOI:10.1186/s12864-019-5641-1