Highly parallelized inference of large genome-based phylogenies
Published in: | Concurrency and computation 2014-07, Vol.26 (10), p.1715-1729 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Genome Blast Distance Phylogeny (GBDP) infers distances and phylogenetic relationships between organisms from completely or partially sequenced genomes. It is well suited to parallelization because pairwise distances are calculated independently. As exemplar data for a high‐performance cluster implementation that executes many pairwise genome comparisons in parallel, we used sequences from the Genomic Encyclopedia of Bacteria and Archaea project. Phylogenies were inferred from genome‐scale nucleotide and amino acid data with all variants of GBDP, including novel adaptations to amino acid sequences and approaches yielding trees with branch support. We examined in detail how phylogenetic accuracy, average branch support, and performance indicators such as running time and disk space consumption depend on the details of genome comparison, distance calculation, and phylogenetic inference. When combined with conservative measures of branch support, GBDP appears to infer reasonable phylogenetic relationships of microorganisms at a comparatively low computational cost. Owing to the linear speed‐up of the cluster, benchmarks reveal an overall computation time of less than 24 h for the 7750 pairwise genome/proteome comparisons of the Genomic Encyclopedia of Bacteria and Archaea data set, compared with an estimated running time of about 30 days for the non‐parallelized version. Copyright © 2013 John Wiley & Sons, Ltd. |
---|---|
ISSN: | 1532-0626 1532-0634 |
DOI: | 10.1002/cpe.3112 |