Phylogenetic trees based on gene content


Bibliographic details
Published in: Bioinformatics, 2004-09, Vol. 20 (13), pp. 2044-2049
Main authors: Huson, Daniel H.; Steel, Mike
Format: Article
Language: English
Description
Summary: Comparing gene content between species can be a useful approach for reconstructing phylogenetic trees. In this paper, we derive a maximum-likelihood estimation of evolutionary distance between species under a simple model of gene genesis and gene loss. Using simulated data on a biological tree with 107 taxa (and on a number of randomly generated trees), we compare the accuracy of tree reconstruction using this ML distance measure to an earlier ad hoc distance. We then compare these distance-based approaches to a character-based tree reconstruction method (Dollo parsimony) which seems well suited to the analysis of gene content data. To simplify simulations, we give a formal proof of the well-known ‘fact’ that the Dollo parsimony score is independent of the choice of root. Our results show a consistent trend, with the character-based method and ML distance measure outperforming the earlier ad hoc distance method. Availability: http://www.ab.informatik.uni-tuebingen.de/software/genecontent/welcome_en.html
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/bth198