Estimating Phylogenies from Lacunose Distance Matrices, with Special Reference to DNA Hybridization Data
Published in: | Molecular Biology and Evolution 1995-03, Vol.12 (2), p.266-284 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Online access: | Full text |
Abstract: | Distance methods for producing phylogenies require n² comparisons among n taxa to generate a complete matrix. Moreover, techniques for generating distances, such as DNA hybridization, are subject to both systematic and random experimental errors, so that the measurements do not satisfy the mathematical properties of distances. We have explored the possibility of reconstructing trees from incomplete data. In our simulations, we discard one or both members of reciprocal pairs from a complete matrix, estimate these values, reconstruct a tree, and compare the topology and branch lengths of the estimated tree with the phylogeny based on complete data. We investigated separately and jointly the effects of rate variation and random and systematic errors, added to a fabricated ultrametric matrix, and then passed on to simulation experiments with several complete DNA hybridization matrices. Our empirical results show that topological and metric recovery is always very good, provided that no terminal sister taxa lack both reciprocal measurements and no extremely short internodes are involved. We then present two applications of the method for estimating phylogenies from incomplete DNA hybridization matrices: the first illustrates reconstruction of a matrix with about 27% of its cells missing, and the second sutures two matrices where some data are held in common but 29% are missing from the combined table. Thus, considerable information may be implicit in very sparse matrices, and this circumstance has practical consequences for distance studies when money, material, or time are limited. |
ISSN: | 1537-1719 0737-4038 |
DOI: | 10.1093/oxfordjournals.molbev.a040209 |
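
The abstract describes estimating missing cells of a distance matrix before reconstructing a tree, but it does not say which estimator is used. As a minimal sketch only, the Python below shows one simple possibility based on the ultrametric inequality: a missing distance d(i, j) is estimated as the minimum, over all third taxa k with both measurements present, of max(d(i, k), d(k, j)). The function name `fill_missing_ultrametric`, the use of `None` for missing cells, and the assumption that reciprocal measurements have already been combined into a symmetric matrix are all illustrative choices, not details taken from the article.

```python
def fill_missing_ultrametric(d):
    """Return a copy of the symmetric matrix d with None cells estimated.

    Hypothetical helper: a missing d[i][j] is estimated as
    min over k of max(d[i][k], d[k][j]), using only third taxa k
    for which both distances were actually measured.
    Cells with no usable k are left as None.
    """
    n = len(d)
    filled = [row[:] for row in d]
    for i in range(n):
        for j in range(i + 1, n):
            if d[i][j] is not None:
                continue
            candidates = [
                max(d[i][k], d[k][j])
                for k in range(n)
                if k not in (i, j)
                and d[i][k] is not None
                and d[k][j] is not None
            ]
            if candidates:
                filled[i][j] = filled[j][i] = min(candidates)
    return filled


def toy_matrix():
    """Fabricated ultrametric distances for the tree ((A,B),(C,D))."""
    return [
        [0, 2, 10, 10],
        [2, 0, 10, 10],
        [10, 10, 0, 4],
        [10, 10, 4, 0],
    ]


# Case 1: drop a distance between non-sister taxa (A, C).
m1 = toy_matrix()
m1[0][2] = m1[2][0] = None
recovered = fill_missing_ultrametric(m1)[0][2]

# Case 2: drop the distance between terminal sister taxa (A, B).
m2 = toy_matrix()
m2[0][1] = m2[1][0] = None
sister_estimate = fill_missing_ultrametric(m2)[0][1]

print(recovered, sister_estimate)
```

On this fabricated matrix the non-sister distance is recovered exactly, while the sister-pair distance can only be bounded from above by the estimator. Under this particular sketch, that is consistent with the abstract's caveat about terminal sister taxa that lack both reciprocal measurements.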