New Approaches for Inferring Phylogenies in the Presence of Paralogs

Bibliographic details
Published in: Trends in genetics 2021-02, Vol.37 (2), p.174-187
Main authors: Smith, Megan L., Hahn, Matthew W.
Format: Article
Language: eng
Online access: Full text
Description
Summary: The availability of whole genome sequences was expected to supply essentially unlimited data for phylogenetics. However, strict reliance on single-copy genes for this purpose has drastically limited the amount of data that can be used. Here, we review several approaches for increasing the amount of data used for phylogenetic inference, focusing on methods that allow for the inclusion of duplicated genes (paralogs). Recently developed methods that are robust to high levels of incomplete lineage sorting also appear to be robust to the inclusion of paralogs, suggesting a promising way to take full advantage of genomic data. We discuss the pitfalls of these approaches, as well as further avenues for research. Despite the increased availability of whole genome sequences, the data available for phylogenetic studies are extremely limited. This is because only single-copy genes present in most sampled species are used to infer phylogenies. In this review, we discuss several approaches for increasing the amount of data that can be used in phylogenetic inference. Recent work suggests that the inclusion of loci missing data for some taxa should not mislead phylogenetic inference with several popular methods. Even if orthologs are required, researchers need not limit themselves to single-copy orthologs, as paralogs specific to a single lineage or to a pair of sister lineages should not lead to topological errors in any approach to phylogeny inference. Several recent methods for species-tree inference that are robust to high levels of incomplete lineage sorting also appear to be robust to the inclusion of paralogs.
ISSN:0168-9525
DOI:10.1016/j.tig.2020.08.012