An Integrated Perspective on Phylogenetic Workflows
Published in: | Trends in ecology & evolution (Amsterdam) 2016-02, Vol.31 (2), p.116-126 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Abstract: | Molecular phylogenetics is the study of evolutionary relationships between biological sequences, often to infer the evolutionary relationships of organisms. These studies require many analysis components, including sequence assembly, identification of homologous sequences, gene tree inference, and species tree inference. At present, each component is usually treated as a single step in a linear analysis, where the output of each component is passed as input to the next as a point estimate. Here we outline a generative model that helps clarify assumptions that are implicit to phylogenetic workflows, focusing on the assumption of low relative entropy. This perspective unifies currently disparate advances, and will help investigators evaluate which steps would benefit the most from additional computation and future methods development.
Current phylogenetic analyses are implemented as multistep, linear workflows where intermediate analysis steps generate and pass on point estimates of unobserved variables. This linear structure and minimal information communication strategy embody three implicit assumptions: (i) the order of the analysis steps is biologically justified, (ii) a Markovian dependency structure, and (iii) low relative entropy of results of each analysis step.
There is evidence that these assumptions, in particular low relative entropy, are frequently violated in empirical studies with potential detrimental effects in phylogenetic analyses.
A generative model and probabilistic framework provide a unified perspective to assess the costs and benefits of relaxing these assumptions, help identify what methods and tools are missing, and provide a context for evaluating priorities for future development. |
---|---|
ISSN: | 0169-5347 1872-8383 |
DOI: | 10.1016/j.tree.2015.12.007 |