Efficient implied alignment

Bibliographic details
Published in:	BMC Bioinformatics 2020-07, Vol.21 (1), p.296-296, Article 296
Main authors:	Washburn, Alex J., Wheeler, Ward C.
Format:	Article
Language:	English
Subjects:	Algorithms; Alignment; Biochemical Research Methods; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Complexity; Dynamic homology; Heuristic; Homology; Implied alignment; Insertion; Leaves; Life Sciences & Biomedicine; Mathematical & Computational Biology; Methodology; Multiple string alignment; Phylogenetics; Phylogeny; Run time (computers); Science & Technology; Sequence alignment; Sequence Alignment - methods; Strings; Tree alignment
Online access:	Full text

Description
Summary:	Background: Given a binary tree T of n leaves, each leaf labeled by a string of length at most k, and a binary string alignment function ⊗, an implied alignment can be generated to describe the alignment of a dynamic homology for T. This is done by first decorating each node of T with an alignment context using ⊗ in a post-order traversal and then, during a subsequent pre-order traversal, inferring on which edges insertion and deletion events occurred using those internal node decorations. Results: Previous descriptions of the implied alignment algorithm suggest a technique of "back-propagation" with time complexity O(k² * n²). Here we describe an implied alignment algorithm with complexity O(k * n²). For well-behaved data, such as molecular sequences, the runtime approaches the best-case complexity of Ω(k * n). Conclusions: The reduction in the time complexity of the algorithm dramatically improves both its utility in generating multiple sequence alignments and its heuristic utility.
ISSN:	1471-2105
DOI:	10.1186/s12859-020-03595-2
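
The two-pass scheme outlined in the summary (a post-order pass that decorates nodes, then a pre-order pass that reads those decorations) can be illustrated with a deliberately simplified Python sketch. This is not the authors' algorithm and makes no claim about its complexity. It assumes that unit-cost Needleman-Wunsch stands in for ⊗, that leaf strings contain no "-" character, and that every internal node keeps all columns of its children's pairwise alignment, so the pre-order pass only has to propagate column indices. All names (Node, align, decorate, implied_alignment) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

GAP = "-"


@dataclass
class Node:
    label: Optional[str] = None       # leaf string; None for internal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    seq: str = ""                     # decoration: the string held at this node
    left_cols: List[int] = field(default_factory=list)   # column in seq of each left-child char
    right_cols: List[int] = field(default_factory=list)  # column in seq of each right-child char


def align(a: str, b: str) -> Tuple[str, str]:
    """Stand-in for the pairwise function: unit-cost global alignment."""
    n, m = len(a), len(b)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]),
                             cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1)
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:             # trace one optimal path back to the origin
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out_a.append(a[i - 1]); out_b.append(GAP); i -= 1
        else:
            out_a.append(GAP); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def decorate(node: Node) -> None:
    """Post-order pass: align the children's strings and store the result."""
    if node.label is not None:
        node.seq = node.label
        return
    decorate(node.left)
    decorate(node.right)
    ga, gb = align(node.left.seq, node.right.seq)
    # Simplifying assumption: keep every column, preferring the left character.
    node.seq = "".join(x if x != GAP else y for x, y in zip(ga, gb))
    node.left_cols = [c for c, x in enumerate(ga) if x != GAP]
    node.right_cols = [c for c, y in enumerate(gb) if y != GAP]


def _rows(node: Node, cols: List[int], width: int, rows: List[str]) -> None:
    """Pre-order pass: cols[i] is the root column of node.seq[i]."""
    if node.label is not None:
        row = [GAP] * width
        for ch, c in zip(node.seq, cols):
            row[c] = ch
        rows.append("".join(row))
        return
    _rows(node.left, [cols[c] for c in node.left_cols], width, rows)
    _rows(node.right, [cols[c] for c in node.right_cols], width, rows)


def implied_alignment(root: Node) -> List[str]:
    """Return one gapped row per leaf, in left-to-right leaf order."""
    decorate(root)
    rows: List[str] = []
    width = len(root.seq)
    _rows(root, list(range(width)), width, rows)
    return rows


pair = Node(left=Node(label="ACGT"), right=Node(label="AGT"))
print(implied_alignment(pair))  # → ['ACGT', 'A-GT']
```

Because each node in this sketch retains every alignment column, a gap in a leaf row always corresponds to a column that the leaf's lineage lacks relative to the root. Distinguishing insertions from deletions on individual edges, as the summary describes, requires richer node decorations than the ones used here.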