Reduced space sequence alignment

Motivation: Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Bioinformatics 1997-02, Vol.13 (1), p.45-53
Hauptverfasser: Grice, J.Alicia, Hughey, Richard, Speck, Don
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Motivation: Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in O(n2) time and O(n2) space on a serial machine, or in O(n) time with O(n) space per O(n) processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of n while inducing only a small constant slowdown to the serial version. Results: This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer L, a factor proportional to L slowdown in exchange for reducing space requirement from O(n2) to O(n). A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the O(n1.5)-space member of the family is 15–40% faster than the O(n)-space divide-and-conquer algorithm. Availability: The methods will soon be incorporated in the SAM hidden Markov modeling package http: //www.cse.ucs-c.edu/research/compbio/sam.html. Contact: wzrph@cse.ucsc.edu
ISSN:1367-4803
0266-7061
1460-2059
DOI:10.1093/bioinformatics/13.1.45