Empirical Transition Probability Indexing Sparse-Coding Belief Propagation (ETPI-SCoBeP) Genome Sequence Alignment

The advance in human genome sequencing technology has significantly reduced the cost of data generation and overwhelms the computing capability of sequence analysis. Efficiency, efficacy, and scalability remain challenging in sequence alignment, which is an important and foundational operation for g...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Cancer informatics 2014-01, Vol.2014 (Suppl. 1), p.159-165
Hauptverfasser: Roozgard, Aminmohammad, Barzigar, Nafise, Wang, Shuang, Jiang, Xiaoqian, Cheng, Samuel
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The advance in human genome sequencing technology has significantly reduced the cost of data generation and overwhelms the computing capability of sequence analysis. Efficiency, efficacy, and scalability remain challenging in sequence alignment, which is an important and foundational operation for genome data analysis. In this paper, we propose a two-stage approach to tackle this problem. In the preprocessing step, we match blocks of reference and target sequences based on the similarities between their empirical transition probability distributions using belief propagation. We then conduct a refined match using our recently published sparse-coding belief propagation (SCoBeP) technique. Our experimental results demonstrated robustness in nucleotide sequence alignment, and our results are competitive to those of the SOAP aligner and the BWA algorithm. Moreover, compared to SCoBeP alignment, the proposed technique can handle sequences of much longer lengths.
ISSN:1176-9351
1176-9351
DOI:10.4137/CIN.S13887