Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU

Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. The sequence align...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Burchard, Luk, Zhao, Max Xiaohang, Langguth, Johannes, Buluç, Aydın, Guidi, Giulia
Format:	Artikel
Sprache:	eng
Schlagworte:	Computer Science - Distributed, Parallel, and Cluster Computing Quantitative Biology - Genomics
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. The sequence alignment problem is fundamental in bioinformatics; we have implemented the $X$-Drop algorithm, a heuristic method for pairwise alignment that reduces search space, on the Graphcore Intelligence Processor Unit (IPU) accelerator. The $X$-Drop algorithm has an irregular computational pattern, which makes it difficult to accelerate due to load balancing. Here, we introduce a graph-based partitioning and queue-based batch system to improve load balancing. Our implementation achieves $10\times$ speedup over a state-of-the-art GPU implementation and up to $4.65\times$ compared to CPU. In addition, we introduce a memory-restricted $X$-Drop algorithm that reduces memory footprint by $55\times$ and efficiently uses the IPU's limited low-latency SRAM. This optimization further improves the strong scaling performance by $3.6\times$.
DOI:	10.48550/arxiv.2304.08662