Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU

Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. The sequence align...

Detailed description

Bibliographic details
Published in: arXiv.org 2023-04
Main authors: Burchard, Luk; Max Xiaohang Zhao; Langguth, Johannes; Buluç, Aydın; Guidi, Giulia
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Burchard, Luk
Max Xiaohang Zhao
Langguth, Johannes
Buluç, Aydın
Guidi, Giulia
description Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. The sequence alignment problem is fundamental in bioinformatics; we have implemented the \(X\)-Drop algorithm, a heuristic method for pairwise alignment that reduces search space, on the Graphcore Intelligence Processor Unit (IPU) accelerator. The \(X\)-Drop algorithm has an irregular computational pattern, which makes it difficult to accelerate due to load balancing. Here, we introduce a graph-based partitioning and queue-based batch system to improve load balancing. Our implementation achieves \(10\times\) speedup over a state-of-the-art GPU implementation and up to \(4.65\times\) compared to CPU. In addition, we introduce a memory-restricted \(X\)-Drop algorithm that reduces memory footprint by \(55\times\) and efficiently uses the IPU's limited low-latency SRAM. This optimization further improves the strong scaling performance by \(3.6\times\).
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2803091063</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2803091063</sourcerecordid><originalsourceid>FETCH-proquest_journals_28030910633</originalsourceid><addsrcrecordid>eNqNjd8KgjAcRkcQJOU7_KDrwdrSrDsz-3MRRBZ4J2JTJ7qtTd8_gx6gqwPnfPBNkEMZW-FgTekMudY2hBDqb6jnMQelic4LDnFZikJw2UPC3wOXowpbUcnuq0plILmHV7zPLX9BpDo99EJWO0jxwSgNSkJfcziZXNeFMhwut-cCTcu8tdz9cY6Wx_gRnbE2anywfdaowcgxZTQgjGxXxGfsv9UHtJJABQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2803091063</pqid></control><display><type>article</type><title>Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU</title><source>Free E- Journals</source><creator>Burchard, Luk ; Max Xiaohang Zhao ; Langguth, Johannes ; Buluç, Aydın ; Guidi, Giulia</creator><creatorcontrib>Burchard, Luk ; Max Xiaohang Zhao ; Langguth, Johannes ; Buluç, Aydın ; Guidi, Giulia</creatorcontrib><description>Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. The sequence alignment problem is fundamental in bioinformatics; we have implemented the \(X\)-Drop algorithm, a heuristic method for pairwise alignment that reduces search space, on the Graphcore Intelligence Processor Unit (IPU) accelerator. The \(X\)-Drop algorithm has an irregular computational pattern, which makes it difficult to accelerate due to load balancing. Here, we introduce a graph-based partitioning and queue-based batch system to improve load balancing. Our implementation achieves \(10\times\) speedup over a state-of-the-art GPU implementation and up to \(4.65\times\) compared to CPU. 
In addition, we introduce a memory-restricted \(X\)-Drop algorithm that reduces memory footprint by \(55\times\) and efficiently uses the IPU's limited low-latency SRAM. This optimization further improves the strong scaling performance by \(3.6\times\).</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Alignment ; Bioinformatics ; Computation ; Computer architecture ; Computer memory ; Heuristic methods ; Load balancing ; Microprocessors ; Optimization</subject><ispartof>arXiv.org, 2023-04</ispartof><rights>2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Burchard, Luk</creatorcontrib><creatorcontrib>Max Xiaohang Zhao</creatorcontrib><creatorcontrib>Langguth, Johannes</creatorcontrib><creatorcontrib>Buluç, Aydın</creatorcontrib><creatorcontrib>Guidi, Giulia</creatorcontrib><title>Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU</title><title>arXiv.org</title><description>Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. 
The sequence alignment problem is fundamental in bioinformatics; we have implemented the \(X\)-Drop algorithm, a heuristic method for pairwise alignment that reduces search space, on the Graphcore Intelligence Processor Unit (IPU) accelerator. The \(X\)-Drop algorithm has an irregular computational pattern, which makes it difficult to accelerate due to load balancing. Here, we introduce a graph-based partitioning and queue-based batch system to improve load balancing. Our implementation achieves \(10\times\) speedup over a state-of-the-art GPU implementation and up to \(4.65\times\) compared to CPU. In addition, we introduce a memory-restricted \(X\)-Drop algorithm that reduces memory footprint by \(55\times\) and efficiently uses the IPU's limited low-latency SRAM. This optimization further improves the strong scaling performance by \(3.6\times\).</description><subject>Algorithms</subject><subject>Alignment</subject><subject>Bioinformatics</subject><subject>Computation</subject><subject>Computer architecture</subject><subject>Computer memory</subject><subject>Heuristic methods</subject><subject>Load balancing</subject><subject>Microprocessors</subject><subject>Optimization</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNjd8KgjAcRkcQJOU7_KDrwdrSrDsz-3MRRBZ4J2JTJ7qtTd8_gx6gqwPnfPBNkEMZW-FgTekMudY2hBDqb6jnMQelic4LDnFZikJw2UPC3wOXowpbUcnuq0plILmHV7zPLX9BpDo99EJWO0jxwSgNSkJfcziZXNeFMhwut-cCTcu8tdz9cY6Wx_gRnbE2anywfdaowcgxZTQgjGxXxGfsv9UHtJJABQ</recordid><startdate>20230417</startdate><enddate>20230417</enddate><creator>Burchard, Luk</creator><creator>Max Xiaohang Zhao</creator><creator>Langguth, Johannes</creator><creator>Buluç, Aydın</creator><creator>Guidi, Giulia</creator><general>Cornell University 
Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20230417</creationdate><title>Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU</title><author>Burchard, Luk ; Max Xiaohang Zhao ; Langguth, Johannes ; Buluç, Aydın ; Guidi, Giulia</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_28030910633</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Alignment</topic><topic>Bioinformatics</topic><topic>Computation</topic><topic>Computer architecture</topic><topic>Computer memory</topic><topic>Heuristic methods</topic><topic>Load balancing</topic><topic>Microprocessors</topic><topic>Optimization</topic><toplevel>online_resources</toplevel><creatorcontrib>Burchard, Luk</creatorcontrib><creatorcontrib>Max Xiaohang Zhao</creatorcontrib><creatorcontrib>Langguth, Johannes</creatorcontrib><creatorcontrib>Buluç, Aydın</creatorcontrib><creatorcontrib>Guidi, Giulia</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium 
Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Burchard, Luk</au><au>Max Xiaohang Zhao</au><au>Langguth, Johannes</au><au>Buluç, Aydın</au><au>Guidi, Giulia</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU</atitle><jtitle>arXiv.org</jtitle><date>2023-04-17</date><risdate>2023</risdate><eissn>2331-8422</eissn><abstract>Dedicated accelerator hardware has become essential for processing AI-based workloads, leading to the rise of novel accelerator architectures. Furthermore, fundamental differences in memory architecture and parallelism have made these accelerators targets for scientific computing. The sequence alignment problem is fundamental in bioinformatics; we have implemented the \(X\)-Drop algorithm, a heuristic method for pairwise alignment that reduces search space, on the Graphcore Intelligence Processor Unit (IPU) accelerator. The \(X\)-Drop algorithm has an irregular computational pattern, which makes it difficult to accelerate due to load balancing. Here, we introduce a graph-based partitioning and queue-based batch system to improve load balancing. Our implementation achieves \(10\times\) speedup over a state-of-the-art GPU implementation and up to \(4.65\times\) compared to CPU. 
In addition, we introduce a memory-restricted \(X\)-Drop algorithm that reduces memory footprint by \(55\times\) and efficiently uses the IPU's limited low-latency SRAM. This optimization further improves the strong scaling performance by \(3.6\times\).</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2023-04
issn 2331-8422
language eng
recordid cdi_proquest_journals_2803091063
source Free E- Journals
subjects Algorithms
Alignment
Bioinformatics
Computation
Computer architecture
Computer memory
Heuristic methods
Load balancing
Microprocessors
Optimization
title Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T03%3A37%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Space%20Efficient%20Sequence%20Alignment%20for%20SRAM-Based%20Computing:%20X-Drop%20on%20the%20Graphcore%20IPU&rft.jtitle=arXiv.org&rft.au=Burchard,%20Luk&rft.date=2023-04-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2803091063%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2803091063&rft_id=info:pmid/&rfr_iscdi=true