GPU Acceleration of Pyrosequencing Noise Removal

Amplicon Noise [1], an updated version of Pyronoise [2], is a tool for removing noise from metagenomic data recorded by a 454 pyrosequencer. Amplicon Noise has been shown to be effective in reducing overestimation of operational taxonomic units (OTUs) and in chimera detection. Amplicon Noise's noise r...

Bibliographic Details
Main authors:	Yang Gao; Bakos, J. D.
Format:	Conference proceeding
Language:	English
Subjects:	Amplicon Noise; CUDA; GPU; GPU Computing; Graphics processing unit; Heterogeneous Computing; Instruction sets; Kernel; Memory management; Metagenomics; MPI; Needleman-Wunsch; Optimization; Pyronoise; Registers; Sequence Alignment; Short Reads; Smith-Waterman; Throughput

container_end_page	101
container_issue
container_start_page	94
container_title
container_volume
creator	Yang Gao ; Bakos, J. D.
description	Amplicon Noise [1], an updated version of Pyronoise [2], is a tool for removing noise from metagenomic data recorded by a 454 pyrosequencer. Amplicon Noise has been shown to be effective in reducing overestimation of operational taxonomic units (OTUs) and in chimera detection. Amplicon Noise's noise removal method relies on clustering a large set of short sequences read by the sequencer. The DNA sequencing algorithm requires the computation of O(n^2) pairwise distances using a global sequence alignment method. Each sequence consists of a few hundred base pairs and a typical dataset contains 10^4 sequences, making the clustering computation extremely expensive. In this paper we describe a GPU kernel implementation of the most computationally expensive module in the Amplicon Noise software package, SeqDist. With our GPU workstation (Intel Core i7 980 @ 3.33 GHz + 3 x NVIDIA Tesla C2070) and a typical 454 dataset, our implementation (CUDA-SeqDist) achieves an 8.6X speedup with a single GPU when compared with 12 MPI ranks of the original tools running on the CPU alone. With three GPUs, we achieve a 2.1X further speedup over the single-GPU version, yielding a total speedup of 18.3X. We measure the throughput of our kernel to be 1.4 giga floating-point cell updates per second (GFCUPS) with a single GPU and 2.9 GFCUPS with 3 GPUs, where GFCUPS refers to the unique method by which the score matrix must be updated in the specialized alignment algorithm used in Amplicon Noise.
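The O(n^2) pairwise-distance step described in the abstract can be illustrated with a small CPU-side sketch. This is not the SeqDist or CUDA-SeqDist code: the unit mismatch and gap costs, the normalization by the longer sequence length, and the names `nw_distance`, `pairwise_distances`, and `cell_updates` are all assumptions made for illustration. It only shows why the work grows as (number of pairs) x (score-matrix cells per pair).

```python
from itertools import combinations


def nw_distance(a: str, b: str, mismatch: float = 1.0, gap: float = 1.0) -> float:
    """Global (Needleman-Wunsch style) alignment cost between a and b.

    Unit costs are placeholders, not the scoring used by Amplicon Noise.
    The result is normalized by the longer sequence length.
    """
    # prev holds the previous row of the score matrix; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0.0] * len(b)
        for j in range(1, len(b) + 1):
            # One "cell update": best of substitution, deletion, insertion.
            sub = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = min(sub, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)] / max(len(a), len(b), 1)


def pairwise_distances(seqs):
    """Symmetric distance matrix over all n*(n-1)/2 unordered pairs."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = nw_distance(seqs[i], seqs[j])
    return dist


def cell_updates(n: int, length: int) -> int:
    """Score-matrix cells filled for n sequences of equal length."""
    return n * (n - 1) // 2 * length * length
```

Under the assumption of 10^4 sequences of 300 bases each (the abstract says only "a few hundred"), `cell_updates(10**4, 300)` gives roughly 4.5 x 10^12 cell updates, which is the quantity a cell-updates-per-second throughput figure is measured against.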
doi_str_mv	10.1109/SAAHPC.2012.15
format	Conference Proceeding
fulltext	fulltext_linktorsrc
identifier	ISSN: 2166-5133
ispartof	2012 Symposium on Application Accelerators in High Performance Computing, 2012, p.94-101
issn	2166-5133 2166-515X
language	eng
recordid	cdi_ieee_primary_6319195
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Amplicon Noise ; CUDA ; GPU ; GPU Computing ; Graphics processing unit ; Heterogeneous Computing ; Instruction sets ; Kernel ; Memory management ; Metagenomics ; MPI ; Needleman-Wunsch ; Optimization ; Pyronoise ; Registers ; Sequence Alignment ; Short Reads ; Smith-Waterman ; Throughput
title	GPU Acceleration of Pyrosequencing Noise Removal
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T01%3A53%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=GPU%20Acceleration%20of%20Pyrosequencing%20Noise%20Removal&rft.btitle=2012%20Symposium%20on%20Application%20Accelerators%20in%20High%20Performance%20Computing&rft.au=Yang%20Gao&rft.date=2012-07&rft.spage=94&rft.epage=101&rft.pages=94-101&rft.issn=2166-5133&rft.eissn=2166-515X&rft.isbn=1467328820&rft.isbn_list=9781467328821&rft.coden=IEEPAD&rft_id=info:doi/10.1109/SAAHPC.2012.15&rft_dat=%3Cieee_6IE%3E6319195%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=0769548385&rft.eisbn_list=9780769548388&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6319195&rfr_iscdi=true