Fast search of third-order epistatic interactions on CPU and GPU clusters

Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a produ...

Bibliographic details
Published in: The international journal of high performance computing applications 2020-01, Vol.34 (1), p.20-29
Main authors: Ponte-Fernández, Christian, González-Domínguez, Jorge, Martín, María J
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 29
container_issue 1
container_start_page 20
container_title The international journal of high performance computing applications
container_volume 34
creator Ponte-Fernández, Christian
González-Domínguez, Jorge
Martín, María J
description Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in recent years. Relations between phenotypes and genotypes are not easy to identify, as most phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.
doi_str_mv 10.1177/1094342019852128
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2327884742</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_1094342019852128</sage_id><sourcerecordid>2327884742</sourcerecordid><originalsourceid>FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</originalsourceid><addsrcrecordid>eNp1UEtLAzEQDqJgrd49BjxHJ49NskcpthYKerDnJZuH3VI3NUkP_ntTKgiCp2-Y7zHDh9AthXtKlXqg0AouGNBWN4wyfYYmVAlKmBbyvM6VJkf-El3lvAUAKXgzQcu5yQVnb5Ld4Bhw2QzJkZicT9jvh1xMGSwexuKTsWWIY8ZxxLPXNTajw4uKdnfIlc3X6CKYXfY3PzhF6_nT2-yZrF4Wy9njilgObSG26VVo2fEBCM72QhmQtK6CsLbvnYSes9ZRsFRbGzx10mnnRCNVy7WTfIruTrn7FD8PPpduGw9prCc7xpnSWijBqgpOKptizsmHbp-GD5O-OgrdsbDub2HVQk6WbN79b-i_-m-Lr2mV</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2327884742</pqid></control><display><type>article</type><title>Fast search of third-order epistatic interactions on CPU and GPU clusters</title><source>SAGE Complete</source><source>Alma/SFX Local Collection</source><creator>Ponte-Fernández, Christian ; González-Domínguez, Jorge ; Martín, María J</creator><creatorcontrib>Ponte-Fernández, Christian ; González-Domínguez, Jorge ; Martín, María J</creatorcontrib><description>Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in the recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. 
However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.</description><identifier>ISSN: 1094-3420</identifier><identifier>EISSN: 1741-2846</identifier><identifier>DOI: 10.1177/1094342019852128</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Central processing units ; Clusters ; CPUs ; Genetic markers ; Genomes ; Graphics processing units ; Source code</subject><ispartof>The international journal of high performance computing applications, 2020-01, Vol.34 (1), p.20-29</ispartof><rights>The Author(s) 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</citedby><cites>FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</cites><orcidid>0000-0002-4728-6398 ; 
0000-0002-9153-0909</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://journals.sagepub.com/doi/pdf/10.1177/1094342019852128$$EPDF$$P50$$Gsage$$H</linktopdf><linktohtml>$$Uhttps://journals.sagepub.com/doi/10.1177/1094342019852128$$EHTML$$P50$$Gsage$$H</linktohtml><link.rule.ids>314,776,780,21798,27901,27902,43597,43598</link.rule.ids></links><search><creatorcontrib>Ponte-Fernández, Christian</creatorcontrib><creatorcontrib>González-Domínguez, Jorge</creatorcontrib><creatorcontrib>Martín, María J</creatorcontrib><title>Fast search of third-order epistatic interactions on CPU and GPU clusters</title><title>The international journal of high performance computing applications</title><description>Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in the recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. 
To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.</description><subject>Central processing units</subject><subject>Clusters</subject><subject>CPUs</subject><subject>Genetic markers</subject><subject>Genomes</subject><subject>Graphics processing units</subject><subject>Source code</subject><issn>1094-3420</issn><issn>1741-2846</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp1UEtLAzEQDqJgrd49BjxHJ49NskcpthYKerDnJZuH3VI3NUkP_ntTKgiCp2-Y7zHDh9AthXtKlXqg0AouGNBWN4wyfYYmVAlKmBbyvM6VJkf-El3lvAUAKXgzQcu5yQVnb5Ld4Bhw2QzJkZicT9jvh1xMGSwexuKTsWWIY8ZxxLPXNTajw4uKdnfIlc3X6CKYXfY3PzhF6_nT2-yZrF4Wy9njilgObSG26VVo2fEBCM72QhmQtK6CsLbvnYSes9ZRsFRbGzx10mnnRCNVy7WTfIruTrn7FD8PPpduGw9prCc7xpnSWijBqgpOKptizsmHbp-GD5O-OgrdsbDub2HVQk6WbN79b-i_-m-Lr2mV</recordid><startdate>202001</startdate><enddate>202001</enddate><creator>Ponte-Fernández, Christian</creator><creator>González-Domínguez, Jorge</creator><creator>Martín, María J</creator><general>SAGE Publications</general><general>SAGE PUBLICATIONS, INC</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-4728-6398</orcidid><orcidid>https://orcid.org/0000-0002-9153-0909</orcidid></search><sort><creationdate>202001</creationdate><title>Fast search of third-order epistatic interactions on CPU and GPU clusters</title><author>Ponte-Fernández, Christian ; González-Domínguez, Jorge ; Martín, María 
J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Central processing units</topic><topic>Clusters</topic><topic>CPUs</topic><topic>Genetic markers</topic><topic>Genomes</topic><topic>Graphics processing units</topic><topic>Source code</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ponte-Fernández, Christian</creatorcontrib><creatorcontrib>González-Domínguez, Jorge</creatorcontrib><creatorcontrib>Martín, María J</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>The international journal of high performance computing applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ponte-Fernández, Christian</au><au>González-Domínguez, Jorge</au><au>Martín, María J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Fast search of third-order epistatic interactions on CPU and GPU clusters</atitle><jtitle>The international journal of high performance computing applications</jtitle><date>2020-01</date><risdate>2020</risdate><volume>34</volume><issue>1</issue><spage>20</spage><epage>29</epage><pages>20-29</pages><issn>1094-3420</issn><eissn>1741-2846</eissn><abstract>Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a 
disease) and genetic markers, have been growing in popularity in the recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/1094342019852128</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-4728-6398</orcidid><orcidid>https://orcid.org/0000-0002-9153-0909</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1094-3420
ispartof The international journal of high performance computing applications, 2020-01, Vol.34 (1), p.20-29
issn 1094-3420
1741-2846
language eng
recordid cdi_proquest_journals_2327884742
source SAGE Complete; Alma/SFX Local Collection
subjects Central processing units
Clusters
CPUs
Genetic markers
Genomes
Graphics processing units
Source code
title Fast search of third-order epistatic interactions on CPU and GPU clusters
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T08%3A05%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fast%20search%20of%20third-order%20epistatic%20interactions%20on%20CPU%20and%20GPU%20clusters&rft.jtitle=The%20international%20journal%20of%20high%20performance%20computing%20applications&rft.au=Ponte-Fern%C3%A1ndez,%20Christian&rft.date=2020-01&rft.volume=34&rft.issue=1&rft.spage=20&rft.epage=29&rft.pages=20-29&rft.issn=1094-3420&rft.eissn=1741-2846&rft_id=info:doi/10.1177/1094342019852128&rft_dat=%3Cproquest_cross%3E2327884742%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2327884742&rft_id=info:pmid/&rft_sage_id=10.1177_1094342019852128&rfr_iscdi=true