Fast search of third-order epistatic interactions on CPU and GPU clusters
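Published in: | The international journal of high performance computing applications 2020-01, Vol.34 (1), p.20-29 |
---|---|
Main authors: | Ponte-Fernández, Christian; González-Domínguez, Jorge; Martín, María J |
Format: | Article |
Language: | eng |
Subjects: | Central processing units; Clusters; CPUs; Genetic markers; Genomes; Graphics processing units; Source code |
Online access: | Full text |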
container_end_page | 29 |
---|---|
container_issue | 1 |
container_start_page | 20 |
container_title | The international journal of high performance computing applications |
container_volume | 34 |
creator | Ponte-Fernández, Christian; González-Domínguez, Jorge; Martín, María J |
description | Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in recent years. Relations between phenotypes and genotypes are not easy to identify, as most phenotypes are the product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp. |
doi_str_mv | 10.1177/1094342019852128 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2327884742</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_1094342019852128</sage_id><sourcerecordid>2327884742</sourcerecordid><originalsourceid>FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</originalsourceid><addsrcrecordid>eNp1UEtLAzEQDqJgrd49BjxHJ49NskcpthYKerDnJZuH3VI3NUkP_ntTKgiCp2-Y7zHDh9AthXtKlXqg0AouGNBWN4wyfYYmVAlKmBbyvM6VJkf-El3lvAUAKXgzQcu5yQVnb5Ld4Bhw2QzJkZicT9jvh1xMGSwexuKTsWWIY8ZxxLPXNTajw4uKdnfIlc3X6CKYXfY3PzhF6_nT2-yZrF4Wy9njilgObSG26VVo2fEBCM72QhmQtK6CsLbvnYSes9ZRsFRbGzx10mnnRCNVy7WTfIruTrn7FD8PPpduGw9prCc7xpnSWijBqgpOKptizsmHbp-GD5O-OgrdsbDub2HVQk6WbN79b-i_-m-Lr2mV</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2327884742</pqid></control><display><type>article</type><title>Fast search of third-order epistatic interactions on CPU and GPU clusters</title><source>SAGE Complete</source><source>Alma/SFX Local Collection</source><creator>Ponte-Fernández, Christian ; González-Domínguez, Jorge ; Martín, María J</creator><creatorcontrib>Ponte-Fernández, Christian ; González-Domínguez, Jorge ; Martín, María J</creatorcontrib><description>Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in the recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. 
However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.</description><identifier>ISSN: 1094-3420</identifier><identifier>EISSN: 1741-2846</identifier><identifier>DOI: 10.1177/1094342019852128</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Central processing units ; Clusters ; CPUs ; Genetic markers ; Genomes ; Graphics processing units ; Source code</subject><ispartof>The international journal of high performance computing applications, 2020-01, Vol.34 (1), p.20-29</ispartof><rights>The Author(s) 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</citedby><cites>FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</cites><orcidid>0000-0002-4728-6398 ; 
0000-0002-9153-0909</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://journals.sagepub.com/doi/pdf/10.1177/1094342019852128$$EPDF$$P50$$Gsage$$H</linktopdf><linktohtml>$$Uhttps://journals.sagepub.com/doi/10.1177/1094342019852128$$EHTML$$P50$$Gsage$$H</linktohtml><link.rule.ids>314,776,780,21798,27901,27902,43597,43598</link.rule.ids></links><search><creatorcontrib>Ponte-Fernández, Christian</creatorcontrib><creatorcontrib>González-Domínguez, Jorge</creatorcontrib><creatorcontrib>Martín, María J</creatorcontrib><title>Fast search of third-order epistatic interactions on CPU and GPU clusters</title><title>The international journal of high performance computing applications</title><description>Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in the recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. 
To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.</description><subject>Central processing units</subject><subject>Clusters</subject><subject>CPUs</subject><subject>Genetic markers</subject><subject>Genomes</subject><subject>Graphics processing units</subject><subject>Source code</subject><issn>1094-3420</issn><issn>1741-2846</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp1UEtLAzEQDqJgrd49BjxHJ49NskcpthYKerDnJZuH3VI3NUkP_ntTKgiCp2-Y7zHDh9AthXtKlXqg0AouGNBWN4wyfYYmVAlKmBbyvM6VJkf-El3lvAUAKXgzQcu5yQVnb5Ld4Bhw2QzJkZicT9jvh1xMGSwexuKTsWWIY8ZxxLPXNTajw4uKdnfIlc3X6CKYXfY3PzhF6_nT2-yZrF4Wy9njilgObSG26VVo2fEBCM72QhmQtK6CsLbvnYSes9ZRsFRbGzx10mnnRCNVy7WTfIruTrn7FD8PPpduGw9prCc7xpnSWijBqgpOKptizsmHbp-GD5O-OgrdsbDub2HVQk6WbN79b-i_-m-Lr2mV</recordid><startdate>202001</startdate><enddate>202001</enddate><creator>Ponte-Fernández, Christian</creator><creator>González-Domínguez, Jorge</creator><creator>Martín, María J</creator><general>SAGE Publications</general><general>SAGE PUBLICATIONS, INC</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-4728-6398</orcidid><orcidid>https://orcid.org/0000-0002-9153-0909</orcidid></search><sort><creationdate>202001</creationdate><title>Fast search of third-order epistatic interactions on CPU and GPU clusters</title><author>Ponte-Fernández, Christian ; González-Domínguez, Jorge ; Martín, María 
J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c309t-c5b7f9200060fdcb47a0617f9f4ccbbd60b329d10c18ccfe1d6d8dd4567938d63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Central processing units</topic><topic>Clusters</topic><topic>CPUs</topic><topic>Genetic markers</topic><topic>Genomes</topic><topic>Graphics processing units</topic><topic>Source code</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ponte-Fernández, Christian</creatorcontrib><creatorcontrib>González-Domínguez, Jorge</creatorcontrib><creatorcontrib>Martín, María J</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>The international journal of high performance computing applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ponte-Fernández, Christian</au><au>González-Domínguez, Jorge</au><au>Martín, María J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Fast search of third-order epistatic interactions on CPU and GPU clusters</atitle><jtitle>The international journal of high performance computing applications</jtitle><date>2020-01</date><risdate>2020</risdate><volume>34</volume><issue>1</issue><spage>20</spage><epage>29</epage><pages>20-29</pages><issn>1094-3420</issn><eissn>1741-2846</eissn><abstract>Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a 
disease) and genetic markers, have been growing in popularity in the recent years. Relations between phenotypes and genotypes are not easy to identify, as most of the phenotypes are a product of the interaction between multiple genes, a phenomenon known as epistasis. Many authors have resorted to different approaches and hardware architectures in order to mitigate the exponential time complexity of the problem. However, these studies make some compromises in order to keep a reasonable execution time, such as limiting the number of genetic markers involved in the interaction, or discarding some of these markers in an initial filtering stage. This work presents MPI3SNP, a tool that implements a three-way exhaustive search for cluster architectures with the aim of mitigating the exponential growth of the run-time. Modern cluster solutions usually incorporate GPUs. Thus, MPI3SNP includes implementations for both multi-CPU and multi-GPU clusters. To contextualize the performance achieved, MPI3SNP is able to analyze an input of 6300 genetic markers and 3200 samples in less than 6 min using 768 CPU cores or 4 min using 8 NVIDIA K80 GPUs. The source code is available at https://github.com/chponte/mpi3snp.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/1094342019852128</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-4728-6398</orcidid><orcidid>https://orcid.org/0000-0002-9153-0909</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1094-3420 |
ispartof | The international journal of high performance computing applications, 2020-01, Vol.34 (1), p.20-29 |
issn | 1094-3420; 1741-2846 |
language | eng |
recordid | cdi_proquest_journals_2327884742 |
source | SAGE Complete; Alma/SFX Local Collection |
subjects | Central processing units; Clusters; CPUs; Genetic markers; Genomes; Graphics processing units; Source code |
title | Fast search of third-order epistatic interactions on CPU and GPU clusters |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T08%3A05%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fast%20search%20of%20third-order%20epistatic%20interactions%20on%20CPU%20and%20GPU%20clusters&rft.jtitle=The%20international%20journal%20of%20high%20performance%20computing%20applications&rft.au=Ponte-Fern%C3%A1ndez,%20Christian&rft.date=2020-01&rft.volume=34&rft.issue=1&rft.spage=20&rft.epage=29&rft.pages=20-29&rft.issn=1094-3420&rft.eissn=1741-2846&rft_id=info:doi/10.1177/1094342019852128&rft_dat=%3Cproquest_cross%3E2327884742%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2327884742&rft_id=info:pmid/&rft_sage_id=10.1177_1094342019852128&rfr_iscdi=true |
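To give a sense of the search space the description refers to, the following minimal sketch counts the marker triples a three-way exhaustive search has to visit and splits them among workers. For the 6300 markers quoted in the abstract this comes to 41,654,657,100 combinations. The function names and the round-robin split are illustrative assumptions only; the record does not say how MPI3SNP actually distributes its work, and no association test is computed here.

```python
from itertools import combinations
from math import comb  # available in Python 3.8+


def triple_count(num_markers: int) -> int:
    """Number of unordered marker triples, i.e. C(num_markers, 3)."""
    return comb(num_markers, 3)


def triples_for_rank(num_markers: int, rank: int, num_ranks: int):
    """Yield the triples assigned to one worker.

    Hypothetical round-robin split: triple number i (in lexicographic
    order) goes to worker i % num_ranks. This is an assumption made for
    illustration, not the scheme used by MPI3SNP.
    """
    all_triples = combinations(range(num_markers), 3)
    for index, triple in enumerate(all_triples):
        if index % num_ranks == rank:
            yield triple


if __name__ == "__main__":
    # Size of the exhaustive search for the input quoted in the abstract.
    print(triple_count(6300))
    # Tiny example: 5 markers shared between 2 workers.
    for rank in range(2):
        print(rank, list(triples_for_rank(5, rank, 2)))
```

Every worker here enumerates the full sequence and keeps only its share, which is acceptable for a toy input but wasteful at scale; a real tool would more likely compute each worker's starting triple directly.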