High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies

Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analys...

Bibliographic Details
Published in: Health information science and systems 2015-02, Vol.3 (Suppl 1), p.S3-S3, Article S3
Main Authors: Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_end_page S3
container_issue Suppl 1
container_start_page S3
container_title Health information science and systems
container_volume 3
creator Goudey, Benjamin
Abedini, Mani
Hopper, John L
Inouye, Michael
Makalic, Enes
Schmidt, Daniel F
Wagner, John
Zhou, Zeyu
Zobel, Justin
Reumann, Matthias
description Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
doi_str_mv 10.1186/2047-2501-3-S1-S3
format Article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4383059</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A541652374</galeid><sourcerecordid>A541652374</sourcerecordid><originalsourceid>FETCH-LOGICAL-b500t-6d3e38775b75c51ed5f3e5c218d92ae64f7606340cbeb1254339d4b75cbb34083</originalsourceid><addsrcrecordid>eNp9klFr3SAUx8PYaEvXD9CXIexlL-k0xpj7Mrh0XTso7CEbexRjThKL0UyTsvtB-n1nmu7SC90U9HDO7_w96kmSc4IvCCmLjxnOeZoxTFKaViSt6KvkZO97_cw-Ts5CuMNxbEhGGTlKjjNWcsxZeZI83OiuRyP41vlBWgVIuWGcJ207BFbW5tH43cs5TPoekLTS7IIOyLWoj6ngkfNNXEMEDSA7KwNu0g2g0Znd4PzY6zAgbSfwUk3a2Wija7BuAPRz4bYhOKXlY6ia5kZDeJu8aaUJcPa0nyY_vlx9v7xJb79df73c3qY1w3hKi4YCLTlnNWeKEWhYS4GpjJTNJpNQ5C0vcEFzrGqoScZySjdNvsB1Hb0lPU0-rbrjXA_QKLCTl0aMXg_S74STWhxGrO5F5-5FTkuK2SYKfF4Fau3-IXAYia8rlp8Ry88IKioiKhplPjzV4d2vGcIkBh0UGCMtuDkIUnBKOeN4OfH9inbSgNC2dVFXLbjYspwULKM8j9TFC1ScDQxaOQutjv6DBLImKO9C8NDu70CwWPrtxarfPX-9fcbf7opAtgIhhmwHXty52ccOCv9R_QOfq-SV</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1673375709</pqid></control><display><type>article</type><title>High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies</title><source>SpringerNature Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Goudey, Benjamin ; Abedini, Mani ; Hopper, John L ; Inouye, Michael ; Makalic, Enes ; Schmidt, Daniel F ; Wagner, John ; Zhou, Zeyu ; Zobel, Justin ; Reumann, Matthias</creator><creatorcontrib>Goudey, Benjamin ; Abedini, Mani ; Hopper, John L ; Inouye, Michael ; Makalic, Enes ; Schmidt, Daniel F ; Wagner, John ; Zhou, Zeyu ; Zobel, Justin ; Reumann, Matthias</creatorcontrib><description>Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide 
polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. 
Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.</description><identifier>ISSN: 2047-2501</identifier><identifier>EISSN: 2047-2501</identifier><identifier>DOI: 10.1186/2047-2501-3-S1-S3</identifier><identifier>PMID: 25870758</identifier><language>eng</language><publisher>London: BioMed Central</publisher><subject>Analysis ; Bioinformatics ; Chromosomes ; Computational Biology/Bioinformatics ; Computer industry ; Computer Science ; Genetic aspects ; Genomes ; Genomics ; Health Informatics ; Information Systems and Communication Service ; Innovations ; Methods ; Single nucleotide polymorphisms ; Supercomputers</subject><ispartof>Health information science and systems, 2015-02, Vol.3 (Suppl 1), p.S3-S3, Article S3</ispartof><rights>Goudey et al.; licensee BioMed Central Ltd. 2015</rights><rights>COPYRIGHT 2015 BioMed Central Ltd.</rights><rights>Copyright © 2015 Goudey et al.; licensee BioMed Central Ltd. 
2015 Goudey et al.; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-b500t-6d3e38775b75c51ed5f3e5c218d92ae64f7606340cbeb1254339d4b75cbb34083</citedby><cites>FETCH-LOGICAL-b500t-6d3e38775b75c51ed5f3e5c218d92ae64f7606340cbeb1254339d4b75cbb34083</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383059/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383059/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,41488,42557,51319,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/25870758$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Goudey, Benjamin</creatorcontrib><creatorcontrib>Abedini, Mani</creatorcontrib><creatorcontrib>Hopper, John L</creatorcontrib><creatorcontrib>Inouye, Michael</creatorcontrib><creatorcontrib>Makalic, Enes</creatorcontrib><creatorcontrib>Schmidt, Daniel F</creatorcontrib><creatorcontrib>Wagner, John</creatorcontrib><creatorcontrib>Zhou, Zeyu</creatorcontrib><creatorcontrib>Zobel, Justin</creatorcontrib><creatorcontrib>Reumann, Matthias</creatorcontrib><title>High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies</title><title>Health information science and systems</title><addtitle>Health Inf Sci Syst</addtitle><addtitle>Health Inf Sci Syst</addtitle><description>Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. 
Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. 
Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.</description><subject>Analysis</subject><subject>Bioinformatics</subject><subject>Chromosomes</subject><subject>Computational Biology/Bioinformatics</subject><subject>Computer industry</subject><subject>Computer Science</subject><subject>Genetic aspects</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Health Informatics</subject><subject>Information Systems and Communication Service</subject><subject>Innovations</subject><subject>Methods</subject><subject>Single nucleotide polymorphisms</subject><subject>Supercomputers</subject><issn>2047-2501</issn><issn>2047-2501</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><recordid>eNp9klFr3SAUx8PYaEvXD9CXIexlL-k0xpj7Mrh0XTso7CEbexRjThKL0UyTsvtB-n1nmu7SC90U9HDO7_w96kmSc4IvCCmLjxnOeZoxTFKaViSt6KvkZO97_cw-Ts5CuMNxbEhGGTlKjjNWcsxZeZI83OiuRyP41vlBWgVIuWGcJ207BFbW5tH43cs5TPoekLTS7IIOyLWoj6ngkfNNXEMEDSA7KwNu0g2g0Znd4PzY6zAgbSfwUk3a2Wija7BuAPRz4bYhOKXlY6ia5kZDeJu8aaUJcPa0nyY_vlx9v7xJb79df73c3qY1w3hKi4YCLTlnNWeKEWhYS4GpjJTNJpNQ5C0vcEFzrGqoScZySjdNvsB1Hb0lPU0-rbrjXA_QKLCTl0aMXg_S74STWhxGrO5F5-5FTkuK2SYKfF4Fau3-IXAYia8rlp8Ry88IKioiKhplPjzV4d2vGcIkBh0UGCMtuDkIUnBKOeN4OfH9inbSgNC2dVFXLbjYspwULKM8j9TFC1ScDQxaOQutjv6DBLImKO9C8NDu70CwWPrtxarfPX-9fcbf7opAtgIhhmwHXty52ccOCv9R_QOfq-SV</recordid><startdate>20150224</startdate><enddate>20150224</enddate><creator>Goudey, Benjamin</creator><creator>Abedini, Mani</creator><creator>Hopper, John L</creator><creator>Inouye, Michael</creator><creator>Makalic, Enes</creator><creator>Schmidt, Daniel F</creator><creator>Wagner, John</creator><creator>Zhou, Zeyu</creator><creator>Zobel, Justin</creator><creator>Reumann, Matthias</creator><general>BioMed 
Central</general><general>BioMed Central Ltd</general><scope>C6C</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20150224</creationdate><title>High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies</title><author>Goudey, Benjamin ; Abedini, Mani ; Hopper, John L ; Inouye, Michael ; Makalic, Enes ; Schmidt, Daniel F ; Wagner, John ; Zhou, Zeyu ; Zobel, Justin ; Reumann, Matthias</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-b500t-6d3e38775b75c51ed5f3e5c218d92ae64f7606340cbeb1254339d4b75cbb34083</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Analysis</topic><topic>Bioinformatics</topic><topic>Chromosomes</topic><topic>Computational Biology/Bioinformatics</topic><topic>Computer industry</topic><topic>Computer Science</topic><topic>Genetic aspects</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Health Informatics</topic><topic>Information Systems and Communication Service</topic><topic>Innovations</topic><topic>Methods</topic><topic>Single nucleotide polymorphisms</topic><topic>Supercomputers</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Goudey, Benjamin</creatorcontrib><creatorcontrib>Abedini, Mani</creatorcontrib><creatorcontrib>Hopper, John L</creatorcontrib><creatorcontrib>Inouye, Michael</creatorcontrib><creatorcontrib>Makalic, Enes</creatorcontrib><creatorcontrib>Schmidt, Daniel F</creatorcontrib><creatorcontrib>Wagner, John</creatorcontrib><creatorcontrib>Zhou, Zeyu</creatorcontrib><creatorcontrib>Zobel, Justin</creatorcontrib><creatorcontrib>Reumann, Matthias</creatorcontrib><collection>Springer Nature OA Free 
Journals</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Health information science and systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Goudey, Benjamin</au><au>Abedini, Mani</au><au>Hopper, John L</au><au>Inouye, Michael</au><au>Makalic, Enes</au><au>Schmidt, Daniel F</au><au>Wagner, John</au><au>Zhou, Zeyu</au><au>Zobel, Justin</au><au>Reumann, Matthias</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies</atitle><jtitle>Health information science and systems</jtitle><stitle>Health Inf Sci Syst</stitle><addtitle>Health Inf Sci Syst</addtitle><date>2015-02-24</date><risdate>2015</risdate><volume>3</volume><issue>Suppl 1</issue><spage>S3</spage><epage>S3</epage><pages>S3-S3</pages><artnum>S3</artnum><issn>2047-2501</issn><eissn>2047-2501</eissn><abstract>Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. 
This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.</abstract><cop>London</cop><pub>BioMed Central</pub><pmid>25870758</pmid><doi>10.1186/2047-2501-3-S1-S3</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2047-2501
ispartof Health information science and systems, 2015-02, Vol.3 (Suppl 1), p.S3-S3, Article S3
issn 2047-2501
2047-2501
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4383059
source SpringerNature Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Analysis
Bioinformatics
Chromosomes
Computational Biology/Bioinformatics
Computer industry
Computer Science
Genetic aspects
Genomes
Genomics
Health Informatics
Information Systems and Communication Service
Innovations
Methods
Single nucleotide polymorphisms
Supercomputers
title High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T07%3A22%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High%20performance%20computing%20enabling%20exhaustive%20analysis%20of%20higher%20order%20single%20nucleotide%20polymorphism%20interaction%20in%20Genome%20Wide%20Association%20Studies&rft.jtitle=Health%20information%20science%20and%20systems&rft.au=Goudey,%20Benjamin&rft.date=2015-02-24&rft.volume=3&rft.issue=Suppl%201&rft.spage=S3&rft.epage=S3&rft.pages=S3-S3&rft.artnum=S3&rft.issn=2047-2501&rft.eissn=2047-2501&rft_id=info:doi/10.1186/2047-2501-3-S1-S3&rft_dat=%3Cgale_pubme%3EA541652374%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1673375709&rft_id=info:pmid/25870758&rft_galeid=A541652374&rfr_iscdi=true
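
The runtime figures quoted in the description field can be sanity-checked with a short calculation. The sketch below is illustrative only and is not part of the catalogued record or of the authors' framework: it counts the SNP triples in an exhaustive three-way scan of 1.1 million SNPs and rescales the 5.8-year "Avoca" estimate to "Sequoia" under the linear-scaling assumption stated in the abstract. The core counts (65,536 for Avoca, 1,572,864 for Sequoia) are assumptions taken from publicly reported system sizes, not from this record, and the function names are hypothetical.

```python
from math import comb


def count_interactions(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of the given order (unordered, no repeats)."""
    return comb(n_snps, order)


def scaled_runtime_years(base_years: float, base_cores: int, target_cores: int) -> float:
    """Rescale a runtime estimate assuming perfectly linear scaling with core count."""
    return base_years * base_cores / target_cores


# Figures taken from the abstract.
N_SNPS = 1_100_000   # "1.1 million SNP GWAS data"
AVOCA_YEARS = 5.8    # "over 5.8 years" on the full Avoca installation

# Assumptions, not stated in this record: publicly reported core counts.
AVOCA_CORES = 65_536
SEQUOIA_CORES = 1_572_864

triples = count_interactions(N_SNPS, 3)
sequoia_months = 12 * scaled_runtime_years(AVOCA_YEARS, AVOCA_CORES, SEQUOIA_CORES)

print(f"three-way combinations to test: {triples:.3e}")
print(f"estimated Sequoia runtime: {sequoia_months:.1f} months")
```

Under these assumptions the core ratio is 24, which puts the Sequoia estimate at roughly 2.9 months, in line with the abstract's "under 3 months". The count of triples (on the order of 10^17) shows why the abstract describes multivariate analysis as limited by computational complexity.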