A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs

The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used...

Bibliographic Details
Published in: IEEE transactions on parallel and distributed systems 2016-01, Vol.27 (1), p.17-30
Main Authors: Wang, Yu; Qian, Weikang; Zhang, Shuchang; Liang, Xiaoyao; Yuan, Bo
Format: Article
Language: eng
container_end_page 30
container_issue 1
container_start_page 17
container_title IEEE transactions on parallel and distributed systems
container_volume 27
creator Wang, Yu
Qian, Weikang
Zhang, Shuchang
Liang, Xiaoyao
Yuan, Bo
description The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used in diagnosis of diseases at an early stage. Most networks can be modeled as Bayesian networks (BNs). However, because of its combinatorial nature, computational learning of dependent relationships underlying complex networks is NP-complete. To reduce the complexity, researchers have proposed to use Markov chain Monte Carlo (MCMC) methods to sample the solution space. MCMC methods guarantee convergence and traversability. However, MCMC is not scalable for networks with more than 40 nodes because of the computational complexity. In this work, we optimize an MCMC-based learning algorithm and implement it on a general-purpose graphics processing unit (GPGPU). We achieve a 2.46× speedup by optimizing the algorithm and an additional 58-fold acceleration by implementing it on a GPU. In total, we speed up the algorithm by 143×. As a result, we can apply this system to networks with up to 125 nodes, a size that is of interest to many biologists. Furthermore, we add artificial interventions to the scores in order to incorporate prior knowledge of interactions into the Bayesian inference, which increases the accuracy of the results. Our system provides biologists with a more computationally efficient tool at a lower cost than previous work.
doi_str_mv 10.1109/TPDS.2014.2387285
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1045-9219
ispartof IEEE transactions on parallel and distributed systems, 2016-01, Vol.27 (1), p.17-30
issn 1045-9219
1558-2183
language eng
recordid cdi_crossref_primary_10_1109_TPDS_2014_2387285
source IEEE Electronic Library (IEL)
subjects Bayes methods
Bayesian Networks
Biological system modeling
Clustering algorithms
GPU
Graphics processing units
Indexes
Markov processes
MCMC
Parallel Computing
Priors
title A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T22%3A14%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Learning%20Algorithm%20for%20Bayesian%20Networks%20and%20Its%20Efficient%20Implementation%20on%20GPUs&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Wang,%20Yu&rft.date=2016-01-01&rft.volume=27&rft.issue=1&rft.spage=17&rft.epage=30&rft.pages=17-30&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2014.2387285&rft_dat=%3Ccrossref_RIE%3E10_1109_TPDS_2014_2387285%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=7001096&rfr_iscdi=true