A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs

The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used...

Detailed description

Bibliographic details
Published in:	IEEE transactions on parallel and distributed systems 2016-01, Vol.27 (1), p.17-30
Main authors:	Wang, Yu; Qian, Weikang; Zhang, Shuchang; Liang, Xiaoyao; Yuan, Bo
Format:	Article
Language:	eng
Subjects:	Bayes methods; Bayesian Networks; Biological system modeling; Clustering algorithms; GPU; Graphics processing units; Indexes; Markov processes; MCMC; Parallel Computing; Priors
Online access:	Order full text

container_end_page	30
container_issue	1
container_start_page	17
container_title	IEEE transactions on parallel and distributed systems
container_volume	27
creator	Wang, Yu; Qian, Weikang; Zhang, Shuchang; Liang, Xiaoyao; Yuan, Bo
description	The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used to diagnose diseases at an early stage. Most networks can be modeled as Bayesian networks (BNs). However, because of the problem's combinatorial nature, computational learning of the dependency relationships underlying complex networks is NP-complete. To reduce the complexity, researchers have proposed using Markov chain Monte Carlo (MCMC) methods to sample the solution space. MCMC methods guarantee convergence and traversability. However, MCMC does not scale to networks with more than 40 nodes because of its computational complexity. In this work, we optimize an MCMC-based learning algorithm and implement it on a general-purpose graphics processing unit (GPGPU). We achieve a 2.46× speedup by optimizing the algorithm and an additional 58-fold acceleration by implementing it on a GPU. In total, we speed up the algorithm by 143×. As a result, we can apply this system to networks with up to 125 nodes, a size that is of interest to many biologists. Furthermore, we add artificial interventions to the scores in order to incorporate prior knowledge of interactions into the Bayesian inference, which increases the accuracy of the results. Our system provides biologists with a more computationally efficient tool at a lower cost than previous work.
doi_str_mv	10.1109/TPDS.2014.2387285
format	Article
fullrecord	<record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPDS_2014_2387285</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7001096</ieee_id><sourcerecordid>10_1109_TPDS_2014_2387285</sourcerecordid><originalsourceid>FETCH-LOGICAL-c265t-f1b9ab657ac9fa044e13e2649b893af10447369e067ff118fdedb2af239752b13</originalsourceid><addsrcrecordid>eNo9kF9LwzAUxYMoOKcfQHzJF-jMzZ-2eZxzzsHQwbbnkrY3M7q2IynIvr0pG8KBex7OuRx-hDwCmwAw_bxdv24mnIGccJFnPFdXZARK5QmHXFxHz6RKNAd9S-5C-GYxqZgckc2UrtD41rV7Oj3sO-_6r4baztMXc8LgTEs_sP_t_E-gpq3psg90bq2rHLY9XTbHAzbRmd51LY1arHfhntxYcwj4cLljsnubb2fvyepzsZxNV0nFU9UnFkptylRlptLWMCkRBPJU6jLXwti4WGYi1cjSzFqA3NZYl9xYLnSmeAliTOD8t_JdCB5tcfSuMf5UACsGKsVApRioFBcqsfN07jhE_M9nkQfTqfgD2l5eLw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs</title><source>IEEE Electronic Library (IEL)</source><creator>Wang, Yu ; Qian, Weikang ; Zhang, Shuchang ; Liang, Xiaoyao ; Yuan, Bo</creator><creatorcontrib>Wang, Yu ; Qian, Weikang ; Zhang, Shuchang ; Liang, Xiaoyao ; Yuan, Bo</creatorcontrib><description>The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used in diagnosis of diseases at an early stage. Most networks can be modeled as Bayesian networks (BNs). However, because of its combinatorial nature, computational learning of dependent relationships underlying complex networks is NP-complete. To reduce the complexity, researchers have proposed to use Markov chain Monte Carlo (MCMC) methods to sample the solution space. 
MCMC methods guarantee convergence and traversability. However, MCMC is not scalable for networks with more than 40 nodes because of the computational complexity. In this work, we optimize an MCMC-based learning algorithm and implement it on a general-purpose graphics processing unit (GPGPU). We achieve a 2.46× speedup by optimizing the algorithm and an additional 58-fold acceleration by implementing it on a GPU. In total, we speed up the algorithm by 143×. As a result, we can apply this system to networks with up to 125 nodes, a size that is of interest to many biologists. Furthermore, we add artificial interventions to the scores in order to incorporate prior knowledge of interactions into the Bayesian inference, which increases the accuracy of the results. Our system provides biologists with a more computational efficient tool at a lower cost than previous works.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2014.2387285</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>IEEE</publisher><subject>Bayes methods ; Bayesian Networks ; Biological system modeling ; Clustering algorithms ; GPU ; Graphics processing units ; Indexes ; Markov processes ; MCMC ; Parallel Computing ; Priors</subject><ispartof>IEEE transactions on parallel and distributed systems, 2016-01, Vol.27 (1), 
p.17-30</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c265t-f1b9ab657ac9fa044e13e2649b893af10447369e067ff118fdedb2af239752b13</citedby><cites>FETCH-LOGICAL-c265t-f1b9ab657ac9fa044e13e2649b893af10447369e067ff118fdedb2af239752b13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7001096$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7001096$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Wang, Yu</creatorcontrib><creatorcontrib>Qian, Weikang</creatorcontrib><creatorcontrib>Zhang, Shuchang</creatorcontrib><creatorcontrib>Liang, Xiaoyao</creatorcontrib><creatorcontrib>Yuan, Bo</creatorcontrib><title>A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used in diagnosis of diseases at an early stage. Most networks can be modeled as Bayesian networks (BNs). However, because of its combinatorial nature, computational learning of dependent relationships underlying complex networks is NP-complete. To reduce the complexity, researchers have proposed to use Markov chain Monte Carlo (MCMC) methods to sample the solution space. MCMC methods guarantee convergence and traversability. 
However, MCMC is not scalable for networks with more than 40 nodes because of the computational complexity. In this work, we optimize an MCMC-based learning algorithm and implement it on a general-purpose graphics processing unit (GPGPU). We achieve a 2.46× speedup by optimizing the algorithm and an additional 58-fold acceleration by implementing it on a GPU. In total, we speed up the algorithm by 143×. As a result, we can apply this system to networks with up to 125 nodes, a size that is of interest to many biologists. Furthermore, we add artificial interventions to the scores in order to incorporate prior knowledge of interactions into the Bayesian inference, which increases the accuracy of the results. Our system provides biologists with a more computational efficient tool at a lower cost than previous works.</description><subject>Bayes methods</subject><subject>Bayesian Networks</subject><subject>Biological system modeling</subject><subject>Clustering algorithms</subject><subject>GPU</subject><subject>Graphics processing units</subject><subject>Indexes</subject><subject>Markov processes</subject><subject>MCMC</subject><subject>Parallel Computing</subject><subject>Priors</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kF9LwzAUxYMoOKcfQHzJF-jMzZ-2eZxzzsHQwbbnkrY3M7q2IynIvr0pG8KBex7OuRx-hDwCmwAw_bxdv24mnIGccJFnPFdXZARK5QmHXFxHz6RKNAd9S-5C-GYxqZgckc2UrtD41rV7Oj3sO-_6r4baztMXc8LgTEs_sP_t_E-gpq3psg90bq2rHLY9XTbHAzbRmd51LY1arHfhntxYcwj4cLljsnubb2fvyepzsZxNV0nFU9UnFkptylRlptLWMCkRBPJU6jLXwti4WGYi1cjSzFqA3NZYl9xYLnSmeAliTOD8t_JdCB5tcfSuMf5UACsGKsVApRioFBcqsfN07jhE_M9nkQfTqfgD2l5eLw</recordid><startdate>20160101</startdate><enddate>20160101</enddate><creator>Wang, Yu</creator><creator>Qian, Weikang</creator><creator>Zhang, Shuchang</creator><creator>Liang, Xiaoyao</creator><creator>Yuan, 
Bo</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20160101</creationdate><title>A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs</title><author>Wang, Yu ; Qian, Weikang ; Zhang, Shuchang ; Liang, Xiaoyao ; Yuan, Bo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c265t-f1b9ab657ac9fa044e13e2649b893af10447369e067ff118fdedb2af239752b13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Bayes methods</topic><topic>Bayesian Networks</topic><topic>Biological system modeling</topic><topic>Clustering algorithms</topic><topic>GPU</topic><topic>Graphics processing units</topic><topic>Indexes</topic><topic>Markov processes</topic><topic>MCMC</topic><topic>Parallel Computing</topic><topic>Priors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Yu</creatorcontrib><creatorcontrib>Qian, Weikang</creatorcontrib><creatorcontrib>Zhang, Shuchang</creatorcontrib><creatorcontrib>Liang, Xiaoyao</creatorcontrib><creatorcontrib>Yuan, Bo</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Yu</au><au>Qian, Weikang</au><au>Zhang, Shuchang</au><au>Liang, Xiaoyao</au><au>Yuan, Bo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs</atitle><jtitle>IEEE transactions 
on parallel and distributed systems</jtitle><stitle>TPDS</stitle><date>2016-01-01</date><risdate>2016</risdate><volume>27</volume><issue>1</issue><spage>17</spage><epage>30</epage><pages>17-30</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>The wide application of omics research has produced a burst of biological data in recent years, which has in turn increased the need to infer biological networks from data. Learning biological networks from experimental data can help detect and analyze aberrant signaling pathways, which can be used in diagnosis of diseases at an early stage. Most networks can be modeled as Bayesian networks (BNs). However, because of its combinatorial nature, computational learning of dependent relationships underlying complex networks is NP-complete. To reduce the complexity, researchers have proposed to use Markov chain Monte Carlo (MCMC) methods to sample the solution space. MCMC methods guarantee convergence and traversability. However, MCMC is not scalable for networks with more than 40 nodes because of the computational complexity. In this work, we optimize an MCMC-based learning algorithm and implement it on a general-purpose graphics processing unit (GPGPU). We achieve a 2.46× speedup by optimizing the algorithm and an additional 58-fold acceleration by implementing it on a GPU. In total, we speed up the algorithm by 143×. As a result, we can apply this system to networks with up to 125 nodes, a size that is of interest to many biologists. Furthermore, we add artificial interventions to the scores in order to incorporate prior knowledge of interactions into the Bayesian inference, which increases the accuracy of the results. Our system provides biologists with a more computational efficient tool at a lower cost than previous works.</abstract><pub>IEEE</pub><doi>10.1109/TPDS.2014.2387285</doi><tpages>14</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1045-9219
ispartof	IEEE transactions on parallel and distributed systems, 2016-01, Vol.27 (1), p.17-30
issn	1045-9219 1558-2183
language	eng
recordid	cdi_crossref_primary_10_1109_TPDS_2014_2387285
source	IEEE Electronic Library (IEL)
subjects	Bayes methods; Bayesian Networks; Biological system modeling; Clustering algorithms; GPU; Graphics processing units; Indexes; Markov processes; MCMC; Parallel Computing; Priors
title	A Learning Algorithm for Bayesian Networks and Its Efficient Implementation on GPUs
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T22%3A14%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Learning%20Algorithm%20for%20Bayesian%20Networks%20and%20Its%20Efficient%20Implementation%20on%20GPUs&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Wang,%20Yu&rft.date=2016-01-01&rft.volume=27&rft.issue=1&rft.spage=17&rft.epage=30&rft.pages=17-30&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2014.2387285&rft_dat=%3Ccrossref_RIE%3E10_1109_TPDS_2014_2387285%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=7001096&rfr_iscdi=true
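
The abstract in this record quotes three speedup figures: 2.46× from optimizing the algorithm, an additional 58-fold from the GPU implementation, and 143× in total. A minimal sketch of how these relate, assuming the two gains are independent and therefore multiply (the record does not state this explicitly; the names below are illustrative only):

```python
# Speedup figures quoted in the abstract.
ALGORITHMIC_SPEEDUP = 2.46  # from optimizing the MCMC-based algorithm
GPU_SPEEDUP = 58.0          # additional acceleration from the GPU port


def combined_speedup(*factors: float) -> float:
    """Multiply speedup factors into one overall factor.

    Assumption: the factors are independent, so they compose
    multiplicatively rather than additively.
    """
    total = 1.0
    for factor in factors:
        total *= factor
    return total


total = combined_speedup(ALGORITHMIC_SPEEDUP, GPU_SPEEDUP)
print(f"combined speedup: {total:.2f}x")
```

Under that assumption the product is about 142.68, which rounds to the 143× total reported in the abstract.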