Bayesian Neural Networks with Weight Sharing Using Dirichlet Processes

We extend feed-forward neural networks with a Dirichlet process prior over the weight distribution. This enforces sharing among the network weights, which can drastically reduce the overall number of parameters. We alternately sample from the posterior of the weights and the posterior of assignments of network connections to the weights, which results in a weight sharing that is adapted to the given data. To make the procedure feasible, we present several techniques that reduce the computational burden. Experiments show that our approach mostly outperforms models with random weight sharing. Our model substantially reduces the memory footprint while maintaining good performance compared to neural networks without weight sharing.
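The core idea of the abstract can be illustrated with a toy sketch: each network connection stores only a small integer index into a shared codebook of weight values, and a Chinese-restaurant-process-style prior governs how connections are assigned to values. This is a minimal illustration, not the authors' algorithm; the codebook values, sizes, and the concentration parameter `alpha` are hypothetical.

```python
import random
from collections import Counter

random.seed(0)

# Weight sharing: each connection stores a small index into a shared
# codebook, so the network needs len(codebook) floats plus one small
# integer per connection instead of one float per connection.
codebook = [-0.5, 0.1, 0.9]          # shared weight values (hypothetical)
num_connections = 8                  # toy size
z = [random.randrange(len(codebook)) for _ in range(num_connections)]

# Materialize the weight vector used in a forward pass.
w = [codebook[k] for k in z]

# Chinese-restaurant-process-style prior for reassigning one connection:
# an existing value k is chosen with probability proportional to the
# number of connections using it, and a fresh value is opened with
# probability proportional to the concentration parameter alpha.
alpha = 1.0
counts = Counter(z)
weights = [counts[k] for k in range(len(codebook))] + [alpha]
total = sum(weights)
prior = [p / total for p in weights]
```

In a full sampler, this prior would be combined with the data likelihood to resample each assignment, alternating with updates of the codebook values themselves.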

Bibliographic details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020-01, Vol. 42 (1), p. 246-252
Main authors: Roth, Wolfgang; Pernkopf, Franz
Format: Article
Language: English
Online access: Order full text
DOI: 10.1109/TPAMI.2018.2884905
Publisher: IEEE (United States)
PMID: 30530353
Rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
ISSN: 0162-8828
EISSN: 1939-3539; 2160-9292
Source: IEEE Electronic Library (IEL)
Subjects: Artificial neural networks; Bayes methods; Bayesian neural networks; Computational modeling; Dirichlet problem; Dirichlet processes; Gibbs sampling; hybrid Monte-Carlo; Memory management; Monte Carlo methods; Neural networks; non-conjugate models; Task analysis; Weight; weight sharing