Dense Hebbian neural networks: a replica symmetric picture of supervised learning

We consider dense associative neural networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing...
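
For orientation, the dense supervised Hebbian setup summarized in the abstract can be written schematically as follows. The notation (P-body couplings built from example-averaged patterns) and the normalization are ours, chosen to make the energy extensive; this is a hedged reconstruction from the abstract, not the paper's exact definition:

$$
H(\boldsymbol{\sigma}\mid\boldsymbol{\eta}) \;\propto\; -\frac{1}{N^{P-1}} \sum_{\mu=1}^{K} \Big(\sum_{i=1}^{N} \bar{\eta}^{\mu}_{i}\,\sigma_{i}\Big)^{P},
\qquad
\bar{\eta}^{\mu}_{i} = \frac{1}{M}\sum_{A=1}^{M} \eta^{\mu,A}_{i},
\qquad
\eta^{\mu,A}_{i} = \chi^{\mu,A}_{i}\,\xi^{\mu}_{i},
$$

where the $\xi^{\mu}$ are the $K$ archetypes, the $\eta^{\mu,A}$ are the $M$ noisy training examples supplied per archetype, and $\mathbb{P}(\chi^{\mu,A}_{i}=+1)=(1+r)/2$ sets the dataset quality $r$; the teacher-averaging over examples inside the couplings is what makes the prescription "supervised".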

Detailed description

Bibliographic details
Published in: arXiv.org, 2023-07
Main authors: Agliari, Elena; Albanese, Linda; Alemanno, Francesco; Alessandrelli, Andrea; Barra, Adriano; Giannotti, Fosca; Lotito, Daniele; Pedreschi, Dino
Format: Article
Language: eng
Subjects:
Online access: Full text
container_title arXiv.org
creator Agliari, Elena; Albanese, Linda; Alemanno, Francesco; Alessandrelli, Andrea; Barra, Adriano; Giannotti, Fosca; Lotito, Daniele; Pedreschi, Dino
description We consider dense associative neural networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of control parameters such as the quality and quantity of the training dataset, the network storage and the noise, valid in the limit of large network size and structureless datasets: these networks may work in an ultra-storage regime (where they can handle a huge amount of patterns, compared with shallow neural networks) or in an ultra-detection regime (where they can perform pattern recognition at prohibitive signal-to-noise ratios, compared with shallow neural networks). Guided by the theory for random (structureless) datasets as a reference framework, we also test numerically the learning, storage and retrieval capabilities shown by these networks on structured datasets such as MNIST and Fashion-MNIST. As technical remarks, on the analytic side we implement large-deviation and stability analyses within Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials, while, on the computational side, we insert the Plefka approximation in the Monte Carlo scheme to speed up the evaluation of the synaptic tensors, overall obtaining a novel and broad approach to investigate supervised learning in neural networks beyond the shallow limit.
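
The storage-and-retrieval vocabulary used above can be made concrete with a small numerical sketch. The following Python snippet is illustrative only: it is not the authors' code, all parameter values are placeholders, the schematic energy from the formula above is assumed, and it uses plain Glauber dynamics rather than the Plefka-accelerated Monte Carlo scheme mentioned in the abstract. It builds noisy examples of a few random archetypes, forms the supervised (example-averaged) Hebbian couplings, and checks retrieval of one archetype in a dense P = 4 network.

import numpy as np

rng = np.random.default_rng(0)

N, K, M, P = 200, 5, 20, 4      # neurons, archetypes, examples per archetype, interaction order
r = 0.6                         # dataset quality: P(example bit == archetype bit) = (1 + r) / 2
beta = 2.0                      # inverse network noise

# Archetypes and their noisy supervised examples eta[mu, A, i] = chi * xi.
xi = rng.choice([-1, 1], size=(K, N))
chi = rng.choice([-1, 1], size=(K, M, N), p=[(1 - r) / 2, (1 + r) / 2])
eta = chi * xi[:, None, :]

# Supervised Hebbian prescription: average the examples of each archetype
# before they enter the couplings (handled here implicitly through overlaps).
eta_bar = eta.mean(axis=1)      # shape (K, N)

def local_fields(sigma):
    # Assuming a dense Hebbian energy H ~ -N * sum_mu m_mu^P with
    # m_mu = (1/N) * sum_i eta_bar[mu, i] * sigma[i]  (schematic normalization).
    m = eta_bar @ sigma / N
    return (m[:, None] ** (P - 1) * eta_bar).sum(axis=0)

def glauber_sweep(sigma):
    # One asynchronous Glauber sweep at inverse temperature beta.
    for i in rng.permutation(N):
        h_i = local_fields(sigma)[i]
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h_i))
        sigma[i] = 1 if rng.random() < p_up else -1
    return sigma

# Retrieval check: start from a corrupted copy of archetype 0 and relax.
sigma = xi[0] * rng.choice([-1, 1], size=N, p=[0.2, 0.8])
for _ in range(20):
    sigma = glauber_sweep(sigma)
print("overlap with archetype 0:", sigma @ xi[0] / N)

With a small load (few archetypes relative to N) the printed overlap should relax close to 1, illustrating retrieval; degrading r or increasing K moves the sketch toward the noisy or saturated regions of the kind of phase diagram the abstract describes.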
doi_str_mv 10.48550/arxiv.2212.00606
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2023-07
issn 2331-8422
language eng
recordid cdi_arxiv_primary_2212_00606
source arXiv.org; Free E-Journals
subjects Datasets
Interpolation
Machine learning
Neural networks
Pattern recognition
Phase diagrams
Physics - Disordered Systems and Neural Networks
Spin glasses
Stability analysis
Statistics - Machine Learning
Storage
Supervised learning
Tensors
title Dense Hebbian neural networks: a replica symmetric picture of supervised learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T23%3A03%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dense%20Hebbian%20neural%20networks:%20a%20replica%20symmetric%20picture%20of%20supervised%20learning&rft.jtitle=arXiv.org&rft.au=Agliari,%20Elena&rft.date=2023-07-02&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2212.00606&rft_dat=%3Cproquest_arxiv%3E2775131717%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2775131717&rft_id=info:pmid/&rfr_iscdi=true