Dense Hebbian neural networks: a replica symmetric picture of supervised learning
We consider dense, associative neural networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing...
Saved in:
Published in: | arXiv.org 2023-07 |
---|---|
Main authors: | Agliari, Elena; Albanese, Linda; Alemanno, Francesco; Alessandrelli, Andrea; Barra, Adriano; Giannotti, Fosca; Lotito, Daniele; Pedreschi, Dino |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Agliari, Elena; Albanese, Linda; Alemanno, Francesco; Alessandrelli, Andrea; Barra, Adriano; Giannotti, Fosca; Lotito, Daniele; Pedreschi, Dino |
description | We consider dense, associative neural networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via the statistical mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram that summarizes their performance as a function of control parameters such as the quality and quantity of the training dataset, the network storage and the noise, and that is valid in the limit of large network size and structureless datasets: these networks may work in an ultra-storage regime (where they can handle a huge number of patterns compared with shallow neural networks) or in an ultra-detection regime (where they can perform pattern recognition at signal-to-noise ratios that are prohibitive for shallow neural networks). Guided by the random-dataset theory as a reference framework, we also test numerically the learning, storing and retrieval capabilities shown by these networks on structured datasets such as MNIST and Fashion-MNIST. As for technical remarks, on the analytic side we implement large-deviation and stability analyses within Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials, while, on the computational side, we insert the Plefka approximation into the Monte Carlo scheme to speed up the evaluation of the synaptic tensors, overall obtaining a novel and broad approach to investigate supervised learning in neural networks beyond the shallow limit. |
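To make the mechanisms mentioned in the abstract concrete, the following is a minimal, self-contained sketch (not the authors' code, and with purely illustrative parameter values for N, K, M, r, p and beta): it builds the supervised Hebbian couplings of a degree-p dense network by averaging M noisy examples of each archetype, retrieves a corrupted archetype with plain Metropolis dynamics, and then repeats the retrieval with a first-order Plefka (naive mean-field) fixed-point iteration as a cheap stand-in for sampling. The paper's actual scheme embeds the Plefka approximation inside the Monte Carlo update of the synaptic tensors; that acceleration is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- control parameters (illustrative values only, not taken from the paper) ---
N = 200        # number of neurons
K = 5          # number of archetypes (patterns to be learned)
M = 40         # number of supervised (labelled) examples per archetype
r = 0.6        # example quality: each example bit matches its archetype w.p. (1 + r) / 2
p = 4          # interaction degree of the dense network (p = 2 is the shallow Hopfield limit)
beta = 10.0    # inverse temperature (fast noise level)
sweeps = 50    # Monte Carlo sweeps

# --- teacher: archetypes and noisy labelled examples ---
xi = rng.choice([-1, 1], size=(K, N))                  # archetypes xi^mu
flip = rng.random((K, M, N)) > (1 + r) / 2             # which example bits get corrupted
eta = np.where(flip, -xi[:, None, :], xi[:, None, :])  # examples eta^{mu, a}

# --- supervised Hebbian storage: average the examples of each archetype ---
eta_bar = eta.mean(axis=1)                             # shape (K, N)

def energy(sigma):
    """Dense Hebbian energy E = -N * sum_mu (Mattis overlap with eta_bar^mu)^p."""
    m = eta_bar @ sigma / N
    return -N * np.sum(m ** p)

# --- Metropolis retrieval dynamics starting from a corrupted archetype ---
target = 0
sigma0 = xi[target].copy()
sigma0[rng.random(N) < 0.2] *= -1                      # flip 20% of the bits
sigma = sigma0.copy()

E = energy(sigma)
for sweep in range(sweeps):
    for i in rng.permutation(N):
        sigma[i] *= -1                                 # propose a single spin flip
        E_new = energy(sigma)
        if E_new > E and rng.random() >= np.exp(-beta * (E_new - E)):
            sigma[i] *= -1                             # reject: undo the flip
        else:
            E = E_new                                  # accept
print(f"Monte Carlo overlap with the archetype: {xi[target] @ sigma / N:.3f}")

# --- first-order Plefka (naive mean-field) closure, as a cheap stand-in for sampling ---
def mean_field_magnetizations(beta, m0, iters=200, damping=0.5):
    """Iterate m_i <- tanh(beta * h_i(m)), with h_i = p * sum_mu eta_bar_i^mu * m_mu^(p-1)."""
    m = m0.astype(float)
    for _ in range(iters):
        overlaps = eta_bar @ m / N                     # Mattis overlaps with the stored averages
        h = p * (eta_bar * overlaps[:, None] ** (p - 1)).sum(axis=0)
        m = (1 - damping) * m + damping * np.tanh(beta * h)
    return m

m_mf = mean_field_magnetizations(beta, sigma0)
print(f"mean-field overlap with the archetype: {xi[target] @ m_mf / N:.3f}")
```

Note that in this sketch the energy is evaluated directly through the Mattis overlaps with the example averages, so the order-p synaptic tensor is never materialized; for large N and p, evaluating that tensor is the bottleneck that the paper's Plefka-based speed-up is designed to bypass.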
doi_str_mv | 10.48550/arxiv.2212.00606 |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-07 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_2212_00606 |
source | arXiv.org; Free E-Journals |
subjects | Datasets; Interpolation; Machine learning; Neural networks; Pattern recognition; Phase diagrams; Physics - Disordered Systems and Neural Networks; Spin glasses; Stability analysis; Statistics - Machine Learning; Storage; Supervised learning; Tensors |
title | Dense Hebbian neural networks: a replica symmetric picture of supervised learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T23%3A03%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dense%20Hebbian%20neural%20networks:%20a%20replica%20symmetric%20picture%20of%20supervised%20learning&rft.jtitle=arXiv.org&rft.au=Agliari,%20Elena&rft.date=2023-07-02&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2212.00606&rft_dat=%3Cproquest_arxiv%3E2775131717%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2775131717&rft_id=info:pmid/&rfr_iscdi=true |