Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T


Bibliographic details

Published in: Human Brain Mapping, 2022-10, Vol. 43 (15), pp. 4750-4790
Main authors: Colas, Jaron T.; Dundon, Neil M.; Gerraty, Raphael T.; Saragosa‐Harris, Natalie M.; Szymula, Karol P.; Tanwisuth, Koranis; Tyszka, J. Michael; Geen, Camilla; Ju, Harang; Toga, Arthur W.; Gold, Joshua I.; Bassett, Dani S.; Hartley, Catherine A.; Shohamy, Daphna; Grafton, Scott T.; O'Doherty, John P.
Format: Article
Language: English
DOI: 10.1002/hbm.25988
PMID: 35860954
ISSN: 1065-9471
EISSN: 1097-0193
Online access: Full text
Description:

The model‐free algorithms of “reinforcement learning” (RL) have gained clout across disciplines, but so too have model‐based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This “generalized reinforcement learning” (GRL) model, a frugal extension of RL, parsimoniously retains the single reward‐prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal‐learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high‐resolution high‐field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value‐based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
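The description above specifies the model's core computation in prose: a single RPE from the experienced state–action pair is relayed, with inverse sign, to counterfactually update the values of the unchosen action and the unexperienced state. The following is a minimal sketch of such an update rule, assuming a two-state, two-action reversal-learning layout; the class name, the parameter names (alpha, g_state, g_action, beta), and the sign composition for the doubly counterfactual pair are illustrative assumptions, not the authors' published implementation.

```python
# A minimal sketch of a GRL-style value update, assuming a 2-state,
# 2-action reversal-learning task. Names and sign conventions here are
# illustrative assumptions, not the article's published model.
import numpy as np


class GRLAgent:
    def __init__(self, alpha=0.3, g_state=0.5, g_action=0.5, beta=5.0):
        self.Q = np.zeros((2, 2))   # value estimate per (state, action)
        self.alpha = alpha          # learning rate for the experienced pair
        self.g_state = g_state      # generalization weight across states
        self.g_action = g_action    # generalization weight across actions
        self.beta = beta            # softmax inverse temperature

    def choose(self, state, rng):
        # Softmax decision rule over the current state's action values.
        p = np.exp(self.beta * self.Q[state])
        p /= p.sum()
        return rng.choice(2, p=p)

    def update(self, state, action, reward):
        # A single reward-prediction error drives every update.
        delta = reward - self.Q[state, action]
        alt_state, alt_action = 1 - state, 1 - action

        # Direct update for the experienced state-action pair.
        self.Q[state, action] += self.alpha * delta

        # Inverse generalization: the same RPE is relayed counterfactually,
        # with opposite sign, to the unchosen action and the other state.
        self.Q[state, alt_action] -= self.alpha * self.g_action * delta
        self.Q[alt_state, action] -= self.alpha * self.g_state * delta

        # Doubly counterfactual pair: the two inversions are assumed here to
        # compose multiplicatively, so the sign flips back to positive.
        self.Q[alt_state, alt_action] += (
            self.alpha * self.g_state * self.g_action * delta
        )


# Example: one rewarded trial in state 0.
rng = np.random.default_rng(0)
agent = GRLAgent()
a = agent.choose(state=0, rng=rng)
agent.update(state=0, action=a, reward=1.0)
print(agent.Q)
```

In this parameterization, setting g_state and g_action to 0 recovers classic model‐free RL, while weights above the task's true structural coupling would correspond to overgeneralization and weights below it to undergeneralization, matching the true/false/absent inference distinction drawn in the description.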
Subjects:
Algorithms
Artificial intelligence
Cognitive ability
cognitive map
Cognitive maps
Cognitive models
Cognitive tasks
counterfactual learning
Dopamine receptors
dopaminergic midbrain
Functional magnetic resonance imaging
generalization
Generalization, Psychological
hippocampus
Humans
individual differences
Learning
Machine learning
Magnetic Resonance Imaging - methods
Mesencephalon
model‐free and model‐based
multifield fMRI
Neostriatum
Neurosciences
Reinforcement
reinforcement learning
Reinforcement, Psychology
Reversal learning
Reward
striatum
Structural hierarchy
Substantia nigra
Ventral tegmentum