Entropy-based metrics for predicting choice behavior based on local response to reward



Bibliographic Details
Published in: Nature Communications, 2021-11, Vol. 12 (1), Article 6567
Authors: Trepka, Ethan; Spitmaan, Mehran; Bari, Bilal A.; Costa, Vincent D.; Cohen, Jeremiah Y.; Soltani, Alireza
Format: Article
Language: English
Online access: Full text
Description
Abstract: For decades, behavioral scientists have used the matching law to quantify how animals distribute their choices between multiple options in response to reinforcement they receive. More recently, many reinforcement learning (RL) models have been developed to explain choice by integrating reward feedback over time. Despite reasonable success of RL models in capturing choice on a trial-by-trial basis, these models cannot capture variability in matching behavior. To address this, we developed metrics based on information theory and applied them to choice data from dynamic learning tasks in mice and monkeys. We found that a single entropy-based metric can explain 50% and 41% of variance in matching in mice and monkeys, respectively. We then used limitations of existing RL models in capturing entropy-based metrics to construct more accurate models of choice. Together, our entropy-based metrics provide a model-free tool to predict adaptive choice behavior and reveal underlying neural mechanisms.

Animals distribute their choices between alternative options according to relative reinforcement they receive from those options (matching law). Here, the authors propose metrics based on information theory that can predict this global behavioral rule based on local response to reward feedback.
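The article's exact metric definitions are given in the full text; as a purely illustrative sketch of the general idea (the function name, the stay/switch encoding, and the conditioning scheme here are assumptions, not the authors' published metrics), one entropy-based quantity of this kind is the Shannon entropy of an animal's stay/switch decisions conditioned on the previous trial's reward outcome:

```python
import numpy as np

def conditional_entropy_of_strategy(choices, rewards):
    """Shannon entropy (in bits) of stay/switch behavior conditioned
    on whether the previous trial was rewarded.

    `choices` and `rewards` are equal-length binary sequences:
    choices[t] is the option picked on trial t, rewards[t] is 1 if
    that trial was rewarded. Returns a value in [0, 1] bits.
    """
    choices = np.asarray(choices)
    rewards = np.asarray(rewards)
    stay = (choices[1:] == choices[:-1]).astype(int)  # 1 = repeated previous choice
    prev_win = rewards[:-1].astype(bool)              # was the previous trial rewarded?

    h = 0.0
    for outcome in (True, False):                     # previous-win / previous-loss states
        mask = prev_win == outcome
        if mask.sum() == 0:
            continue
        p_state = mask.mean()                         # P(previous outcome = this state)
        p_stay = stay[mask].mean()                    # P(stay | this state)
        for p in (p_stay, 1.0 - p_stay):
            if p > 0:
                h -= p_state * p * np.log2(p)
    return h

# A strictly win-stay/lose-switch animal is perfectly predictable
# given the previous outcome, so its conditional entropy is 0.
choices = [0, 0, 1, 1, 1, 0]
rewards = [1, 0, 1, 1, 0, 1]
print(conditional_entropy_of_strategy(choices, rewards))  # 0.0
```

A maximally random strategy (stay with probability 0.5 regardless of outcome) would instead yield 1 bit; intermediate values index how deterministically local reward feedback drives choice, which is the kind of model-free quantity the abstract describes relating to global matching behavior.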
ISSN: 2041-1723
DOI: 10.1038/s41467-021-26784-w