Including cognitive biases and distance-based rewards in a connectionist model of complex problem solving

Bibliographic details
Published in:	Neural Networks, 2012, Vol. 25 (1), pp. 41-56
Main authors:	Dandurand, Frédéric; Shultz, Thomas R.; Rey, Arnaud
Format:	Article
Language:	English
Subjects:	Applied sciences; Artificial intelligence; Cognition - physiology; Cognitive biases; Cognitive science; Computational modeling; Computer science; control theory; systems; Distance-reduction heuristic; Exact sciences and technology; Humans; Learning and adaptive systems; Neural Networks (Computer); Problem solving; Problem Solving - physiology; Psychology; Reinforcement learning; Reward; Temporal-difference learning
Online access:	Full text

Description
Summary:	We present a cognitive, connectionist-based model of complex problem solving that integrates cognitive biases and distance-based and environmental rewards under a temporal-difference learning mechanism. The model is tested against experimental data obtained in a well-defined and planning-intensive problem. We show that incorporating cognitive biases (symmetry and simplicity) in a temporal-difference learning rule (SARSA) increases model adequacy: the solution space explored by biased models better fits observed human solutions. While learning from explicit rewards alone is intrinsically slow, adding distance-based rewards, a measure of closeness to goal, to the learning rule significantly accelerates learning. Finally, the model correctly predicts that explicit rewards have little impact on problem solvers’ ability to discover optimal solutions.
► We present a temporal-difference-based model of human problem solving.
► The model includes distance-based rewards (DBR), a measure of closeness to goal.
► With DBR, environmental rewards are not necessary for learning the task.
► Incorporating cognitive biases (symmetry and simplicity) increases model adequacy.
ISSN:	0893-6080, 1879-2782
DOI:	10.1016/j.neunet.2011.06.021
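
The summary describes a SARSA learning rule in which a distance-based reward (DBR) is added to the environmental reward. The sketch below illustrates that general idea only, and it is not the authors' model. Everything specific in it is an assumption made for illustration: the task is a hypothetical one-dimensional corridor rather than the problem used in the article, a lookup table stands in for the connectionist network, the DBR is taken to be a weighted reduction in distance to the goal, and the cognitive biases (symmetry and simplicity) are left out. All function and parameter names are hypothetical.

```python
import random


def total_reward(state, nxt, goal, env_reward, dbr_weight):
    """Environmental reward plus an assumed distance-based reward (DBR)."""
    # Environmental reward: paid only when the goal is reached.
    reward = env_reward if nxt == goal else 0.0
    # DBR (assumed form): weighted reduction in distance to the goal.
    reward += dbr_weight * (abs(goal - state) - abs(goal - nxt))
    return reward


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    return rng.choice([a for a in actions if q[(state, a)] == best])


def sarsa(n_states=8, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1,
          env_reward=1.0, dbr_weight=0.1, max_steps=200, seed=0):
    """Tabular SARSA on a toy corridor: start at state 0, goal at the far end."""
    rng = random.Random(seed)
    actions = (-1, 1)  # step left or step right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    steps_per_episode = []
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(q, state, actions, epsilon, rng)
        steps = 0
        while state != goal and steps < max_steps:
            # Moves past either end of the corridor leave the state unchanged.
            nxt = min(max(state + action, 0), n_states - 1)
            reward = total_reward(state, nxt, goal, env_reward, dbr_weight)
            if nxt == goal:
                next_action = None
                target = reward  # terminal state: no bootstrapped value
            else:
                next_action = epsilon_greedy(q, nxt, actions, epsilon, rng)
                # SARSA is on-policy: bootstrap from the action actually chosen.
                target = reward + gamma * q[(nxt, next_action)]
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = nxt, next_action
            steps += 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


if __name__ == "__main__":
    # Compare early episode lengths with and without the distance-based term.
    for label, weight in (("env reward only", 0.0), ("env reward + DBR", 0.1)):
        _, steps = sarsa(dbr_weight=weight)
        print(label, "mean steps over first 10 episodes:", sum(steps[:10]) / 10)
```

Setting `env_reward=0.0` while keeping `dbr_weight` positive gives a rough way to explore the highlighted claim that learning can proceed without environmental rewards, within the limits of this toy setting.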