Connectionist Models of Reinforcement, Imitation, and Instruction in Learning to Solve Complex Problems
Published in: IEEE Transactions on Autonomous Mental Development, 2009-08, Vol. 1 (2), p. 110-121
Main authors: ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: We compared computational models and human performance on learning to solve a high-level, planning-intensive problem. Humans and models were subjected to three learning regimes: reinforcement, imitation, and instruction. We modeled learning by reinforcement (rewards) using SARSA, a softmax selection criterion, and a neural-network function approximator; learning by imitation using supervised learning in a neural network; and learning by instruction using a knowledge-based neural network. We had previously found that human participants who were told whether their answers were correct (a reinforcement group) were less accurate than participants who watched demonstrations of successful solutions of the task (an imitation group) and participants who read instructions explaining how to solve the task. Furthermore, we had found that humans who learned by imitation and instruction performed more complex solution steps than those trained by reinforcement. Our models reproduced this pattern of results.
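The abstract names SARSA with a softmax selection criterion and a neural-network function approximator as the reinforcement-learning model. The sketch below only illustrates that general combination on a hypothetical toy chain task; the environment, network size, and parameter values are assumptions for demonstration and are not the authors' actual model of the problem-solving task.

```python
# Illustrative sketch (not the paper's implementation): on-policy SARSA with
# softmax action selection and a one-hidden-layer neural network approximating
# Q-values, trained on an assumed toy chain environment.
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, HIDDEN = 5, 2, 8   # toy chain task (assumed, for demo)
ALPHA, GAMMA, TAU = 0.05, 0.95, 0.5     # step size, discount, softmax temperature

# Network: Q(s, .) = W2 @ tanh(W1 @ onehot(s))
W1 = rng.normal(0, 0.1, (HIDDEN, N_STATES))
W2 = rng.normal(0, 0.1, (N_ACTIONS, HIDDEN))

def q_values(s):
    x = np.zeros(N_STATES)
    x[s] = 1.0
    h = np.tanh(W1 @ x)
    return W2 @ h, h, x

def softmax_action(q):
    # Boltzmann (softmax) selection: higher-valued actions are chosen more often.
    p = np.exp((q - q.max()) / TAU)
    p /= p.sum()
    return rng.choice(len(q), p=p)

def step(s, a):
    # Toy chain: action 1 moves right, action 0 moves left; reward at the far end.
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

for episode in range(500):
    s, done = 0, False
    q, h, x = q_values(s)
    a = softmax_action(q)
    while not done:
        s2, r, done = step(s, a)
        q2, h2, x2 = q_values(s2)
        a2 = softmax_action(q2)
        # SARSA target uses the action actually selected next (on-policy).
        target = r if done else r + GAMMA * q2[a2]
        delta = target - q[a]
        grad_h = delta * W2[a] * (1 - h ** 2)   # backprop TD error to hidden layer
        W2[a] += ALPHA * delta * h              # output weights for the chosen action
        W1 += ALPHA * np.outer(grad_h, x)       # input-to-hidden weights
        s, a = s2, a2
        q, h, x = q_values(s)                   # recompute with updated weights
```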
ISSN: 1943-0604, 2379-8920, 1943-0612, 2379-8939
DOI: 10.1109/TAMD.2009.2031234