Reinforcement learning with internal expectation for the random neural network

Bibliographic details
Published in:	European journal of operational research 2000-10, Vol.126 (2), p.288-307
Main author:	Halici, Ugur
Format:	Article
Language:	English
Subjects:	Behavior modification; Expectation; Extinction; Learning; Markov analysis; Neural networks; Operations research; Punishment; Random neural networks; Reinforcement learning; Simulation; Studies
Online access:	Full text

Description
Abstract:	The reinforcement learning scheme proposed in Halici (1997) (Halici, U., 1997. Journal of Biosystems 40 (1/2), 83–91) for the random neural network (Gelenbe, E., 1989b. Neural Computation 1 (4), 502–510) is based on reward and performs well in stationary environments. However, when the environment is not stationary, the scheme gets stuck on the previously learned action, and extinction is not possible. In this paper, the reinforcement learning scheme is extended by introducing a weight update rule that takes into account the internal expectation of reinforcement. With the proposed scheme, the system behaves as in learning with reward when the reward for the learned action is not below the internal expectation; otherwise it behaves as in learning with punishment, so that other possibilities can be explored. This scheme makes extinction possible while still converging well to the most rewarding action.
ISSN:	0377-2217 1872-6860
DOI:	10.1016/S0377-2217(99)00479-8
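
The switching rule described in the abstract (reward mode when the reward is not below the internal expectation, punishment mode otherwise) can be illustrated with a small sketch. This is not the paper's random neural network update rule, which the abstract does not spell out. It is a simplified stand-in using bounded action weights in the style of a linear reward-penalty learner. The class name `ExpectationLearner`, the parameters `rate` and `memory`, the running-average form of the expectation, and the two-phase toy environment are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpectationLearner:
    """Toy learner (hypothetical, not the paper's RNN update rule)."""
    n_actions: int
    rate: float = 0.5             # assumed step size for weight changes
    memory: float = 0.5           # assumed smoothing factor for the expectation
    use_expectation: bool = True  # False gives a reward-only baseline
    expectation: float = 0.0      # internal expectation of reinforcement
    weights: List[float] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        # All actions start with the same weight in [0, 1].
        self.weights = [0.5] * self.n_actions

    def choose(self) -> int:
        # Greedy choice; ties go to the lowest action index.
        return max(range(self.n_actions), key=lambda a: self.weights[a])

    def update(self, action: int, reward: float) -> None:
        if self.use_expectation and reward < self.expectation:
            # Punishment mode: weaken the chosen action and strengthen
            # the others so that alternatives can be explored.
            for a in range(self.n_actions):
                if a == action:
                    self.weights[a] -= self.rate * self.weights[a]
                else:
                    self.weights[a] += self.rate * (1.0 - self.weights[a])
        else:
            # Reward mode: strengthen the chosen action in proportion
            # to the reward received.
            self.weights[action] += self.rate * reward * (1.0 - self.weights[action])
        # Assumed form of the expectation: a running average of rewards.
        self.expectation = (self.memory * self.expectation
                            + (1.0 - self.memory) * reward)


def run(learner: ExpectationLearner, steps_per_phase: int = 10) -> List[int]:
    """Two-phase toy environment: action 0 pays first, then action 1."""
    phases = ([1.0, 0.0], [0.0, 1.0])
    chosen = []
    for rewards in phases:
        for _ in range(steps_per_phase):
            action = learner.choose()
            learner.update(action, rewards[action])
            chosen.append(action)
    return chosen


with_expectation = run(ExpectationLearner(n_actions=2))
reward_only = run(ExpectationLearner(n_actions=2, use_expectation=False))
print("with expectation:", with_expectation)
print("reward only:     ", reward_only)
```

In this toy setting the reward-only baseline keeps selecting action 0 after it stops paying, because a zero reward never changes any weight. The expectation-based variant sees a reward below its expectation, switches to punishment mode, and moves to action 1. The deterministic greedy choice is a design simplification to keep the run reproducible. The paper's scheme operates on a random neural network and is evaluated by Markov analysis and simulation.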