Moratorium Effect on Estimation Values in Simple Reinforcement Learning

Bibliographic details
Published in: International Journal of Computer Science and Artificial Intelligence, 2013-09, Vol. 3 (3), pp. 112-119
Main authors: Notsu, Akira; Tezuka, Yuki; Honda, Katsuhiro
Format: Article
Language: English
Online access: Full text
Description
Abstract: In this article, the authors introduce low-priority cut-in (moratorium) to chain-form reinforcement learning, which they proposed as Simple Reinforcement Learning for agents with small memory. In the real world, learning is difficult because there are infinitely many states and actions, which demand a large amount of memory and learning time. To address this, better estimated values are categorized as "GOOD" during the reinforcement learning process. In addition, the order in which estimated values are stored is changed, because the sequence itself is regarded as important information. However, the method is heavily affected by the action policy: if an agent tends to explore many states, its memory overflows with low-value data. Low-priority cut-in (moratorium) is added to the method to solve this problem. The authors ran simulations to observe the influence of their methods, and several simulation results show a positive effect on learning.
ISSN: 2226-4450, 2226-4469
DOI: 10.5963/IJCSAI0303004
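
The abstract describes a small-memory agent whose stored estimates are kept in an ordered chain, and says that new low-value data should not crowd out good entries. The following Python sketch is one possible reading of that idea, not the authors' algorithm: the class name `BoundedChainMemory`, the one-step promotion rule, the tail-insertion rule, and the learning rate `alpha` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: tuple    # hypothetical (state, action) pair
    value: float  # current estimated value


class BoundedChainMemory:
    """Fixed-capacity ordered chain; index 0 is the highest priority.

    Assumed rules (not taken from the article):
      * an update that raises an estimate counts as "GOOD" and moves
        the entry one step toward the head;
      * a new entry cuts in at the tail (lowest priority), so it can
        only evict the current tail entry, never a promoted one.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.chain: list[Entry] = []

    def _find(self, key):
        for i, entry in enumerate(self.chain):
            if entry.key == key:
                return i
        return None

    def update(self, key, target: float, alpha: float = 0.5) -> None:
        i = self._find(key)
        if i is None:
            # Unknown key: start from 0 and take one step toward target.
            self._cut_in(Entry(key, alpha * target))
            return
        entry = self.chain[i]
        improved = target > entry.value
        entry.value += alpha * (target - entry.value)
        if improved and i > 0:
            # "GOOD" update: swap with the neighbor nearer the head.
            self.chain[i - 1], self.chain[i] = self.chain[i], self.chain[i - 1]

    def _cut_in(self, entry: Entry) -> None:
        if len(self.chain) >= self.capacity:
            self.chain.pop()        # evict the lowest-priority entry
        self.chain.append(entry)    # low-priority cut-in at the tail


# Usage: two useful entries, then a burst of exploratory low-value data.
memory = BoundedChainMemory(capacity=3)
memory.update(("s0", "a0"), 1.0)
memory.update(("s1", "a1"), 1.0)
memory.update(("s1", "a1"), 2.0)    # improvement, so this entry is promoted
for k in range(10):
    memory.update((f"x{k}", "a"), 0.0)
print([entry.key for entry in memory.chain])
```

In this sketch the exploratory entries only ever compete for the last slot, so the two promoted entries survive the burst. Inserting new entries at the head instead would push the useful ones out once the chain is full, which is the overflow problem the abstract attributes to exploratory policies.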