Adaptability analysis of genetic network programming with reinforcement learning in dynamically changing environments


Bibliographic details
Published in:	Expert systems with applications 2012-11, Vol.39 (16), p.12349-12357
Main authors:	Mabu, Shingo; Tjahjadi, Andre; Hirasawa, Kotaro
Format:	Article
Language:	English
Subjects:	Adaptability; Computer simulation; Evolutionary computation; Genetic network programming; Genetics; Khepera robot; Learning; Mathematical models; Networks; Programming; Reinforcement; Reinforcement learning; Robots
Online access:	Full text

Description
Summary:	► We propose an adaptive learning mechanism based on reinforcement learning. ► Adaptability is examined when some sensors suddenly fail. ► Faulty sensors are detected indirectly through changes in Q values. ► The Q table is compact, which makes quick recovery from faults possible. Genetic network programming (GNP) has been proposed as an evolutionary algorithm and extended with reinforcement learning (GNP-RL). Combining evolution and learning evolves programs efficiently, and fitness improvements have been confirmed in simulations of tileworld problems, elevator group supervisory control systems, stock trading models, and the wall-following behavior of a Khepera robot. However, the adaptability of GNP-RL in testing environments where situations change dynamically has not yet been analyzed in detail. This paper introduces the adaptation mechanism used in the testing environment and confirms, using the robot simulator WEBOTS, that GNP-RL can adapt to environmental changes, especially when previously unexperienced sensor faults occur suddenly. The simulation results show that GNP-RL works well in testing even when it is given wrong sensor information, because it can automatically change its programs by using alternative actions. In addition, the effects of the GNP-RL parameters are analyzed in both training and testing simulations.
ISSN:	0957-4174, 1873-6793
DOI:	10.1016/j.eswa.2012.04.038
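
Illustrative note: the summary states that faulty sensors are detected indirectly through changes in Q values and that the program switches to alternative actions. The following is a minimal sketch of that general idea, not the paper's implementation. The class `AlternativeNode`, the function `simulate`, the reward values, the epsilon-greedy selection, and the simplified one-step update rule are all assumptions made for illustration; none of them are taken from the article.

```python
import random


class AlternativeNode:
    """Hypothetical node holding several alternative actions, one Q value each."""

    def __init__(self, n_alternatives, alpha=0.1, epsilon=0.1):
        self.q = [0.0] * n_alternatives  # a compact Q table: one entry per alternative
        self.alpha = alpha               # learning rate (assumed value)
        self.epsilon = epsilon           # exploration rate (assumed value)

    def greedy(self):
        # Index of the alternative with the highest Q value (first one on ties).
        return max(range(len(self.q)), key=self.q.__getitem__)

    def select(self, rng):
        # Epsilon-greedy choice between the alternatives.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.q))
        return self.greedy()

    def update(self, choice, reward):
        # Simplified one-step update toward the observed reward
        # (an assumption; no bootstrapping from a successor node).
        self.q[choice] += self.alpha * (reward - self.q[choice])


def simulate(steps_before=2000, steps_after=2000, seed=0):
    """Return the greedy alternative before and after a simulated sensor fault."""
    rng = random.Random(seed)
    node = AlternativeNode(n_alternatives=2)

    # Invented rewards: alternative 0 relies on sensor A, alternative 1 on sensor B.
    healthy_rewards = [1.0, 0.5]
    faulty_rewards = [0.0, 0.5]  # sensor A breaks, so alternative 0 stops paying off

    for _ in range(steps_before):
        choice = node.select(rng)
        node.update(choice, healthy_rewards[choice])
    greedy_before = node.greedy()

    for _ in range(steps_after):
        choice = node.select(rng)
        node.update(choice, faulty_rewards[choice])
    greedy_after = node.greedy()

    return greedy_before, greedy_after


if __name__ == "__main__":
    before, after = simulate()
    print("greedy alternative before fault:", before)
    print("greedy alternative after fault:", after)
```

In this sketch the fault is never signalled explicitly. The Q value of the alternative that depends on the broken sensor simply decays as its reward drops, and the greedy choice moves to the other alternative once its Q value becomes larger. Because the table holds only one value per alternative, few updates are needed before the switch happens.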