A Policy-Improving System with a Mixture of Bayesian Networks Adapting Agents to Continuously Changing Environments
Saved in:
Main authors:
Format: Conference paper
Language: English
Subjects:
Online access: Order full text
Abstract: A variety of adaptive learning systems that adapt themselves to complicated environments have been studied and developed in the broad field of AI research. For example, many reinforcement learning (RL) methods have been proposed to adapt agents to their environments. At the same time, the Bayesian network (BN), a stochastic model, has attracted increasing attention due to its noise robustness, reasoning power, and related properties. We have proposed a system that improves RL agents' policies with a mixture model of BNs, and we have evaluated its adaptation performance. Each BN structure can be regarded as a stochastic knowledge representation of the policy acquired through RL. It has been confirmed that agents using our system could improve their policies with information derived from the mixture and could thereby adapt adequately to dynamically switched environments. In this research, we propose a method to appropriately normalize the mixing parameters of the mixture for use in common adaptive learning systems, and we evaluate the fundamental performance of our system in continuously changing environments.
DOI: 10.1109/SICE.2006.315202
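
This record contains only the abstract, so the paper's concrete BN structures, mixture update rule, and normalization scheme are not given here. The Python sketch below is only an illustration of the general idea under stated assumptions: each mixture component is a stand-in conditional distribution P(action | state), as a BN learned from an acquired policy might induce, and the mixing parameters are sum-normalized before being used as probabilities by a simple action selector. All names and values are hypothetical and are not taken from the paper.

```python
# Minimal sketch (assumptions, not the authors' method): mixture components are
# stand-in P(action | state) tables, and mixing parameters are normalized to sum
# to one so the mixture can be sampled like an ordinary stochastic policy.
import numpy as np

N_STATES, N_ACTIONS = 5, 3
rng = np.random.default_rng(0)

# Hypothetical component policies, e.g. distilled from BNs learned on past tasks.
components = [rng.dirichlet(np.ones(N_ACTIONS), size=N_STATES) for _ in range(4)]

# Unnormalized mixing parameters, e.g. how well each component explains recent experience.
raw_weights = np.array([0.2, 1.5, 0.7, 0.1])


def normalize(weights):
    """Rescale mixing parameters so they sum to one (simple sum-normalization)."""
    total = weights.sum()
    if total <= 0.0:
        return np.full(len(weights), 1.0 / len(weights))
    return weights / total


def mixture_policy(state, components, weights):
    """Combine the components' action distributions with normalized mixing weights."""
    w = normalize(weights)
    probs = sum(wi * comp[state] for wi, comp in zip(w, components))
    return probs / probs.sum()  # guard against floating-point drift


def select_action(state, components, weights):
    """Sample an action from the mixture policy for the given state."""
    return rng.choice(N_ACTIONS, p=mixture_policy(state, components, weights))


if __name__ == "__main__":
    for s in range(N_STATES):
        print(s, mixture_policy(s, components, raw_weights), select_action(s, components, raw_weights))
```

In this sketch the normalization is a plain sum-to-one rescaling; the paper's own normalization method for the mixing parameters may differ, since the abstract does not specify it.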