Systems Control With Generalized Probabilistic Fuzzy-Reinforcement Learning

Bibliographic Details
Published in:	IEEE transactions on fuzzy systems 2011-02, Vol.19 (1), p.51-64
Main authors:	Hinojosa, W M; Nefti, S; Kaymak, U
Format:	Article
Language:	English
Subjects:	Actor-critic (AC); Control systems; Convergence; Function approximation; Fuzzy logic; Fuzzy systems; Learning; learning agent; probabilistic fuzzy systems; Probabilistic logic; Probabilistic methods; Probability theory; Reinforcement; reinforcement learning (RL); Stochastic processes; Studies; systems control; Uncertainty

Description
Abstract:	Reinforcement learning (RL) is a valuable learning method when the systems require a selection of control actions whose consequences emerge over long periods for which input-output data are not available. In most combinations of fuzzy systems and RL, the environment is considered to be deterministic. In many problems, however, the consequence of an action may be uncertain or stochastic in nature. In this paper, we propose a novel RL approach to combine the universal-function-approximation capability of fuzzy systems with consideration of probability distributions over possible consequences of an action. The proposed generalized probabilistic fuzzy RL (GPFRL) method is a modified version of the actor-critic (AC) learning architecture. The learning is enhanced by the introduction of a probability measure into the learning structure, where an incremental gradient-descent weight-updating algorithm provides convergence. Our results show that the proposed approach is robust under probabilistic uncertainty while also having an enhanced learning speed and good overall performance.
ISSN:	1063-6706 1941-0034
DOI:	10.1109/TFUZZ.2010.2081994