Q-Learning with Continuous State Spaces and Finite Decision Set

Bibliographic Details
Main authors:	Barty, K., Girardeau, P., Roy, J.-S., Strugarek, C.
Format:	Conference proceeding
Language:	English
Subjects:	Approximation algorithms; Costs; Dynamic programming; Kernel; Learning; Random variables; Recursive estimation; State-space methods; Stochastic processes; Uncertainty

Description
Summary:	This paper presents an original technique for computing the optimal policy of a Markov decision problem with a continuous state space and discrete decision variables. We propose an extension of the Q-learning algorithm, introduced in 1989 by Watkins for discrete Markov decision problems. Our algorithm relies on stochastic approximation and functional estimation, and uses kernels to locally update the Q-functions. Under mild assumptions, we state a convergence theorem for this algorithm. Finally, we illustrate our algorithm by solving two classical problems: the mountain car task and the puddle world task.
ISSN:	2325-1824, 2325-1867
DOI:	10.1109/ADPRL.2007.368209
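
Illustration:	The idea named in the summary, a Q-learning step whose correction is spread locally around the visited state by a kernel, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' algorithm: the grid storage with linear interpolation, the Gaussian kernel, the constant step size and bandwidth, the reward-maximizing convention, and the toy one-dimensional environment (`step_env`) are all hypothetical choices made here. The record does not give the paper's update rule or the assumptions of its convergence theorem, so this sketch carries no convergence guarantee.

```python
import math
import random


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel, equal to 1 at u = 0."""
    return math.exp(-0.5 * u * u)


class KernelQ:
    """Q-functions for a 1-D state in [lo, hi] and a finite action set.

    Assumption of this sketch: each Q(., a) is stored by its values on a
    fixed grid and read back by linear interpolation.
    """

    def __init__(self, lo, hi, n_points, n_actions, bandwidth):
        self.lo, self.hi = lo, hi
        self.grid = [lo + (hi - lo) * j / (n_points - 1) for j in range(n_points)]
        self.q = [[0.0] * n_points for _ in range(n_actions)]
        self.bandwidth = bandwidth

    def value(self, x, a):
        """Linearly interpolate Q(x, a) between the two nearest grid points."""
        x = min(max(x, self.lo), self.hi)
        pos = (x - self.lo) / (self.hi - self.lo) * (len(self.grid) - 1)
        j = min(int(pos), len(self.grid) - 2)
        t = pos - j
        return (1.0 - t) * self.q[a][j] + t * self.q[a][j + 1]

    def best(self, x):
        """Return (greedy action, max over actions of Q(x, a))."""
        values = [self.value(x, a) for a in range(len(self.q))]
        a = max(range(len(values)), key=values.__getitem__)
        return a, values[a]

    def update(self, x, a, reward, x_next, done, gamma, step):
        """One Q-learning step; the correction is weighted by the kernel around x."""
        target = reward if done else reward + gamma * self.best(x_next)[1]
        delta = target - self.value(x, a)
        for j, g in enumerate(self.grid):
            self.q[a][j] += step * delta * gaussian_kernel((g - x) / self.bandwidth)
        return delta


def step_env(x, a, rng):
    """Hypothetical toy dynamics: action 1 moves right, action 0 moves left.

    Reaching x >= 1 ends the episode with reward 1; all other rewards are 0.
    """
    move = 0.1 if a == 1 else -0.1
    x_next = max(0.0, x + move + rng.uniform(-0.02, 0.02))
    done = x_next >= 1.0
    return x_next, (1.0 if done else 0.0), done


def train(episodes=2000, seed=0):
    """Train with purely random exploration (Q-learning is off-policy)."""
    rng = random.Random(seed)
    q = KernelQ(lo=0.0, hi=1.0, n_points=21, n_actions=2, bandwidth=0.05)
    for _ in range(episodes):
        x = rng.random()
        for _ in range(30):  # cap on episode length
            a = rng.randrange(2)
            x_next, reward, done = step_env(x, a, rng)
            q.update(x, a, reward, x_next, done, gamma=0.9, step=0.2)
            if done:
                break
            x = x_next
    return q
```

Calling `train()` returns a `KernelQ` object, and `q.best(x)[0]` reads off the greedy action at a state `x`. The two benchmark tasks named in the summary have two-dimensional states, so applying the same idea to them would need a multivariate kernel and grid, which this sketch leaves out.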