A probabilistic analysis of bias optimality in unichain Markov decision processes

Focuses on bias optimality in unichain, finite state, and action-space Markov decision processes. Using relative value functions, we present methods for evaluating optimal bias, this leads to a probabilistic analysis which transforms the original reward problem into a minimum average cost problem. T...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on automatic control 2001-01, Vol.46 (1), p.96-100
Hauptverfasser: Lewis, M.E., Puterman, M.L.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Focuses on bias optimality in unichain, finite state, and action-space Markov decision processes. Using relative value functions, we present methods for evaluating optimal bias, this leads to a probabilistic analysis which transforms the original reward problem into a minimum average cost problem. The result is an explanation of how and why bias implicitly discounts future rewards.
ISSN:0018-9286
1558-2523
DOI:10.1109/9.898698