Characterization of the optimal average cost in Markov decision chains driven by a risk-seeking controller

Bibliographic details
Published in:	Journal of applied probability 2024-03, Vol.61 (1), p.340-367
Main authors:	Cavazos-Cadena, Rolando; Cruz-Suárez, Hugo; Montes-de-Oca, Raúl
Format:	Article
Language:	English
Subjects:	Cost function; Dynamical systems; Expected values; Markov chains; Mathematical functions; Original Article; Random variables; Risk communication; Utility functions
Online access:	Full text

Description
Abstract:	This work concerns Markov decision chains on a denumerable state space endowed with a bounded cost function. The performance of a control policy is assessed by a long-run average criterion as measured by a risk-seeking decision maker with constant risk-sensitivity. Besides standard continuity–compactness conditions, the framework of the paper is determined by the following conditions: (i) the state process is communicating under each stationary policy, and (ii) the simultaneous Doeblin condition holds. Within this framework it is shown that (i) the optimal superior and inferior limit average value functions coincide and are constant, and (ii) the optimal average cost is characterized via an extended version of the Collatz–Wielandt formula in the theory of positive matrices.
ISSN:	0021-9002, 1475-6072
DOI:	10.1017/jpr.2023.40