Empirical measure large deviations for reinforced chains on finite spaces

Bibliographic details
Published in:	Systems & Control Letters, 2022-11, Vol. 169, Article 105379
Main authors:	Budhiraja, Amarjit; Waterbury, Adam
Format:	Article
Language:	English
Subjects:	Empirical measure; Infinite horizon discounted cost; Large deviation principle; Reinforced random walks; Stochastic approximation; Time-reversal
Online access:	Full text

Description
Abstract:	Let $A$ be a transition probability kernel on a finite state space $\Delta^o = \{1, \ldots, d\}$ such that $A(x,y) > 0$ for all $x, y \in \Delta^o$. Consider a reinforced chain given as a sequence $\{X_n, n \in \mathbb{N}_0\}$ of $\Delta^o$-valued random variables, defined recursively according to $L^n = \frac{1}{n} \sum_{i=0}^{n-1} \delta_{X_i}$, $P(X_n \in \cdot \mid X_0, \ldots, X_{n-1}) = L^n A(\cdot)$. We establish a large deviation principle for $\{L^n, n \in \mathbb{N}\}$. The rate function takes a strikingly different form than the Donsker–Varadhan rate function associated with the empirical measure of the Markov chain with transition kernel $A$, and is described in terms of a novel deterministic infinite horizon discounted cost control problem with associated linear controlled dynamics and a nonlinear running cost involving the relative entropy function. Proofs are based on an analysis of the time-reversal of controlled dynamics in representations for log-transforms of exponential moments, and on weak convergence methods.
ISSN:	0167-6911 1872-7956
DOI:	10.1016/j.sysconle.2022.105379
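
Illustration:	The recursion in the abstract can be simulated directly. The following is a minimal sketch, not taken from the article: the function name `simulate_reinforced_chain`, the example kernel `A_example`, the initial state, and the seed are all hypothetical choices made for illustration, and states are labeled 0, …, d−1 rather than 1, …, d. At each step the next state is drawn from the mixture L^n A, that is, the rows of A averaged with weights given by the current empirical measure.

```python
import random


def simulate_reinforced_chain(A, n_steps, x0=0, seed=0):
    """Simulate X_0, ..., X_{n_steps-1} and return the empirical measure L^{n_steps}.

    A is a d x d row-stochastic matrix (list of lists) with positive entries.
    States are labeled 0..d-1 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    d = len(A)
    counts = [0] * d          # counts[z] = number of visits to z so far
    x = x0
    counts[x] += 1            # X_0 is the only point in L^1
    for n in range(1, n_steps):
        # counts now covers X_0..X_{n-1}, so L^n = counts / n.
        # Law of X_n given the past: (L^n A)(y) = sum_z L^n(z) * A[z][y].
        probs = [sum(counts[z] * A[z][y] for z in range(d)) / n for y in range(d)]
        x = rng.choices(range(d), weights=probs)[0]
        counts[x] += 1
    return [c / n_steps for c in counts]


# Hypothetical 2-state kernel with strictly positive entries.
A_example = [[0.5, 0.5],
             [0.2, 0.8]]
L = simulate_reinforced_chain(A_example, n_steps=1000)
```

The returned list `L` is a probability vector on the d states. The sketch only generates sample paths of the empirical measure; it does not compute the rate function described in the abstract.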