Empirical measure large deviations for reinforced chains on finite spaces

Bibliographic details
Published in: Systems & control letters 2022-11, Vol.169, p.105379, Article 105379
Main authors: Budhiraja, Amarjit; Waterbury, Adam
Format: Article
Language: English
Description
Summary: Let $A$ be a transition probability kernel on a finite state space $\Delta^o = \{1, \ldots, d\}$ such that $A(x,y) > 0$ for all $x, y \in \Delta^o$. Consider a reinforced chain given as a sequence $\{X_n, n \in \mathbb{N}_0\}$ of $\Delta^o$-valued random variables, defined recursively according to $L_n = \frac{1}{n} \sum_{i=0}^{n-1} \delta_{X_i}$, $P(X_n \in \cdot \mid X_0, \ldots, X_{n-1}) = L_n A(\cdot)$. We establish a large deviation principle for $\{L_n, n \in \mathbb{N}\}$. The rate function takes a strikingly different form from the Donsker–Varadhan rate function associated with the empirical measure of the Markov chain with transition kernel $A$. It is described in terms of a novel deterministic infinite-horizon discounted-cost control problem with associated linear controlled dynamics and a nonlinear running cost involving the relative entropy function. Proofs are based on an analysis of the time reversal of controlled dynamics in representations for log-transforms of exponential moments, and on weak convergence methods.
ISSN: 0167-6911, 1872-7956
DOI: 10.1016/j.sysconle.2022.105379
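
The recursion in the summary can be illustrated with a short simulation. The following is a minimal sketch, not taken from the article: the function name `simulate_reinforced_chain`, the zero-based state labels, the fixed initial state `x0`, and the example kernel are all assumptions made for illustration. At each step the next state is drawn from the mixture $L_n A$, where $L_n$ is the empirical measure of the states visited so far.

```python
import random


def simulate_reinforced_chain(A, n_steps, x0=0, seed=0):
    """Simulate X_0, ..., X_{n_steps-1} and return the empirical measure L_{n_steps}.

    A is a d x d row-stochastic matrix given as a list of lists (hypothetical input
    format); states are labelled 0, ..., d-1 here instead of 1, ..., d.
    """
    rng = random.Random(seed)
    d = len(A)
    counts = [0] * d
    counts[x0] = 1  # X_0 = x0 (assumed deterministic start), so L_1 = delta_{x0}
    for n in range(1, n_steps):
        # Law of X_n given the past: (L_n A)(y) = sum_x L_n(x) A(x, y),
        # with L_n(x) = counts[x] / n.
        p = [sum(counts[x] * A[x][y] for x in range(d)) / n for y in range(d)]
        x_n = rng.choices(range(d), weights=p)[0]
        counts[x_n] += 1
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    # Example kernel with strictly positive entries, as the summary requires.
    A = [[0.5, 0.5], [0.2, 0.8]]
    print(simulate_reinforced_chain(A, n_steps=10_000))
```

Only the visit counts are stored, since the conditional law of $X_n$ depends on the past through $L_n$ alone. The sketch illustrates the dynamics only and does not compute the rate function described in the summary.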