Adapting to Mixing Time in Stochastic Optimization with Markovian Data
We consider stochastic optimization problems where data is drawn from a Markov chain. Existing methods for this setting crucially rely on knowing the mixing time of the chain, which in real-world applications is usually unknown. We propose the first optimization method that does not require the know...
Gespeichert in:
Hauptverfasser: | , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | We consider stochastic optimization problems where data is drawn from a
Markov chain. Existing methods for this setting crucially rely on knowing the
mixing time of the chain, which in real-world applications is usually unknown.
We propose the first optimization method that does not require the knowledge of
the mixing time, yet obtains the optimal asymptotic convergence rate when
applied to convex problems. We further show that our approach can be extended
to: (i) finding stationary points in non-convex optimization with Markovian
data, and (ii) obtaining better dependence on the mixing time in temporal
difference (TD) learning; in both cases, our method is completely oblivious to
the mixing time. Our method relies on a novel combination of multi-level Monte
Carlo (MLMC) gradient estimation together with an adaptive learning method. |
---|---|
DOI: | 10.48550/arxiv.2202.04428 |