Lipschitz Lifelong Reinforcement Learning

We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks. We introduce a novel metric between Markov Decision Processes (MDPs) and establish that close MDPs have close optimal value functions. Formally, the optimal value functions are Lipsch...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Lecarpentier, Erwan, Abel, David, Asadi, Kavosh, Jinnai, Yuu, Rachelson, Emmanuel, Littman, Michael L
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!