Time-Variant Variational Transfer for Value Functions
Format: Article
Language: English
Abstract: In most transfer learning approaches to reinforcement learning (RL), the distribution over tasks is assumed to be stationary; the target and source tasks are therefore i.i.d. samples of the same distribution. In this work, we consider the problem of transferring value functions through a variational method when the distribution that generates the tasks is time-variant, and we propose a solution that leverages the temporal structure inherent in the task-generating process. Furthermore, by means of a finite-sample analysis, we theoretically compare the proposed solution to its time-invariant counterpart. Finally, we provide an experimental evaluation of the proposed technique under three distinct temporal dynamics in three different RL environments.
DOI: 10.48550/arxiv.2005.12864
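The abstract describes transferring value functions via a variational method when the task-generating distribution drifts over time. As a rough illustration only, not the authors' algorithm, the sketch below contrasts a time-variant Gaussian prior over value-function weights, fitted with exponential forgetting over the source tasks, with a stationary one, and shows how such a prior could regularize a TD loss on the target task. The forgetting factor `lam`, the linear Q-function parameterization, and the synthetic drifting tasks are all assumptions made for this example.

```python
# Illustrative sketch only: a time-variant vs. time-invariant Gaussian prior over
# value-function weights, and a prior-regularized TD loss for the target task.
# The exponential-forgetting scheme, linear Q-function, and synthetic data are
# assumptions made for this example, not the method from the paper.
import numpy as np

def time_variant_prior(source_weights, lam=0.9):
    """Gaussian prior over value-function weights that discounts older source
    tasks exponentially, reflecting a drifting task-generating process.
    source_weights: (T, d) array ordered from oldest to newest task."""
    T, d = source_weights.shape
    w = lam ** np.arange(T - 1, -1, -1)      # most recent task gets weight 1
    w /= w.sum()
    mean = np.average(source_weights, axis=0, weights=w)
    diff = source_weights - mean
    cov = (w[:, None] * diff).T @ diff + 1e-6 * np.eye(d)
    return mean, cov

def stationary_prior(source_weights):
    """Time-invariant baseline: every source task weighted equally."""
    d = source_weights.shape[1]
    mean = source_weights.mean(axis=0)
    cov = np.cov(source_weights, rowvar=False) + 1e-6 * np.eye(d)
    return mean, cov

def regularized_td_loss(theta, phi_sa, r, phi_next, prior_mean, prior_prec,
                        gamma=0.99, beta=0.1):
    """Squared TD error of a linear Q-function plus a quadratic penalty pulling
    the target-task weights toward the transferred prior.
    phi_sa: (N, d) features of the visited state-action pairs.
    phi_next: (N, A, d) features of the next state paired with every action."""
    q_sa = phi_sa @ theta                        # Q(s, a) for the taken actions
    q_next = np.max(phi_next @ theta, axis=-1)   # greedy bootstrap at s'
    td_err = r + gamma * q_next - q_sa
    diff = theta - prior_mean
    return np.mean(td_err ** 2) + beta * diff @ prior_prec @ diff

# Synthetic example: source-task weights that drift slowly over time.
rng = np.random.default_rng(0)
T, d, A, N = 20, 5, 3, 64
source_weights = np.cumsum(0.1 * rng.standard_normal((T, d)), axis=0)

mu_tv, cov_tv = time_variant_prior(source_weights)   # tracks recent tasks
mu_st, cov_st = stationary_prior(source_weights)     # averages over all tasks

phi_sa = rng.standard_normal((N, d))
r = rng.standard_normal(N)
phi_next = rng.standard_normal((N, A, d))
theta = mu_tv.copy()                                  # warm start from the prior
loss = regularized_td_loss(theta, phi_sa, r, phi_next, mu_tv, np.linalg.inv(cov_tv))
print(f"regularized TD loss at the prior mean: {loss:.3f}")
```

The intended contrast in this sketch is that, when the source-task weights drift, the exponentially discounted prior stays centered near the most recent tasks, whereas the stationary prior averages over the whole history.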