K-mixup: Data augmentation for offline reinforcement learning using mixup in a Koopman invariant subspace
| Published in: | Expert Systems with Applications, 2023-09, Vol. 225, p. 120136, Article 120136 |
|---|---|
| Main authors: | , , |
| Format: | Article |
| Language: | English |
| Online access: | Full text |
| Abstract: | In this study, we propose a new data augmentation technique, Koopman-mixup (K-mixup), to improve the learning stability and final performance of offline reinforcement learning (RL) algorithms. K-mixup learns a Koopman invariant subspace to incorporate mixup augmentation, commonly used for classification tasks, into an RL framework. Mixup augmentation itself is known to be incompatible with RL because RL generally uses nonlinearly propagating state-based sequential inputs, whereas mixup relies on linear interpolation between a pair of inputs. To resolve the problem, Koopman embedding is used to convert a nonlinear system to a linear system, allowing successful mixup on arbitrary data pairs in any dataset. We evaluate the performance of K-mixup on several OpenAI Gym benchmark control simulations and compare it with the performance of other data augmentation methods. The comparison shows that only the proposed K-mixup consistently outperforms the base offline RL algorithm (CQL). |
| ISSN: | 0957-4174, 1873-6793 |
| DOI: | 10.1016/j.eswa.2023.120136 |
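
The abstract's core idea is that mixup's linear interpolation, which is ill-suited to nonlinearly evolving RL state sequences, becomes valid once transitions are lifted into a latent space where the dynamics are (approximately) linear. The sketch below illustrates this under stated assumptions; it is not the authors' implementation. The encoder/decoder architecture, latent dimension, and the `k_mixup` helper are all hypothetical names chosen for illustration.

```python
# Minimal sketch (assumed, not the paper's code) of mixup in a
# Koopman-style latent space: lift states with a learned encoder,
# interpolate linearly there, and decode back to the state space.
import numpy as np
import torch
import torch.nn as nn

STATE_DIM, LATENT_DIM = 4, 16  # illustrative sizes

class KoopmanAE(nn.Module):
    """Autoencoder whose latent transition is a single linear map K,
    approximating a Koopman invariant subspace of the dynamics."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, LATENT_DIM))
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, STATE_DIM))
        # Linear latent dynamics: z_{t+1} ~= K z_t
        self.K = nn.Linear(LATENT_DIM, LATENT_DIM, bias=False)

    def forward(self, s, s_next):
        z, z_next = self.enc(s), self.enc(s_next)
        recon = self.dec(z)              # reconstruction target: s
        pred_next = self.dec(self.K(z))  # linear-prediction target: s_next
        return recon, pred_next, z, z_next

def k_mixup(model, s_a, s_next_a, s_b, s_next_b, alpha=0.4):
    """Mix two transitions in the latent space and decode the result.
    Because the latent dynamics are linear, a convex combination of two
    valid latent transitions is itself (approximately) a valid transition."""
    lam = float(np.random.beta(alpha, alpha))  # standard mixup coefficient
    with torch.no_grad():
        z = lam * model.enc(s_a) + (1 - lam) * model.enc(s_b)
        z_next = lam * model.enc(s_next_a) + (1 - lam) * model.enc(s_next_b)
        return model.dec(z), model.dec(z_next), lam
```

In such a setup, the encoder, decoder, and `K` would first be trained on the offline dataset (e.g., with reconstruction and one-step linear-prediction losses), after which `k_mixup` generates synthetic transitions; presumably the associated rewards and actions are interpolated with the same `lam`, as in standard mixup, and the augmented dataset is then fed to the base offline RL algorithm (CQL in the paper's experiments).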