K-mixup: Data augmentation for offline reinforcement learning using mixup in a Koopman invariant subspace

Bibliographic details
Published in: Expert Systems with Applications, 2023-09, Vol. 225, Article 120136
Authors: Jang, Junwoo; Han, Jungwoo; Kim, Jinwhan
Format: Article
Language: English
Online access: Full text
Description
Abstract: In this study, we propose a new data augmentation technique, Koopman-mixup (K-mixup), to improve the learning stability and final performance of offline reinforcement learning (RL) algorithms. K-mixup learns a Koopman invariant subspace in order to incorporate mixup augmentation, commonly used for classification tasks, into an RL framework. Mixup by itself is incompatible with RL because RL operates on sequential state inputs that propagate nonlinearly, whereas mixup relies on linear interpolation between pairs of inputs. To resolve this, a Koopman embedding is used to convert the nonlinear system into a linear one, allowing mixup to be applied successfully to arbitrary data pairs in any dataset. We evaluate K-mixup on several OpenAI Gym benchmark control simulations and compare it with other data augmentation methods. The comparison shows that only the proposed K-mixup consistently outperforms the base offline RL algorithm, conservative Q-learning (CQL).
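The mechanism the abstract describes, embedding states into a subspace where the dynamics act linearly so that mixup's convex combinations of transitions remain dynamically consistent, can be sketched briefly. The following PyTorch code is a minimal illustration under our own assumptions, not the authors' implementation: the encoder/decoder architecture, latent dimension, loss weighting, and the Beta(α, α) mixing coefficient (standard practice in mixup) are all illustrative choices.

```python
# Minimal, hypothetical sketch of mixup in a learned Koopman-style latent
# space. Not the paper's code; architecture and hyperparameters are assumed.
import torch
import torch.nn as nn

class KoopmanEncoder(nn.Module):
    """Maps states to a latent space where one-step dynamics act linearly."""
    def __init__(self, state_dim: int, latent_dim: int):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        self.decode = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, state_dim))
        # Linear Koopman operator acting on the latent coordinates.
        self.K = nn.Linear(latent_dim, latent_dim, bias=False)

def koopman_loss(model, s, s_next):
    """Reconstruction plus linear one-step prediction in latent space."""
    z, z_next = model.encode(s), model.encode(s_next)
    recon = ((model.decode(z) - s) ** 2).mean()
    linear_pred = ((model.K(z) - z_next) ** 2).mean()
    return recon + linear_pred  # equal weighting is an assumption

def k_mixup(model, s_a, s_next_a, s_b, s_next_b, alpha=0.4):
    """Mix two transitions in the learned (approximately linear) subspace.

    Because the latent dynamics are linear, a convex combination of two
    embedded transitions is itself approximately a valid transition, which
    is what makes mixup applicable to sequential RL data.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    with torch.no_grad():
        z = lam * model.encode(s_a) + (1 - lam) * model.encode(s_b)
        z_next = lam * model.encode(s_next_a) + (1 - lam) * model.encode(s_next_b)
        # Decode back to state space for the downstream offline RL learner.
        return model.decode(z), model.decode(z_next), lam
```

Augmented state pairs produced this way would then be fed, together with λ-mixed actions and rewards, to the base offline RL algorithm (CQL in the paper's experiments); that pairing step is likewise our assumption about the pipeline, not a detail given in the abstract.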
ISSN: 0957-4174
eISSN: 1873-6793
DOI: 10.1016/j.eswa.2023.120136