Hypothesis Network Planned Exploration for Rapid Meta-Reinforcement Learning Adaptation

Meta Reinforcement Learning (Meta RL) trains agents that adapt to fast-changing environments and tasks. Current strategies often lose adaption efficiency due to the passive nature of model exploration, causing delayed understanding of new transition dynamics. This results in particularly fast-evolvi...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	arXiv.org 2023-11
Hauptverfasser:	Jacobson, Maxwell Joseph, Xue, Yexiang
Format:	Artikel
Sprache:	eng
Schlagworte:	Adaptation Changing environments Evolution Hypotheses Model accuracy
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Meta Reinforcement Learning (Meta RL) trains agents that adapt to fast-changing environments and tasks. Current strategies often lose adaption efficiency due to the passive nature of model exploration, causing delayed understanding of new transition dynamics. This results in particularly fast-evolving tasks being impossible to solve. We propose a novel approach, Hypothesis Network Planned Exploration (HyPE), that integrates an active and planned exploration process via the hypothesis network to optimize adaptation speed. HyPE uses a generative hypothesis network to form potential models of state transition dynamics, then eliminates incorrect models through strategically devised experiments. Evaluated on a symbolic version of the Alchemy game, HyPE outpaces baseline methods in adaptation speed and model accuracy, validating its potential in enhancing reinforcement learning adaptation in rapidly evolving settings.
ISSN:	2331-8422