World Programs for Model-Based Learning and Planning in Compositional State and Action Spaces

Some of the most important tasks take place in environments which lack cheap and perfect simulators, thus hampering the application of model-free reinforcement learning (RL). While model-based RL aims to learn a dynamics model, in a more general case the learner does not know a priori what the actio...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	arXiv.org 2019-12
1. Verfasser:	Marwin H S Segler
Format:	Artikel
Sprache:	eng
Schlagworte:	Computer simulation Flight simulators Learning Retailing Task complexity
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Some of the most important tasks take place in environments which lack cheap and perfect simulators, thus hampering the application of model-free reinforcement learning (RL). While model-based RL aims to learn a dynamics model, in a more general case the learner does not know a priori what the action space is. Here we propose a formalism where the learner induces a world program by learning a dynamics model and the actions in graph-based compositional environments by observing state-state transition examples. Then, the learner can perform RL with the world program as the simulator for complex planning tasks. We highlight a recent application, and propose a challenge for the community to assess world program-based planning.
ISSN:	2331-8422