Composing Synergistic Macro Actions for Reinforcement Learning Agents

Macro actions have been demonstrated to be beneficial for the learning processes of an agent and have encouraged a variety of techniques to be developed for constructing more effective ones. However, previous techniques usually do not further consider combining macro actions to form a synergistic ma...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transaction on neural networks and learning systems 2024-05, Vol.35 (5), p.7251-7258
Hauptverfasser: Chen, Yu-Ming, Chang, Kaun-Yu, Liu, Chien, Hsiao, Tsu-Ching, Hong, Zhang-Wei, Lee, Chun-Yi
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Macro actions have been demonstrated to be beneficial for the learning processes of an agent and have encouraged a variety of techniques to be developed for constructing more effective ones. However, previous techniques usually do not further consider combining macro actions to form a synergistic macro action ensemble, in which synergism exhibits when the constituent macro actions are favorable to be jointly used by an agent during evaluation. Such a synergistic macro action ensemble may potentially allow an agent to perform even better than the individual macro actions within it. Motivated by the recent advances of neural architecture search (NAS), in this brief, we formulate the construction of a synergistic macro action ensemble as a Markov decision process (MDP) and evaluate the constructed macro action ensemble as a whole. Such a problem formulation enables synergism to be taken into account by the proposed evaluation procedure. Our experimental results demonstrate that the proposed framework is able to discover the synergistic macro action ensembles. Furthermore, we also highlight the benefits of these macro action ensembles through a set of analytical cases.
ISSN:2162-237X
2162-2388
DOI:10.1109/TNNLS.2022.3213606