MODEL BASED REINFORCEMENT LEARNING BASED ON GENERALIZED HIDDEN PARAMETER MARKOV DECISION PROCESSES
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | A machine learning model for reinforcement learning uses parameterized families of Markov decision processes (MDPs) with latent variables. The system uses the latent variables to improve the ability of models to transfer knowledge and generalize to new tasks. Accordingly, trained machine-learning-based models are able to work in unseen environments or in combinations of conditions and factors that they were never trained on. For example, robots or self-driving vehicles built on such models are robust to changing goals and can adapt flexibly to novel reward functions or tasks, while transferring knowledge about environments and agents to new tasks. |
---|
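
The idea of a parameterized family of MDPs indexed by latent variables can be illustrated with a minimal sketch. This is not the patented method: the toy one-dimensional environment, the split into one latent for dynamics and one for the goal, and all names (`Latents`, `step`, `greedy_action`, `rollout`) are assumptions made here for illustration only. The latents are also given directly rather than inferred from data, which a learned model would have to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Latents:
    """Hypothetical latent variables selecting one MDP from the family."""
    dynamics: float  # assumed: scales how far an action moves the agent
    goal: float      # assumed: target position that the reward favors


def step(state: float, action: float, z: Latents) -> tuple:
    """One transition of the MDP selected by z: returns (next_state, reward)."""
    next_state = state + z.dynamics * action
    reward = -abs(next_state - z.goal)
    return next_state, reward


def greedy_action(state: float, z: Latents, candidates=(-1.0, 0.0, 1.0)) -> float:
    """Pick the candidate action with the highest one-step reward under z."""
    return max(candidates, key=lambda a: step(state, a, z)[1])


def rollout(z: Latents, start: float = 0.0, horizon: int = 5) -> tuple:
    """Run the greedy planner for `horizon` steps; returns (final_state, total_reward)."""
    state, total = start, 0.0
    for _ in range(horizon):
        state, reward = step(state, greedy_action(state, z), z)
        total += reward
    return state, total


# The same transition and planning code handles a latent combination
# that was not used before, simply by swapping in a new Latents value.
print(rollout(Latents(dynamics=0.5, goal=2.0)))
print(rollout(Latents(dynamics=1.0, goal=-3.0)))
```

Because the dynamics and the reward depend on the latents only through `step`, changing the goal or the dynamics requires no change to the planner, which is the kind of transfer the summary describes.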