Meta-Learning through Hebbian Plasticity in Random Networks
Format: Article
Language: English
Online access: Order full text
Published in: Advances in Neural Information Processing Systems (2020)

Abstract: Lifelong learning and adaptability are two defining aspects of biological agents. Modern reinforcement learning (RL) approaches have shown significant progress in solving complex tasks; however, once training is concluded, the solutions found are typically static and incapable of adapting to new information or perturbations. While it is still not completely understood how biological brains learn and adapt so efficiently from experience, it is believed that synaptic plasticity plays a prominent role in this process. Inspired by this biological mechanism, we propose a search method that, instead of optimizing the weight parameters of neural networks directly, searches only for synapse-specific Hebbian learning rules that allow the network to continuously self-organize its weights during the lifetime of the agent. We demonstrate our approach on several reinforcement learning tasks with different sensory modalities and more than 450K trainable plasticity parameters. We find that, starting from completely random weights, the discovered Hebbian rules enable an agent to navigate a dynamical 2D-pixel environment; likewise, they allow a simulated 3D quadrupedal robot to learn how to walk while adapting to morphological damage not seen during training, in the absence of any explicit reward or error signal, in fewer than 100 timesteps. Code is available at https://github.com/enajx/HebbianMetaLearning.
DOI: 10.48550/arxiv.2007.02686
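
The abstract describes searching over synapse-specific Hebbian learning rules while the network weights themselves start random and only self-organize during the agent's lifetime. As a rough illustration only (not the authors' implementation; the exact rule form, shapes, and constants below are assumptions), a minimal NumPy sketch of a generalized "ABCD" Hebbian update on a single layer might look like this:

```python
# Hypothetical sketch: a generalized Hebbian ("ABCD") rule,
#   dw_ij = eta * (A_ij * o_i * o_j + B_ij * o_i + C_ij * o_j + D_ij),
# where the per-synapse coefficients (A, B, C, D) are the trainable
# plasticity parameters and the weights are never optimized directly.

import numpy as np

rng = np.random.default_rng(0)

# Tiny feedforward layer: n_in inputs -> n_out outputs.
n_in, n_out = 8, 4

# Random initial weights; these self-organize at "runtime" instead of being trained.
W = rng.uniform(-0.1, 0.1, size=(n_in, n_out))

# Per-synapse plasticity parameters; in the setting described in the abstract,
# these would be the quantities found by the outer search (shapes assumed here).
A = rng.normal(0.0, 0.1, size=(n_in, n_out))
B = rng.normal(0.0, 0.1, size=(n_in, n_out))
C = rng.normal(0.0, 0.1, size=(n_in, n_out))
D = rng.normal(0.0, 0.1, size=(n_in, n_out))
eta = 0.01  # learning rate; could also be made synapse-specific

def step(x, W):
    """Forward pass followed by a purely local Hebbian weight update."""
    o_pre = x                        # presynaptic activations
    o_post = np.tanh(x @ W)          # postsynaptic activations
    # Outer product pairs every (pre, post) activation for the local update.
    dW = eta * (A * np.outer(o_pre, o_post)
                + B * o_pre[:, None]
                + C * o_post[None, :]
                + D)
    return o_post, W + dW

# Run the layer for a few "lifetime" timesteps on random inputs.
for t in range(100):
    x = rng.normal(size=n_in)
    out, W = step(x, W)
```

In the meta-learning setup the abstract outlines, an outer loop (e.g. an evolution strategy) would search over plasticity parameters such as A, B, C, and D based on episodic task performance, while the inner loop above runs during the agent's lifetime without any explicit reward or error signal.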