Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Format: Article
Language: English
Online access: Order full text
Abstract: Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies. However, learning a highly performant universal policy requires sophisticated architectures like transformers (TF), which have a larger memory footprint and higher computational cost than simpler multi-layer perceptrons (MLPs). To achieve both TF-level performance and MLP-level efficiency at inference time, we propose HyperDistill, which consists of (1) a morphology-conditioned hypernetwork (HN) that generates robot-wise MLP policies, and (2) a policy distillation approach that is essential for successful training. We show that on UNIMAL, a benchmark with hundreds of diverse morphologies, HyperDistill performs as well as a universal TF teacher policy on both training and unseen test robots, while reducing model size by a factor of 6-14 and computational cost by a factor of 67-160, depending on the environment. Our analysis attributes the inference-time efficiency advantage of HyperDistill to knowledge decoupling, i.e., the ability to decouple inter-task and intra-task knowledge, a general principle that could also be applied to improve inference efficiency in other domains.
DOI: 10.48550/arxiv.2402.06570