Inverse Constraint Learning and Generalization by Transferable Reward Decomposition
Main authors: , ,
Format: Article
Language: English
Online access: Order full text
Abstract: We present the problem of inverse constraint learning (ICL), which recovers constraints from demonstrations to autonomously reproduce constrained skills in new scenarios. However, ICL is ill-posed, which leads to inaccurate inference of constraints from demonstrations. To address this, we introduce a transferable constraint learning (TCL) algorithm that jointly infers a task-oriented reward and a task-agnostic constraint, enabling the generalization of learned skills. TCL additively decomposes the overall reward into a task reward and its residual, treated as soft constraints, and maximizes the policy divergence between task- and constraint-oriented policies to obtain a transferable constraint. Evaluating our method and five baselines in three simulated environments, we show that TCL outperforms state-of-the-art IRL and ICL algorithms, achieving up to $72\%$ higher task-success rates with accurate decomposition than the next best approach in novel scenarios. Further, we demonstrate the robustness of TCL on two real-world robotic tasks.
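The decomposition the abstract describes can be sketched as follows. This formalization is illustrative only: the symbols ($r$, $r_{\mathrm{task}}$, $r_{c}$, $\pi$) and the choice of KL divergence are assumptions for exposition, not taken from the paper itself.

```latex
% Hedged sketch of the additive reward decomposition described in the abstract.
% Symbol names and the KL-divergence choice are illustrative assumptions,
% not the paper's exact formulation. Requires amsmath.
\begin{align*}
  r(s,a) &= r_{\mathrm{task}}(s,a) + r_{c}(s,a)
     && \text{(task reward + residual soft constraint)} \\
  \pi_{\mathrm{task}} &= \arg\max_{\pi}\,
     \mathbb{E}_{\pi}\!\Big[\textstyle\sum_t \gamma^{t}\, r_{\mathrm{task}}(s_t,a_t)\Big]
     && \text{(task-oriented policy)} \\
  \pi_{c} &= \arg\max_{\pi}\,
     \mathbb{E}_{\pi}\!\Big[\textstyle\sum_t \gamma^{t}\, r_{c}(s_t,a_t)\Big]
     && \text{(constraint-oriented policy)} \\
  r_{c}^{\ast} &= \arg\max_{r_{c}}\,
     D_{\mathrm{KL}}\!\big(\pi_{\mathrm{task}} \,\|\, \pi_{c}\big)
     && \text{(maximize policy divergence)}
\end{align*}
```

Intuitively, pushing the two induced policies apart discourages the residual term from absorbing task-specific behavior, which is what would make the learned constraint transferable to new scenarios.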
DOI: 10.48550/arxiv.2306.12357