Robots that Suggest Safe Alternatives
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Goal-conditioned policies, such as those learned via imitation learning,
provide an easy way for humans to influence what tasks robots accomplish.
However, these robot policies are not guaranteed to execute safely or to
succeed when faced with out-of-distribution requests. In this work, we enable
robots to know when they can confidently execute a user's desired goal, and
automatically suggest safe alternatives when they cannot. Our approach is
inspired by control-theoretic safety filtering, wherein a safety filter
minimally adjusts a robot's candidate action to be safe. Our key idea is to
pose alternative suggestion as a safe control problem in goal space, rather
than in action space. Offline, we use reachability analysis to compute a
goal-parameterized reach-avoid value network which quantifies the safety and
liveness of the robot's pre-trained policy. Online, our robot uses the
reach-avoid value network as a safety filter, monitoring the human's given goal
and actively suggesting alternatives that are similar but meet the safety
specification. We demonstrate our Safe ALTernatives (SALT) framework in
simulation experiments with indoor navigation and Franka Panda tabletop
manipulation, and with both discrete and continuous goal representations. We
find that SALT is able to learn to predict successful and failed closed-loop
executions, is a less pessimistic monitor than open-loop uncertainty
quantification, and proposes alternatives that consistently align with those
people find acceptable. |
---|---|
DOI: | 10.48550/arxiv.2409.09883 |
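
The online step described in the summary (monitor the human's goal, then suggest a similar goal that meets the safety specification) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_value` is a hypothetical hand-written stand-in for the learned reach-avoid value network, the sign convention (positive means the policy is expected to reach the goal without violating safety) is an assumption, and goal similarity is assumed to be Euclidean distance over a small discrete candidate set.

```python
import math


def toy_value(state, goal, obstacle=(2.0, 0.0), obstacle_radius=0.5, reach_radius=3.0):
    """Hypothetical stand-in for a goal-parameterized reach-avoid value.

    Assumed convention: a positive value means the goal is both reachable
    (liveness) and clear of the obstacle (safety) from the given state.
    """
    liveness = reach_radius - math.dist(state, goal)
    safety = math.dist(goal, obstacle) - obstacle_radius
    return min(liveness, safety)


def filter_goal(state, user_goal, candidates, value_fn=toy_value, threshold=0.0):
    """Filter in goal space: keep the user's goal if its value is acceptable,
    otherwise return the acceptable candidate closest to it.

    Returns (goal, changed). goal is None when no candidate is acceptable.
    """
    if value_fn(state, user_goal) > threshold:
        return user_goal, False
    acceptable = [g for g in candidates if value_fn(state, g) > threshold]
    if not acceptable:
        return None, True
    closest = min(acceptable, key=lambda g: math.dist(g, user_goal))
    return closest, True


if __name__ == "__main__":
    state = (0.0, 0.0)
    candidates = [(2.0, 1.0), (0.0, 1.0), (4.0, 0.0)]
    # The requested goal sits inside the obstacle, so an alternative is suggested.
    print(filter_goal(state, (2.0, 0.0), candidates))
```

The structure mirrors action-space safety filtering, where a candidate action is adjusted as little as possible; here the adjustment is applied to the goal instead. In a continuous goal space the nearest-candidate search would presumably be replaced by an optimization over goals, which this sketch does not attempt.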