Not All Errors Are Made Equal: A Regret Metric for Detecting System-level Trajectory Prediction Failures
Format: | Article |
Language: | English |
Online access: | Order full text |
Abstract: | Robot decision-making increasingly relies on data-driven human prediction
models when operating around people. While these models are known to mispredict
in out-of-distribution interactions, only a subset of prediction errors impact
downstream robot performance. We propose characterizing such "system-level"
prediction failures via the mathematical notion of regret: high-regret
interactions are precisely those in which mispredictions degraded closed-loop
robot performance. We further introduce a probabilistic generalization of
regret that calibrates failure detection across disparate deployment contexts
and renders regret compatible with reward-based and reward-free (e.g.,
generative) planners. In simulated autonomous driving interactions and social
navigation interactions deployed on hardware, we showcase that our system-level
failure metric can be used offline to automatically extract closed-loop
human-robot interactions that state-of-the-art generative human predictors and
robot planners previously struggled with. We further find that the very
presence of high-regret data during human predictor fine-tuning is highly
predictive of robot re-deployment performance improvements. Fine-tuning with
the informative but significantly smaller high-regret data (23% of deployment
data) is competitive with fine-tuning on the full deployment dataset,
indicating a promising avenue for efficiently mitigating system-level
human-robot interaction failures. Project website:
https://cmu-intentlab.github.io/not-all-errors/ |
DOI: | 10.48550/arxiv.2403.04745 |
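The abstract frames system-level failure detection around regret: an interaction is high-regret when the misprediction itself degraded the robot's closed-loop performance. As a rough illustration of that idea only (the paper gives a formal definition plus a probabilistic generalization; the planner/reward interface and all names below, such as `plan_fn` and `reward_fn`, are assumptions rather than the authors' API), one could score a logged interaction by comparing the reward the robot actually earned under its predictor against the reward of a hindsight plan computed with the human's realized trajectory:

```python
def interaction_regret(reward_fn, plan_fn, robot_state, human_traj, predicted_traj):
    """Score one logged human-robot interaction by (a sketch of) regret.

    Hypothetical interface: `plan_fn(start_state, human_trajectory)` returns the
    robot trajectory the planner would execute given that human trajectory, and
    `reward_fn(robot_trajectory, human_trajectory)` scores the closed-loop outcome
    against the human's realized motion.
    """
    # Plan the robot actually executed: it only had access to the predictor's output.
    executed_plan = plan_fn(robot_state, predicted_traj)
    achieved_reward = reward_fn(executed_plan, human_traj)

    # Hindsight (oracle) plan: re-plan as if the human's realized trajectory were known.
    oracle_plan = plan_fn(robot_state, human_traj)
    oracle_reward = reward_fn(oracle_plan, human_traj)

    # Large values indicate the misprediction materially hurt closed-loop performance;
    # small values mean the error was benign at the system level.
    return oracle_reward - achieved_reward
```

Under this reading, logged interactions could be ranked by `interaction_regret` offline and the top fraction kept as the "high-regret" subset used for predictor fine-tuning, per the abstract's description.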