Observation-Augmented Contextual Multi-Armed Bandits for Robotic Search and Exploration

We introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs) wherein a robot uses extra outcome observations from an external information source, e.g. humans. In OA-CMABs, external observations are a function of context features and thus provide...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE robotics and automation letters 2024-10, Vol.9 (10), p.8531-8538
Hauptverfasser:	Wakayama, Shohei, Ahmed, Nisar
Format:	Artikel
Sprache:	eng
Schlagworte:	Algorithms Bayes methods Bayesian analysis Decision making Error analysis Error correction Free energy human-robot collaboration Multi-armed bandit problems Optimization Parameter robustness Parameter uncertainty Probabilistic inference Probabilistic logic Probabilistic models Robot sensing systems Robustness Semantics Statistical inference Uncertainty
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs) wherein a robot uses extra outcome observations from an external information source, e.g. humans. In OA-CMABs, external observations are a function of context features and thus provide evidence on top of observed option outcomes to infer hidden parameters. However, if external data is error-prone, measures must be taken to preserve the correctness of inference. To this end, we derive a robust Bayesian inference process for OA-CMABs based on recently developed probabilistic semantic data association techniques, which handle complex mixture model parameter priors and hybrid discrete-continuous observation likelihoods for semantic external data sources. To cope with combined uncertainties in OA-CMABs, we also derive a new active inference algorithm for optimal option selection based on approximate expected free energy minimization. This generalizes prior work on CMAB active inference by accounting for faulty observations and non-Gaussian distributions. Results for a simulated deep space search site selection problem show that, even if incorrect semantic observations are provided externally, e.g. by scientists, efficient decision-making and robust parameter inference are still achieved in a wide variety of conditions.
ISSN:	2377-3766 2377-3766
DOI:	10.1109/LRA.2024.3448133