Autonomous mobile acoustic relay positioning as a multi-armed bandit with switching costs

Bibliographic details
Main authors: Cheung, Mei Yi; Leighton, Joshua; Hover, Franz S.
Format: Conference proceedings
Language: English
Description
Summary: Underwater acoustic communication channels display highly variable and stochastic performance, especially in multipath-limited shallow-water and harbor environments. A mobile acoustic node can, however, learn the channel's properties as it moves about. Maximizing the cumulative data transmission through adaptive node positioning is a clean exploitation vs. exploration scenario because learning about poorly characterized locations must be balanced against exploiting known ones. While this problem is well described with the stochastic multi-armed bandit (MAB) formalism, the classical assumption of costless switching is untenable in the field, where slow-moving vehicles often cover large distances. We present a heuristic adaptation to the MAB Gittins index rule with limited policy enumeration to account for switching costs, and describe field experiments conducted in the Charles River (Boston, MA). The field data establish that the MAB and its switching cost extension are tractable in this application, and that performance is consistently superior to that of ϵ-greedy policies.
ISSN: 2153-0858, 2153-0866
DOI: 10.1109/IROS.2013.6696836
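
For readers unfamiliar with the ϵ-greedy baseline named in the summary, the following is a minimal Python sketch of such a policy choosing among candidate relay sites while paying a travel penalty. It is not the authors' Gittins-index heuristic, and it does not reproduce their experiments. The function name, the Bernoulli packet-success model, the one-dimensional site coordinates, and the linear cost per unit distance are all assumptions made for illustration. Note that this naive policy pays the switching cost but ignores it when deciding where to go next, which is the kind of shortcoming a switching-cost-aware rule is meant to address.

```python
import random


def epsilon_greedy_with_switching(success_prob, position, epsilon=0.1,
                                  cost_per_unit=0.0, steps=1000, seed=0):
    """Run a plain epsilon-greedy policy over candidate sites.

    success_prob[i]: assumed probability that a packet sent from site i succeeds.
    position[i]:     assumed 1-D coordinate of site i (hypothetical geometry).
    cost_per_unit:   reward forfeited per unit of distance travelled (assumed linear).
    Returns (net_reward, visits_per_site).
    """
    rng = random.Random(seed)
    n = len(success_prob)
    visits = [0] * n      # number of transmissions attempted at each site
    mean = [0.0] * n      # running estimate of each site's success rate
    current = 0           # the vehicle starts at site 0
    net = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a site uniformly at random.
            nxt = rng.randrange(n)
        else:
            # Exploit: pick the site with the best estimate (ties go to the lowest index).
            nxt = max(range(n), key=lambda i: mean[i])

        # Pay for the move; the decision above did not take this cost into account.
        net -= cost_per_unit * abs(position[nxt] - position[current])
        current = nxt

        # Attempt one transmission and update the running mean incrementally.
        reward = 1.0 if rng.random() < success_prob[current] else 0.0
        visits[current] += 1
        mean[current] += (reward - mean[current]) / visits[current]
        net += reward

    return net, visits
```

Calling the function twice with the same seed, once with `cost_per_unit=0.0` and once with a positive value, gives identical visit counts but a lower net reward in the second run, which isolates how much a cost-blind policy loses to travel.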