Robust Near-Optimal Arm Identification With Strongly-Adaptive Adversaries

Bibliographic details
Published in:	IEEE Transactions on Signal Processing, 2023-01, Vol. 71, p. 1-16
Main authors:	Sridhar, Mayuri; Devadas, Srinivas
Format:	Article
Language:	English
Subjects:	Adaptation models; Algorithms; best arm identification; Complexity theory; Computer networks; Costs; Multi-armed bandit problem; Multi-armed bandits; Prediction algorithms; Robustness; sequential decision-making; Signal processing algorithms; strongly-adaptive adversaries

Description
Summary:	In this work, we study the best arm identification problem in the adversarial multi-armed bandits framework. We define a strongly-adaptive adversarial model in this framework, based on strongly-adaptive adversaries in security and distributed systems. On the negative side, we show the increased strength of the adversarial model by proving that it is impossible for any best-arm identification algorithm to return an arm with rank $\leq \left\lfloor \frac{\epsilon K}{1 + \epsilon_0} \right\rfloor$, where $K$ is the number of arms, $\epsilon$ is the adversary's budget, and $\epsilon_0$ is the breaking point of the robust mean estimation subroutine. On the positive side, we construct a novel sequential elimination algorithm which returns a near-optimal arm (with rank $\leq \left\lceil (1 + \lambda) \left\lfloor \frac{\epsilon K}{\epsilon_0} \right\rfloor \right\rceil$, where $\lambda > 0$ is a function of $\epsilon$ and $\epsilon_0$ and tends to 0 for small $\epsilon$) with high probability. We evaluate our algorithm on both synthetic and real-world datasets and empirically demonstrate that our algorithm returns a near-optimal arm under a strongly-adaptive adversarial model.
ISSN:	1053-587X, 1941-0476
DOI:	10.1109/TSP.2023.3330009
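
Worked example (not part of the catalog record): to make the two rank expressions in the summary concrete, the sketch below simply evaluates them for given inputs. The function names and all numeric values are hypothetical and chosen only for illustration. The summary does not give the form of $\lambda$, so it is passed in as a plain number rather than derived from $\epsilon$ and $\epsilon_0$. Exact rational arithmetic is used so that the floor and ceiling are not affected by floating-point rounding.

```python
from fractions import Fraction
from math import ceil, floor


def impossibility_rank(K, eps, eps0):
    """Evaluate floor(eps * K / (1 + eps0)), the rank threshold
    from the negative result in the summary."""
    eps, eps0 = Fraction(eps), Fraction(eps0)
    return floor(eps * K / (1 + eps0))


def achievable_rank(K, eps, eps0, lam):
    """Evaluate ceil((1 + lam) * floor(eps * K / eps0)), the rank
    bound from the positive result in the summary.

    lam is supplied by the caller (assumption): the summary only says
    it is a positive function of eps and eps0 that tends to 0.
    """
    eps, eps0, lam = Fraction(eps), Fraction(eps0), Fraction(lam)
    return ceil((1 + lam) * floor(eps * K / eps0))


# Hypothetical inputs: 100 arms, budget 0.05, breaking point 0.25, lam 0.1.
# Decimal strings keep the Fraction conversion exact.
print(impossibility_rank(100, "0.05", "0.25"))      # 4
print(achievable_rank(100, "0.05", "0.25", "0.1"))  # 22
```

With these illustrative inputs the first expression evaluates to 4 and the second to 22, which shows the gap between the impossibility threshold and the rank guaranteed by the elimination algorithm.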