Push-sum Distributed Dual Averaging Online Convex Optimization With Bandit Feedback
Published in: | International Journal of Control, Automation, and Systems, 2024-05, Vol.22 (5), p.1461-1471 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | This paper investigates distributed online convex optimization in multi-agent systems where no node can directly access the gradient of its own cost function. The communication topology is formed by strongly connected time-varying directed graphs with column-stochastic weight matrices, and each node updates its decision by exchanging information with neighbouring nodes. Because the cost functions change over time in the online setting, objective function values cannot be sampled at several points simultaneously. To solve this problem over directed graphs, a push-sum one-point bandit distributed dual averaging (PS-OBDDA) algorithm is proposed, in which a one-point gradient estimator approximates the true gradient and guides the update of the decision variables. By selecting an appropriate exploration parameter $\delta$ and step sizes $\alpha(t)$, the algorithm is shown to achieve a sublinear regret bound of order $O(T^{5/6})$. The effect of the one-point estimation parameters on the regret in online settings is also explored. Finally, the performance of the algorithm is evaluated through simulation. |
---|---|
ISSN: | 1598-6446 2005-4092 |
DOI: | 10.1007/s12555-023-0211-3 |
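
The two ingredients named in the abstract, a one-point gradient estimator and a push-sum dual averaging step over a column-stochastic weight matrix, can be sketched as follows. This record does not give the paper's update equations, so the code is only an illustration built from the standard textbook forms: the estimator is the usual $(d/\delta)\, f(x + \delta u)\, u$ with $u$ uniform on the unit sphere, and the round is a generic push-sum dual averaging update with a quadratic proximal function. The function names (`one_point_gradient`, `push_sum_round`, `project_ball`), the Euclidean-ball constraint set, and the argument layout are all assumptions made for the example, not the authors' notation.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def sample_unit_sphere(dim: int, rng: random.Random) -> Vector:
    """Draw a direction uniformly from the unit sphere (normalised Gaussian)."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def one_point_gradient(f: Callable[[Sequence[float]], float],
                       x: Sequence[float], delta: float,
                       rng: random.Random) -> Vector:
    """Estimate a gradient from ONE function value: (d/delta) * f(x + delta*u) * u."""
    dim = len(x)
    u = sample_unit_sphere(dim, rng)
    value = f([xi + delta * ui for xi, ui in zip(x, u)])  # the only query
    return [dim / delta * value * ui for ui in u]


def project_ball(y: Sequence[float], radius: float) -> Vector:
    """Euclidean projection onto the ball of the given radius (assumed constraint set)."""
    norm = math.sqrt(sum(v * v for v in y))
    if norm <= radius:
        return list(y)
    return [radius * v / norm for v in y]


def push_sum_round(A: Sequence[Sequence[float]], z: Sequence[Vector],
                   w: Sequence[float], grads: Sequence[Vector],
                   alpha: float, radius: float
                   ) -> Tuple[List[Vector], List[float], List[Vector]]:
    """One generic push-sum dual averaging round (illustrative, not the paper's exact update).

    A[i][j] is the weight node i applies to what node j sends; columns sum to 1.
    """
    n, dim = len(z), len(z[0])
    # Mix dual variables with neighbours, then add the local gradient estimate.
    z_new = [[sum(A[i][j] * z[j][k] for j in range(n)) + grads[i][k]
              for k in range(dim)] for i in range(n)]
    # Push-sum weights correct the bias caused by A not being row-stochastic.
    w_new = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    # With psi(x) = ||x||^2 / 2 the dual averaging step is a ball projection.
    x_new = [project_ball([-alpha * z_new[i][k] / w_new[i] for k in range(dim)],
                          radius) for i in range(n)]
    return z_new, w_new, x_new
```

Because the weight matrix is column stochastic, each round preserves the network-wide sum of the dual variables (plus the newly added estimates) and the sum of the push-sum weights, which is what lets the ratio `z / w` track the network average on a directed graph. In a bandit setting the point actually played would be `x + delta * u`, so implementations commonly keep `x` inside a slightly shrunk copy of the constraint set; that detail is omitted here.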