Approximate Markov Perfect Equilibrium of Joint Offloading Policy for Multi-IV Using Reward-Shared Distributed Method

Bibliographic Details
Published in: IEEE Transactions on Intelligent Vehicles, 2024-02, Vol. 9 (2), pp. 3658-3671
Authors: Li, Chao; Liu, Fagui; Wang, Bin; Tang, Xuhao; Liu, Jie; Chen, C. L. Philip
Format: Article
Language: English
Abstract: In this article, we investigate the problem of optimizing the joint offloading policy in a distributed manner for multiple intelligent vehicles (IVs). During their journeys in vehicular edge computing (VEC) networks, IVs continually optimize their joint offloading policy to minimize the long-term accumulated costs generated by executing computational tasks. The stochastic and repetitive interactions among IVs are modeled as a Markov game. In this way, optimizing the joint offloading policy is transformed into approximating a Markov perfect equilibrium of a general-sum Markov game. Moreover, we argue that training in practical VEC networks using the classical centralized training and decentralized execution (CTDE) framework raises challenges of privacy and computational complexity. Motivated by these, we propose a reward-shared distributed policy optimization (RSDPO) method for the considered VEC networks to optimize the joint offloading policy. The experimental results demonstrate that the set of joint offloading policies obtained by RSDPO approximates a Markov perfect equilibrium, and that RSDPO offers significant advantages in converged latency and energy consumption compared with other methods.
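The record contains no code, but the reward-sharing idea in the abstract can be illustrated with a minimal toy sketch. This is not the paper's RSDPO algorithm: it is an assumed simplification in which each vehicle keeps an independent softmax policy over offloading actions {local, edge} and performs local REINFORCE updates, with only a scalar team reward shared among vehicles (no states, actions, or parameters are exchanged, in contrast to CTDE). The cost model, vehicle count, and hyperparameters are all hypothetical.

    # Toy sketch of reward-shared distributed policy optimization (assumed,
    # not the paper's RSDPO): independent per-vehicle policies, local updates,
    # only a scalar shared reward crosses between agents.
    import numpy as np

    rng = np.random.default_rng(0)
    N_VEHICLES, N_ACTIONS, EPISODES, LR = 3, 2, 2000, 0.05  # hypothetical sizes

    theta = np.zeros((N_VEHICLES, N_ACTIONS))  # per-vehicle policy logits

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    def step_cost(actions):
        """Hypothetical cost model: offloading (action 1) is cheap until the
        shared edge server congests; local execution (action 0) has a fixed
        latency/energy cost."""
        n_offload = sum(actions)
        return [1.0 if a == 0 else 0.3 + 0.4 * n_offload for a in actions]

    for _ in range(EPISODES):
        probs = [softmax(theta[i]) for i in range(N_VEHICLES)]
        actions = [rng.choice(N_ACTIONS, p=p) for p in probs]
        costs = step_cost(actions)
        shared_reward = -np.mean(costs)  # the only quantity shared among vehicles
        for i in range(N_VEHICLES):
            grad = -probs[i]
            grad[actions[i]] += 1.0                 # grad of log pi_i(a_i)
            theta[i] += LR * shared_reward * grad   # independent local update

    print("per-vehicle offloading probabilities:",
          [np.round(softmax(theta[i]), 2) for i in range(N_VEHICLES)])

Because every vehicle maximizes the same shared reward with purely local updates, the joint policy tends toward a configuration where no vehicle can unilaterally improve the team cost, loosely mirroring the equilibrium-seeking behavior the abstract describes.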
ISSN: 2379-8858, 2379-8904
DOI: 10.1109/TIV.2024.3352422