A sub-action aided deep reinforcement learning framework for latency-sensitive network slicing
Saved in:
Published in: Computer Networks (Amsterdam, Netherlands: 1999), 2022-11, Vol. 217, p. 109279, Article 109279
Main authors:
Format: Article
Language: English
Online access: Full text
Abstract: Network slicing is a core technique of fifth-generation (5G) systems and beyond. To maximize the number of accepted network slices with limited hardware resources, service providers must avoid over-provisioning of quality of service (QoS), which would otherwise prevent them from lowering capital expenditures (CAPEX) and operating expenses (OPEX) for 5G infrastructure. In this paper, we propose a sub-action aided double deep Q-network (SADDQN)-based network slicing algorithm for latency-aware services. Specifically, we model network slicing as a Markov decision process (MDP) in which virtual network function (VNF) placements are the actions, and we define a reward function based on cost and service priority. Furthermore, we adopt the Dijkstra algorithm to determine the forwarding graph (FG) embedding for a given VNF placement, and we design a resource allocation algorithm, binary search assisted gradient descent (BSAGD), to allocate resources to VNFs given the VNF-FG placement. For every service request, we first use the DDQN to choose an MDP action that determines the VNF placement (main action). Next, we employ the Dijkstra algorithm (first-phase sub-action) to find the shortest path for each pair of adjacent VNFs in the given VNF chain. Finally, we apply BSAGD (second-phase sub-action) to realize the service at minimum cost. The joint action yields an MDP reward that is used to train the DDQN. Numerical evaluations show that, compared with state-of-the-art algorithms, the proposed algorithm improves cost-efficiency while giving priority to higher-priority services and maximizing the acceptance ratio.
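To make the main-action/sub-action split concrete, the sketch below walks one service request through the pipeline the abstract describes: a stand-in Q-function chooses the VNF placement (main action), Dijkstra finds the minimum-delay path between each pair of adjacent VNFs (first-phase sub-action), and a binary search over the total resource budget, assisted by gradient-style descent on its per-VNF split, plays the role of BSAGD (second-phase sub-action). The toy topology, the latency model load_i/c_i, and every name in the code are illustrative assumptions, not the authors' implementation.

```python
# Minimal, self-contained sketch of the per-request pipeline (assumptions:
# per-VNF processing latency load_i / c_i, cost proportional to total budget).
import heapq
import random
from itertools import product

# Toy substrate network: node -> {neighbour: link delay}.
GRAPH = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"a": 1.0, "c": 1.5, "d": 3.0},
    "c": {"a": 4.0, "b": 1.5, "d": 1.0},
    "d": {"b": 3.0, "c": 1.0},
}

def dijkstra(graph, src, dst):
    """First-phase sub-action: min-delay path between two substrate nodes."""
    dist, prev, heap, seen = {src: 0.0}, {}, [(0.0, src)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, float("inf")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

def min_latency_alloc(loads, budget, steps=200, rate=0.2):
    """Split a fixed CPU budget over a chain's VNFs to minimise processing
    latency sum(load_i / c_i), via multiplicative gradient steps projected
    back onto the budget (a stand-in for BSAGD's inner descent)."""
    c = [budget / len(loads)] * len(loads)
    for _ in range(steps):
        g = [l / (ci * ci) for l, ci in zip(loads, c)]  # -dLatency/dc_i
        mean_g = sum(g) / len(g)
        # Step toward equalising gradients across VNFs (KKT condition on
        # the budget simplex), then rescale so the total stays = budget.
        c = [ci * (gi / mean_g) ** rate for ci, gi in zip(c, g)]
        s = sum(c)
        c = [ci * budget / s for ci in c]
    return c, sum(l / ci for l, ci in zip(loads, c))

def bsagd(loads, path_delay, deadline, lo=1e-3, hi=1e3, iters=40):
    """Second-phase sub-action: binary search on the total budget, assisted
    by the inner descent, for the cheapest allocation meeting the deadline."""
    best = None
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        alloc, proc = min_latency_alloc(loads, mid)
        if path_delay + proc <= deadline:
            best, hi = (mid, alloc), mid  # feasible: try a smaller budget
        else:
            lo = mid                      # infeasible: need more resources
    return best  # None means the request cannot meet its deadline

def q_value(state, placement):
    """Stand-in for the trained DDQN's Q(state, placement) estimate."""
    return random.Random(str((state, placement))).random()

def serve_request(state, loads, deadline, nodes=tuple(GRAPH)):
    """Main action (VNF placement) followed by the two sub-action phases."""
    placement = max(product(nodes, repeat=len(loads)),
                    key=lambda p: q_value(state, p))
    path_delay = sum(dijkstra(GRAPH, u, v)[1]
                     for u, v in zip(placement, placement[1:]) if u != v)
    return placement, path_delay, bsagd(loads, path_delay, deadline)

if __name__ == "__main__":
    placement, delay, result = serve_request("req-1", [2.0, 1.0, 3.0], 8.0)
    print("placement:", placement, "| path delay:", delay)
    print("accepted (budget, per-VNF alloc):" if result else "rejected:", result)
```

One design point the sketch tries to surface: the outer binary search only needs a feasibility oracle, so any inner allocator that returns the best achievable latency for a fixed budget can assist it; the realized cost and deadline outcome then feed back as the MDP reward that trains the placement network.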
ISSN: 1389-1286, 1872-7069
DOI: 10.1016/j.comnet.2022.109279