CaRL: Cascade Reinforcement Learning with State Space Splitting for O-RAN based Traffic Steering
Saved in:
Main authors: , , ,
Format: Article
Language: eng
Subjects:
Online access: Order full text
Abstract: The Open Radio Access Network (O-RAN) architecture empowers intelligent and
automated optimization of the RAN through applications deployed on the RAN
Intelligent Controller (RIC) platform, enabling capabilities beyond what is
achievable with traditional RAN solutions. Within this paradigm, Traffic
Steering (TS) emerges as a pivotal RIC application that focuses on optimizing
cell-level mobility settings in near-real-time, aiming to significantly improve
network spectral efficiency. In this paper, we design a novel TS algorithm
based on a Cascade Reinforcement Learning (CaRL) framework. We propose state
space factorization and policy decomposition to reduce the need for large
models and well-labeled datasets. For each sub-state space, an RL sub-policy
will be trained to learn an optimized mapping onto the action space. To apply
CaRL on new network regions, we propose a knowledge transfer approach to
initialize a new sub-policy based on knowledge learned by the trained policies.
To evaluate CaRL, we build a data-driven and scalable RIC digital twin (DT)
that is modeled using important real-world data, including network
configuration, user geo-distribution, and traffic demand, among others, from a
tier-1 mobile operator in the US. We evaluate CaRL on two DT scenarios
representing two network clusters in two different cities and compare its
performance with the business-as-usual (BAU) policy and other competing
optimization approaches using heuristic and Q-table algorithms. Benchmarking
results show that CaRL performs the best and improves the average
cluster-aggregated downlink throughput over the BAU policy by 24% and 18% in
these two scenarios, respectively.
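The cascade design summarized in the abstract, with a dedicated RL sub-policy per factored sub-state space and knowledge transfer to seed a sub-policy for a new region, can be sketched in miniature. This is an illustrative assumption, not the paper's implementation: the names `SubPolicy`, `CascadePolicy`, and `transfer` are hypothetical, and tabular Q-learning stands in for whatever RL algorithm the authors actually train per sub-space.

```python
import random
from collections import defaultdict

class SubPolicy:
    """Tabular Q-learning agent for one factored sub-state space (illustrative)."""
    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> Q-value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy action selection within this sub-space.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

class CascadePolicy:
    """Routes each state to the sub-policy that owns its sub-state space."""
    def __init__(self, actions, partition):
        self.actions = actions
        self.partition = partition    # maps a full state to a sub-space key
        self.subs = {}                # sub-space key -> SubPolicy

    def _sub(self, state):
        key = self.partition(state)
        if key not in self.subs:
            self.subs[key] = SubPolicy(self.actions)
        return self.subs[key]

    def act(self, state):
        return self._sub(state).act(state)

    def update(self, s, a, r, s_next):
        self._sub(s).update(s, a, r, s_next)

    def transfer(self, new_key, source_key):
        # Knowledge transfer: initialize a sub-policy for a new network
        # region from the Q-table of an already-trained sub-policy.
        src = self.subs[source_key]
        new = SubPolicy(self.actions)
        new.q = defaultdict(float, src.q)
        self.subs[new_key] = new
```

The `partition` function plays the role of the state space splitting: each sub-policy only ever sees, and is updated on, states from its own partition, which is what keeps individual models small.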
DOI: 10.48550/arxiv.2312.01970