Online Frequency Scheduling by Learning Parallel Actions
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Radio Resource Management is a challenging topic in future 6G networks, where
novel applications create strong competition among the users for the available
resources. In this work we consider the frequency scheduling problem in a
multi-user MIMO system. Frequency resources need to be assigned to a set of
users while allowing for concurrent transmissions in the same sub-band.
Traditional methods are insufficient to cope with all the involved constraints
and uncertainties, whereas reinforcement learning can directly learn
near-optimal solutions for such complex environments. However, the scheduling
problem has an enormous action space accounting for all the combinations of
users and sub-bands, so out-of-the-box algorithms cannot be used directly. In
this work, we propose a scheduler based on action-branching over sub-bands,
which is a deep Q-learning architecture with parallel decision capabilities.
The sub-bands learn correlated but local decision policies, and together they
optimize a global reward. To improve the scaling of the architecture with the
number of sub-bands, we propose variations (Unibranch, Graph Neural
Network-based) that reduce the number of parameters to learn. The parallel
decision making of the proposed architecture makes it possible to meet short
inference time requirements in real systems. Furthermore, the deep Q-learning approach
permits online fine-tuning after deployment to bridge the sim-to-real gap. The
proposed architectures are evaluated against relevant baselines from the
literature and show competitive performance as well as the ability to adapt
online to evolving environments. |
---|---|
DOI: | 10.48550/arxiv.2406.05041 |
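
The action-branching idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `BranchingQSketch`, the layer sizes, the random weights standing in for trained parameters, and the reading of `n_choices` as the number of candidate options per sub-band (for example, candidate user groups) are all assumptions made for illustration. The sketch only shows why branching helps: a shared trunk feeds one Q-value head per sub-band, each head picks its own option, and the network emits `n_subbands * n_choices` values instead of one value per joint combination.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix (n_out rows, n_in columns); a stand-in for trained weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Matrix-vector product without bias."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


class BranchingQSketch:
    """Shared trunk plus one Q-value head per sub-band (hypothetical layout)."""

    def __init__(self, state_dim, hidden_dim, n_subbands, n_choices, seed=0):
        rng = random.Random(seed)
        self.trunk = make_linear(state_dim, hidden_dim, rng)
        # One independent head per sub-band; each scores n_choices options.
        self.heads = [make_linear(hidden_dim, n_choices, rng)
                      for _ in range(n_subbands)]

    def q_values(self, state):
        """Return a list of per-sub-band Q-value vectors."""
        hidden = relu(linear(self.trunk, state))
        return [linear(head, hidden) for head in self.heads]

    def greedy_action(self, state):
        """Pick the best option in every sub-band independently (one argmax per branch)."""
        return [max(range(len(q)), key=q.__getitem__)
                for q in self.q_values(state)]


def joint_action_count(n_subbands, n_choices):
    """Size of the flat action space a non-branching Q-network would have to score."""
    return n_choices ** n_subbands


def branched_output_count(n_subbands, n_choices):
    """Number of Q-values the branching architecture outputs."""
    return n_subbands * n_choices


net = BranchingQSketch(state_dim=6, hidden_dim=8, n_subbands=4, n_choices=5)
action = net.greedy_action([0.1] * 6)
print(action)                        # one chosen option index per sub-band
print(joint_action_count(4, 5))      # 625 joint combinations
print(branched_output_count(4, 5))   # 20 branch outputs
```

Because each head's argmax is independent of the others, the per-sub-band decisions could be computed in parallel, which is the property the summary links to short inference times. How the branches are trained toward a global reward is not covered by this sketch.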