Smart scheduler: an adaptive NVM-aware thread scheduling approach on NUMA systems
Published in: | CCF transactions on high performance computing (Online) 2022-12, Vol.4 (4), p.394-406 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | NVM provides large memory capacity, long-term data durability, and high memory bandwidth for multi-threaded applications on cloud servers. Cloud servers today often employ a NUMA architecture, where the thread scheduling mechanism plays a vital role in overall system performance because of the NUMA property. However, as server resources grow more diverse, e.g., hybrid memory systems using DRAM and NVM on NUMA nodes, the exploration space for thread scheduling expands rapidly. Existing thread schedulers, including rule-based algorithms and scheduling domain methods, cannot provide ideal scheduling solutions in such complicated cases. Moreover, these schedulers neglect customized heterogeneous memory structures, which degrades overall system performance. Reinforcement learning, by contrast, can choose the action with the maximum reward value in a specific environment, leading the scheduler towards an optimal solution. In this paper, we propose a thread scheduling approach, Smart Scheduler, that leverages a reinforcement learning method. Smart Scheduler takes OS event information as input, extends LinUCB to explore the scheduling space, and guides thread-level scheduling. We evaluate Smart Scheduler on an off-the-shelf server equipped with NVM. The experimental results show that Smart Scheduler converges faster (usually within 20 actions) than rule-based algorithms and scheduling domain methods and reduces program execution time by up to 59.9%. It also outperforms rule-based algorithms and scheduling domain methods by 4.1% and 19.1%, respectively, in quality-of-service latency. |
---|---|
ISSN: | 2524-4922 2524-4930 |
DOI: | 10.1007/s42514-022-00110-2 |
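
The abstract states that Smart Scheduler extends LinUCB to explore the scheduling space. As a rough illustration of the underlying idea only, the following is a minimal sketch of plain disjoint LinUCB choosing a NUMA node for a thread. It is not the authors' extended algorithm. The two-node setup, the feature vector `[1.0, memory_intensity]`, and the `simulated_reward` function are hypothetical assumptions made for this example; a real scheduler would derive features from OS event information and rewards from measured performance.

```python
import math
import random


class LinUCBArm:
    """Per-arm ridge-regression state for disjoint LinUCB (one arm per node)."""

    def __init__(self, dim):
        # Store A^-1 directly; A starts as the identity matrix.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.a_inv]

    def score(self, x, alpha):
        """Upper confidence bound: theta.x + alpha * sqrt(x^T A^-1 x)."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + alpha * width

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A^-1 for A <- A + x x^T."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def choose_node(arms, x, alpha=1.0):
    """Return the index of the arm (node) with the highest UCB score."""
    scores = [arm.score(x, alpha) for arm in arms]
    return max(range(len(arms)), key=scores.__getitem__)


def simulated_reward(node, memory_intensity, rng):
    """HYPOTHETICAL reward model: node 0 suits memory-heavy threads,
    node 1 suits light ones. Real rewards would come from measurements."""
    if node == 0:
        base = 0.2 + 0.6 * memory_intensity
    else:
        base = 0.7 - 0.5 * memory_intensity
    return base + rng.gauss(0.0, 0.02)


def train(rounds=1000, seed=0, alpha=1.0):
    """Run the bandit loop on synthetic thread contexts."""
    rng = random.Random(seed)
    arms = [LinUCBArm(dim=2), LinUCBArm(dim=2)]
    for _ in range(rounds):
        x = [1.0, rng.random()]  # [bias term, memory intensity in [0, 1)]
        node = choose_node(arms, x, alpha)
        arms[node].update(x, simulated_reward(node, x[1], rng))
    return arms


if __name__ == "__main__":
    trained = train()
    for intensity in (0.1, 0.9):
        picked = choose_node(trained, [1.0, intensity], alpha=0.0)
        print(f"memory intensity {intensity}: node {picked}")
```

Setting `alpha=0.0` at query time disables the exploration bonus, so the final loop reports the learned greedy choice. Keeping the inverse matrix up to date with a rank-one update avoids a full matrix inversion per decision, which matters if decisions are made at scheduling frequency.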