KubeGPU: efficient sharing and isolation mechanisms for GPU resource management in container cloud
Published in: The Journal of Supercomputing, 2023, Vol. 79(1), pp. 591-625
Format: Article
Language: English
Online access: Full text
Abstract: As an increasing number of containerized applications, such as high-performance computing and deep learning applications, start to rely on GPUs, efficiently supporting GPUs in container clouds becomes essential. While GPU sharing has been extensively studied for virtual machines (VMs), limited work has been done for containers. Existing works deploy containers with a single, fixed GPU virtualization technique, such as GPU pass-through or API forwarding, and lack optimizations for remote GPU virtualization. These limitations lead to low system throughput and degraded container performance, both because container resource requirements and GPU virtualization techniques are dynamic and heterogeneous, and because of communication overhead and resource racing. Therefore, we designed and implemented KubeGPU, which extends Kubernetes to enable GPU sharing with an adaptive sharing strategy. The adaptive sharing strategy lets KubeGPU dynamically choose a GPU virtualization technique for deploying each container, according to the available GPU resources and the container's configuration parameters (such as its GPU resource requirement), in order to achieve good container performance and system throughput. In addition, a network-aware scheduling approach and fine-grained allocation of remote GPU resources are proposed to optimize remote GPU virtualization. Finally, using representative real-world HPC and deep learning workloads, we demonstrate the superiority of KubeGPU over existing works, as well as its effectiveness in minimizing communication overhead and eliminating remote GPU resource racing.
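The abstract's "adaptive sharing strategy" can be illustrated with a minimal sketch: pick a GPU virtualization technique per container from the requested GPU share and the locally available GPU resources. All names, thresholds, and the fallback order below are illustrative assumptions, not KubeGPU's actual API or decision logic.

```python
# Hypothetical sketch of an adaptive GPU-sharing decision, assuming a
# simplified model of local GPUs and fractional GPU requests. Not KubeGPU's
# real implementation.
from dataclasses import dataclass

@dataclass
class GPU:
    free_fraction: float  # unused compute share on this local GPU, 0.0-1.0

def choose_virtualization(requested_fraction: float, local_gpus: list[GPU]) -> str:
    """Return a deployment mode for one container's GPU request."""
    if requested_fraction >= 1.0:
        # A whole-GPU request gets exclusive pass-through if a free GPU exists.
        if any(g.free_fraction >= 1.0 for g in local_gpus):
            return "pass-through"
    else:
        # A fractional request can share a local GPU via API forwarding.
        if any(g.free_fraction >= requested_fraction for g in local_gpus):
            return "local-api-forwarding"
    # Otherwise fall back to a remote GPU; this is where the paper's
    # network-aware scheduling and fine-grained allocation would apply.
    return "remote-api-forwarding"

print(choose_virtualization(0.5, [GPU(0.6)]))  # local-api-forwarding
print(choose_virtualization(1.0, [GPU(0.3)]))  # remote-api-forwarding
```

The point of the sketch is the dispatch itself: the technique is chosen per container at deployment time rather than fixed cluster-wide, which is the gap the abstract identifies in prior work.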
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-022-04682-2