OPTIMAL MULTI-INSTANCE GPU (MIG) AWARE PLACEMENT OF CLIENTS

Bibliographic details
Main authors: Kurkure, Uday Pundalik; Sivaraman, Hari; Vu, Lan
Format: Patent
Language: English
Description
Abstract: In one set of embodiments, a computer system can receive a plurality of requests for placing a plurality of clients on a plurality of graphics processing units (GPUs), where each request includes a profile specifying a number of GPU compute slices and a number of GPU memory slices requested by a corresponding client. The computer system can further formulate an integer linear programming (ILP) problem based on the requests and a maximum number of GPU compute and memory slices supported by each GPU. The computer system can then generate a solution for the ILP problem and place the plurality of clients on the plurality of GPUs in accordance with the solution.
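
The placement problem in the abstract can be illustrated with a small sketch. This is not the patented method: the abstract does not state the objective or the exact constraint set, so the code below assumes a hypothetical objective (use as few GPUs as possible) and only the two capacity constraints the abstract names (compute slices and memory slices per GPU). It also swaps the ILP solver for a plain exhaustive search, which is practical only for tiny inputs but enforces the same constraints an ILP with binary variables "client i is placed on GPU j" would. The function name `place_clients` and the values 7 and 8 in the usage example are illustrative assumptions, and position-level MIG placement rules are ignored.

```python
from itertools import product


def place_clients(requests, num_gpus, max_compute, max_memory):
    """Return (gpus_used, assignment) or None if no feasible placement exists.

    requests    -- list of (compute_slices, memory_slices), one per client
    assignment  -- tuple where assignment[i] is the GPU index of client i

    Each client lands on exactly one GPU by construction. A placement is
    feasible when, on every GPU, the summed compute slices and the summed
    memory slices stay within the per-GPU maximum. The objective (fewest
    GPUs used) is an assumption made for this illustration.
    """
    best = None
    # Enumerate every mapping of clients to GPUs (num_gpus ** len(requests)).
    for assignment in product(range(num_gpus), repeat=len(requests)):
        compute = [0] * num_gpus
        memory = [0] * num_gpus
        for (c, m), gpu in zip(requests, assignment):
            compute[gpu] += c
            memory[gpu] += m
        # Capacity constraints: reject if any GPU is oversubscribed.
        if any(c > max_compute for c in compute):
            continue
        if any(m > max_memory for m in memory):
            continue
        used = len(set(assignment))
        # Keep the first placement found with the fewest GPUs in use.
        if best is None or used < best[0]:
            best = (used, assignment)
    return best


if __name__ == "__main__":
    # Four hypothetical client profiles: (compute slices, memory slices).
    profiles = [(4, 4), (3, 4), (2, 2), (1, 1)]
    print(place_clients(profiles, num_gpus=2, max_compute=7, max_memory=8))
```

For realistic numbers of clients and GPUs, the exhaustive loop would be replaced by an ILP solver, which is the approach the abstract describes.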