OPTIMAL MULTI-INSTANCE GPU (MIG) AWARE PLACEMENT OF CLIENTS
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | In one set of embodiments, a computer system can receive a plurality of requests for placing a plurality of clients on a plurality of graphics processing units (GPUs), where each request includes a profile specifying a number of GPU compute slices and a number of GPU memory slices requested by a corresponding client. The computer system can further formulate an integer linear programming (ILP) problem based on the requests and a maximum number of GPU compute and memory slices supported by each GPU. The computer system can then generate a solution for the ILP problem and place the plurality of clients on the plurality of GPUs in accordance with the solution. |
---|
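
The placement problem in the summary can be illustrated with a small sketch. It is not the patent's method. It assumes an objective the summary does not state (use as few GPUs as possible), and it replaces the ILP solver with exhaustive enumeration over the same binary choice of which client goes on which GPU, which is only practical for tiny inputs. The function name `place_clients`, its parameters, and the example capacities of 7 compute slices and 8 memory slices per GPU are all illustrative assumptions. Real MIG hardware also restricts where a profile may sit on a GPU, which this sketch ignores.

```python
from itertools import product


def place_clients(requests, num_gpus, max_compute, max_memory):
    """Return a list mapping client index -> GPU index, or None if infeasible.

    requests: list of (compute_slices, memory_slices), one tuple per client.
    Assumed objective (not from the source): minimize the number of GPUs used.
    """
    best = None
    best_used = num_gpus + 1
    # Each tuple assigns every client to exactly one GPU, which mirrors the
    # "each client is placed once" constraint of an assignment-style ILP.
    for assignment in product(range(num_gpus), repeat=len(requests)):
        compute = [0] * num_gpus
        memory = [0] * num_gpus
        for (c, m), gpu in zip(requests, assignment):
            compute[gpu] += c
            memory[gpu] += m
        # Capacity constraints: per-GPU limits on compute and memory slices.
        if any(c > max_compute for c in compute):
            continue
        if any(m > max_memory for m in memory):
            continue
        used = len(set(assignment))
        if used < best_used:
            best, best_used = list(assignment), used
    return best


# Four hypothetical clients, two GPUs with 7 compute and 8 memory slices each.
placement = place_clients([(4, 4), (3, 4), (2, 2), (1, 1)], 2, 7, 8)
# placement == [0, 0, 1, 1]: clients 0 and 1 fill GPU 0, clients 2 and 3 share GPU 1.
```

For realistic request counts the enumeration grows exponentially, which is why the summary's approach of handing the constraints to an ILP solver is the more plausible route in practice.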