KERNEL-LEVEL LOAD BALANCING ACROSS NEURAL ENGINES
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | An electronic device may receive, at a first system routine from a client application, a provisioning request indicating that the application includes code for evaluating a machine learning model, wherein the first system routine executes in user space of memory on the device. The device may provision the code for execution on one or more circuit engines. The device may receive, at a second system routine from the application, an inference request that contains input data for evaluating the machine learning model, wherein the second system routine executes in kernel space of memory on the device. The device may receive, at the second system routine, information about the circuit engines. The device may assign the inference request to one or more of the circuit engines, where the request is evaluated. The device may provide a result of the inference request to the application. |
---|
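The flow described in the summary (provisioning through one routine, then per-request assignment through a second routine that sees engine information) can be sketched as a small user-level simulation. This is an illustrative sketch only, not the patented implementation: the names `Engine`, `Device`, `provision`, `assign`, `infer`, and `queue_depth` are hypothetical, the "engine information" is assumed to be a pending-request count, and the least-loaded selection policy is an assumption, since the summary does not say how an engine is chosen.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

ModelCode = Callable[[List[float]], float]


@dataclass
class Engine:
    """Hypothetical stand-in for one circuit engine."""
    engine_id: int
    queue_depth: int = 0  # assumed form of "information about the circuit engines"
    programs: Dict[str, ModelCode] = field(default_factory=dict)


class Device:
    def __init__(self, engine_count: int) -> None:
        self.engines = [Engine(i) for i in range(engine_count)]

    def provision(self, model_id: str, code: ModelCode) -> None:
        """First routine (user space in the summary): load the model code."""
        for engine in self.engines:
            engine.programs[model_id] = code

    def assign(self, model_id: str) -> Engine:
        """Pick the least-loaded provisioned engine (assumed policy)."""
        candidates = [e for e in self.engines if model_id in e.programs]
        if not candidates:
            raise KeyError(f"model {model_id!r} has not been provisioned")
        target = min(candidates, key=lambda e: e.queue_depth)
        target.queue_depth += 1  # the request is now pending on this engine
        return target

    def infer(self, model_id: str, input_data: List[float]) -> Tuple[int, float]:
        """Second routine (kernel space in the summary): assign, evaluate, return."""
        target = self.assign(model_id)
        try:
            return target.engine_id, target.programs[model_id](input_data)
        finally:
            target.queue_depth -= 1  # the request has completed


device = Device(engine_count=2)
device.provision("sum", sum)   # the built-in sum stands in for model code
busy = device.assign("sum")    # leave one request pending on an engine
engine_id, result = device.infer("sum", [1.0, 2.0])
print(busy.engine_id, engine_id, result)
```

Because one request is left pending before `infer` is called, the second request goes to the other engine. A real kernel-space dispatcher would likely weigh more than queue depth (for example power state or priority), which the summary leaves open.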