DISTRIBUTED ARTIFICIAL INTELLIGENCE RUNTIME AT THE NETWORK EDGE AS A SERVICE
Main Authors:
Format: Patent
Language: English; French; German
Subjects:
Online Access: Order full text
Summary: This disclosure describes techniques and mechanisms for enabling a user and third-party applications to dynamically partition and place heavy deep learning workloads on standard edge networks to optimize the overall inference throughput of the network while meeting Service Level Objective(s) (SLOs). The techniques may include profiling, partitioning, and splitting of the deep learning workloads, which may be hidden from the user and/or third-party application. The user may interact with a pre-deployed service through a simple SDK that resembles those used for hardware acceleration, such that the current techniques may be easily inserted into their code.
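As a rough illustration of the hardware-acceleration-style SDK interaction described in the summary, the following Python sketch shows what a client-side call to such a pre-deployed edge inference service might look like. The module name `edge_dai`, the `connect` and `infer` calls, the endpoint, and the SLO parameters are illustrative assumptions, not the API of the disclosed service.

```python
# Hypothetical sketch of the "simple SDK" interaction described in the summary.
# All names (edge_dai, connect, infer, the SLO fields) are assumptions for
# illustration only; they are not taken from the disclosure.
import numpy as np
import edge_dai  # hypothetical client library for the edge AI runtime service

# Connect to the pre-deployed service, declaring the model and the SLOs.
# Profiling, partitioning, and placement happen inside the service and stay
# hidden from the caller, as the summary describes.
session = edge_dai.connect(
    endpoint="https://edge-gateway.example.net",      # hypothetical gateway
    model="resnet50.onnx",                             # heavy DL workload to offload
    slo={"latency_ms": 50, "throughput_fps": 30},      # Service Level Objectives
)

# From the application's perspective this resembles a hardware-acceleration SDK:
# a single inference call, with the distributed execution handled by the runtime.
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)
prediction = session.infer(frame)
print(prediction.shape)
```

Under these assumptions, swapping the service in for a local accelerator amounts to replacing the device-specific setup with the `connect` call, which is what lets the technique be inserted into existing code with minimal changes.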