DISTRIBUTED ARTIFICIAL INTELLIGENCE RUNTIME AT THE NETWORK EDGE AS A SERVICE

Bibliographic details
Main authors: RYDER, Benjamin William, FELTIN, Thomas Michel-Ange, BROCKNERS, Frank
Format: Patent
Language: eng; fre; ger
Description
Summary: This disclosure describes techniques and mechanisms for enabling a user and third-party applications to dynamically partition and place heavy deep learning workloads on standard edge networks to optimize the overall inference throughput of the network while meeting Service Level Objective(s) (SLOs). The techniques may include profiling, partitioning, and splitting of the deep learning workloads, which may be hidden from the user and/or third-party application. The user may interact with a pre-deployed service through a simple SDK that resembles those used for hardware acceleration, such that the current techniques may be easily inserted into their code.
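
To illustrate the partitioning idea in the summary, the following is a minimal sketch and not the method claimed in the patent. It assumes a model that is a simple chain of layers with already-profiled per-layer times, identical edge nodes, and a fixed per-hop transfer cost. The names partition_layers, end_to_end_ms, layer_ms, and hop_ms are hypothetical. The sketch splits the chain into contiguous stages so that the slowest stage, which bounds pipeline throughput, is as short as possible, and then checks the result against a latency SLO.

```python
from itertools import accumulate


def partition_layers(layer_ms, num_nodes):
    """Split a chain of layers into contiguous stages (at most one per node),
    minimizing the slowest stage. Assumes identical nodes (hypothetical model)."""
    n = len(layer_ms)
    prefix = [0.0] + list(accumulate(layer_ms))
    inf = float("inf")
    # best[k][i]: smallest bottleneck when the first i layers form exactly k stages
    best = [[inf] * (n + 1) for _ in range(num_nodes + 1)]
    cut = [[0] * (n + 1) for _ in range(num_nodes + 1)]
    best[0][0] = 0.0
    for k in range(1, num_nodes + 1):
        for i in range(1, n + 1):
            for j in range(k - 1, i):  # last stage covers layers j .. i-1
                cand = max(best[k - 1][j], prefix[i] - prefix[j])
                if cand < best[k][i]:
                    best[k][i] = cand
                    cut[k][i] = j
    # Choose the stage count with the smallest bottleneck (ties favor fewer stages).
    k = min(range(1, min(num_nodes, n) + 1), key=lambda s: best[s][n])
    bottleneck = best[k][n]
    stages, i = [], n
    while k > 0:  # walk the recorded cut points back to the first layer
        j = cut[k][i]
        stages.append(list(range(j, i)))
        i, k = j, k - 1
    stages.reverse()
    return stages, bottleneck


def end_to_end_ms(layer_ms, stages, hop_ms=0.0):
    """Latency of one request: all layer times plus one transfer per stage boundary."""
    return sum(layer_ms) + hop_ms * (len(stages) - 1)


if __name__ == "__main__":
    profile = [4.0, 2.0, 6.0, 3.0, 5.0]  # made-up per-layer times in ms
    stages, bottleneck = partition_layers(profile, num_nodes=3)
    latency = end_to_end_ms(profile, stages, hop_ms=1.0)
    slo_ms = 25.0  # illustrative SLO, not taken from the source
    print(stages, bottleneck, latency, latency <= slo_ms)
```

In this simplified model the steady-state throughput is one inference per bottleneck interval, so a shorter slowest stage means higher throughput, while each extra stage adds a network hop to the end-to-end latency that the SLO constrains.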