SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING


Bibliographic details
Main authors: BERRY, Geoffrey Carlton, PURANIK, Hemanth, WU, Fan, KILARI, Vijaya Kumar
Format: Patent
Language: English; French; German
Description
Summary: A method for accelerating machine learning on a computing device is described. The method includes accessing a neural network. The method also includes splitting the neural network into N sub-neural networks. The method further includes hosting the N sub-neural networks in M inference accelerators. The method also includes scheduling the N sub-neural networks in the M inference accelerators. The method further includes executing the N sub-neural networks in the M inference accelerators.
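
The split, host, schedule, and execute steps named in the summary can be illustrated with a minimal sketch. The record does not say how the network is partitioned or how sub-networks are assigned to accelerators, so everything below is an assumption made for illustration: the network is modeled as a flat list of layer functions, the split is into contiguous and near-equal groups, and scheduling is a simple round-robin. The names `split_network`, `schedule`, and `execute` are hypothetical and do not come from the patent.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a "layer" is any function from a float to a float.
Layer = Callable[[float], float]


def split_network(layers: Sequence[Layer], n: int) -> List[List[Layer]]:
    """Split the layers into n contiguous, near-equal sub-networks."""
    if not 1 <= n <= len(layers):
        raise ValueError("n must be between 1 and the number of layers")
    base, extra = divmod(len(layers), n)
    subnets: List[List[Layer]] = []
    start = 0
    for i in range(n):
        # The first `extra` sub-networks each take one additional layer.
        size = base + (1 if i < extra else 0)
        subnets.append(list(layers[start:start + size]))
        start += size
    return subnets


def schedule(n: int, m: int) -> List[int]:
    """Assumed round-robin policy: sub-network i runs on accelerator i % m."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return [i % m for i in range(n)]


def execute(subnets: List[List[Layer]], placement: List[int],
            x: float) -> Tuple[float, List[int]]:
    """Run the sub-networks in order and record which accelerator ran each."""
    trace: List[int] = []
    for subnet, accelerator in zip(subnets, placement):
        for layer in subnet:
            x = layer(x)
        trace.append(accelerator)
    return x, trace


# A toy five-layer "network" split into N = 3 parts on M = 2 accelerators.
layers: List[Layer] = [
    lambda v: v + 1,
    lambda v: v * 2,
    lambda v: v - 3,
    lambda v: v * v,
    lambda v: v + 10,
]
subnets = split_network(layers, 3)
placement = schedule(len(subnets), 2)
result, trace = execute(subnets, placement, 1.0)
print(result, placement)
```

In this sketch the accelerators are only integer labels; a real system would dispatch each sub-network to hardware and could overlap execution across accelerators, which the summary does not detail.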