SPLIT NEURAL NETWORK ACCELERATION ARCHITECTURE SCHEDULING AND DYNAMIC INFERENCE ROUTING

Bibliographic Details
Main authors:	BERRY, Geoffrey Carlton; PURANIK, Hemanth; WU, Fan; KILARI, Vijaya Kumar
Format:	Patent
Language:	English; French; German
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Abstract:	A method for accelerating machine learning on a computing device is described. The method includes accessing a neural network. The method also includes splitting the neural network into N sub-neural networks. The method further includes hosting the N sub-neural networks in M inference accelerators. The method also includes scheduling the N sub-neural networks in the M inference accelerators. The method further includes executing the N sub-neural networks in the M inference accelerators.
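
The five steps named in the abstract (access, split, host, schedule, execute) can be illustrated with a toy sketch. This is not the patented method: the abstract does not say how the network is split, how sub-networks are mapped to accelerators, or how they are scheduled. The sketch assumes a purely sequential network, contiguous near-equal splitting, round-robin placement, and in-order scheduling; the names `split_network`, `host_subnets`, `schedule`, and `execute` are hypothetical, and the "accelerators" are just integer IDs.

```python
from typing import Callable, Dict, List, Tuple

# A "layer" is modelled as a plain function on a float (assumption for illustration).
Layer = Callable[[float], float]


def split_network(layers: List[Layer], n: int) -> List[List[Layer]]:
    """Split a sequential network into n contiguous, near-equal sub-networks."""
    if not 1 <= n <= len(layers):
        raise ValueError("n must be between 1 and the number of layers")
    base, extra = divmod(len(layers), n)
    subnets, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more layer
        subnets.append(layers[start:start + size])
        start += size
    return subnets


def host_subnets(n: int, m: int) -> Dict[int, int]:
    """Assumed placement policy: sub-network i is hosted on accelerator i % m."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return {i: i % m for i in range(n)}


def schedule(placement: Dict[int, int]) -> List[Tuple[int, int]]:
    """Assumed schedule: run sub-networks in data-dependency (index) order."""
    return sorted(placement.items())


def execute(subnets: List[List[Layer]], plan: List[Tuple[int, int]],
            x: float) -> Tuple[float, List[int]]:
    """Run each scheduled sub-network in turn; record which accelerator ran it."""
    trace = []
    for subnet_id, accelerator_id in plan:
        for layer in subnets[subnet_id]:
            x = layer(x)
        trace.append(accelerator_id)
    return x, trace


# "Access" a toy five-layer network.
layers: List[Layer] = [
    lambda v: v + 1,
    lambda v: v * 2,
    lambda v: v - 3,
    lambda v: v * v,
    lambda v: v + 10,
]

subnets = split_network(layers, 3)          # N = 3 sub-networks
placement = host_subnets(len(subnets), 2)   # M = 2 accelerators
plan = schedule(placement)
result, trace = execute(subnets, plan, 1.0)
print(result, trace)
```

With N = 3 and M = 2, two sub-networks share accelerator 0, which is the situation where a scheduling step becomes necessary. A real system would likely weigh compute cost, memory footprint, and transfer overhead when choosing the split points and placement, none of which this sketch models.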