SPLIT NETWORK ACCELERATION ARCHITECTURE

A method for accelerating machine learning on a computing device is described. The method includes hosting a neural network in a first inference accelerator and a second inference accelerator. The neural network split between the first inference accelerator and the second inference accelerator. The...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	ATTAR, Rashid Ahmed Akbar, VERRILLI, Colin Beaton, BHAVANSIKAR, Raghavendar
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING PHYSICS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	A method for accelerating machine learning on a computing device is described. The method includes hosting a neural network in a first inference accelerator and a second inference accelerator. The neural network split between the first inference accelerator and the second inference accelerator. The method also includes routing intermediate inference request results directly between the first inference accelerator and the second inference accelerator. The method further includes generating a final inference request result from the intermediate inference request results.