SPLIT NETWORK ACCELERATION ARCHITECTURE
A method for accelerating machine learning on a computing device is described. The method includes hosting a neural network in a first inference accelerator and a second inference accelerator. The neural network split between the first inference accelerator and the second inference accelerator. The...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | A method for accelerating machine learning on a computing device is described. The method includes hosting a neural network in a first inference accelerator and a second inference accelerator. The neural network split between the first inference accelerator and the second inference accelerator. The method also includes routing intermediate inference request results directly between the first inference accelerator and the second inference accelerator. The method further includes generating a final inference request result from the intermediate inference request results. |
---|