Memory bandwidth management for deep learning applications

In a data center, neural network evaluations can be included for services involving image or speech recognition by using a field programmable gate array (FPGA) or other parallel processor. The memory bandwidth limitations of providing weighted data sets from an external memory to the FPGA (or other...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Seide, Frank Torsten Bernd, Bittner, Jr., Ray A
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In a data center, neural network evaluations can be included for services involving image or speech recognition by using a field programmable gate array (FPGA) or other parallel processor. The memory bandwidth limitations of providing weighted data sets from an external memory to the FPGA (or other parallel processor) can be managed by queuing up input data from the plurality of cores executing the services at the FPGA (or other parallel processor) in batches of at least two feature vectors. The at least two feature vectors can be at least two observation vectors from a same data stream or from different data streams. The FPGA (or other parallel processor) can then act on the batch of data for each loading of the weighted datasets.