Memory bandwidth management for deep learning applications

Saved in:
Bibliographic Details
Main Authors: BITTNER RAY A JR, SEIDE FRANK TORSTEN BERND
Format: Patent
Language: chi; eng
Subjects:
Online Access: Order full text
Description
Summary: In a data center, neural network evaluations can be included for services involving image or speech recognition by using a field programmable gate array (FPGA) or other parallel processor. The memory bandwidth limitations of providing weighted data sets from an external memory to the FPGA (or other parallel processor) can be managed by queuing up input data from the plurality of cores executing the services at the FPGA (or other parallel processor) in batches of at least two feature vectors. The at least two feature vectors can be at least two observation vectors from a same data stream or from different data streams. The FPGA (or other parallel processor) can then act on the batch of data for each loading of the weighted data sets.
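The batching idea described in the summary can be illustrated with a short sketch: input feature vectors from the service cores are queued, and each bandwidth-limited load of a layer's weights is applied to the whole batch rather than to a single vector. This is a minimal software illustration only, assuming a simple feed-forward network; the names used here (load_weights_from_external_memory, evaluate_batch, BATCH_SIZE) are hypothetical and are not taken from the patent.

from collections import deque
import numpy as np

BATCH_SIZE = 2  # at least two feature vectors per weight load, per the summary

def load_weights_from_external_memory(layer):
    # Stand-in for the bandwidth-limited transfer of one layer's weight matrix.
    rng = np.random.default_rng(layer)
    return rng.standard_normal((128, 128))

def evaluate_batch(feature_vectors, num_layers=3):
    # Each layer's weights are loaded once and applied to every vector in the batch.
    activations = feature_vectors
    for layer in range(num_layers):
        weights = load_weights_from_external_memory(layer)  # one load per layer
        activations = np.maximum(activations @ weights, 0.0)  # shared across the batch
    return activations

# Input queue fed by the cores executing the recognition services.
input_queue = deque(np.random.default_rng(0).standard_normal((5, 128)))

while len(input_queue) >= BATCH_SIZE:
    batch = np.stack([input_queue.popleft() for _ in range(BATCH_SIZE)])
    outputs = evaluate_batch(batch)  # weights loaded once per layer, reused for the batch

Because the weight matrices are fetched once per layer and reused for every vector in the batch, the number of external-memory transfers per evaluated vector falls roughly in proportion to the batch size, which is the bandwidth benefit the summary describes.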