NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION


Bibliographic details
Main authors: Amol Ashok AMBARDEKAR, Kent D. CEDOLA, Chad Balling MCBRIDE, George PETRE, Larry Marvin WALL, Benjamin Eliot LUNDELL, Joseph Leon CORKERY, Boris BOBROV
Format: Patent
Language: English
Description
Summary: A deep neural network ("DNN") module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit (200) can receive an uncompressed chunk of data (202) generated by a neuron in the DNN module. The compression unit generates a mask portion (208) and a data portion (210) of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit (500) can receive a compressed chunk of data (204) from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion (208) and the data portion (210). This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption. (Figure 4)
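
Illustrative sketch (not taken from the patent): the mask-plus-data scheme in the summary can be approximated in a few lines of Python. The summary does not give the mask layout, the chunk size, or the truncation rule, so everything specific here is an assumption made for illustration: one mask bit per input byte, truncation by dropping low-order bits, and the hypothetical names `compress`, `decompress`, and `keep_bits`. Truncated values are stored one per byte for readability; a hardware design would presumably pack them more tightly.

```python
from typing import Tuple


def compress(chunk: bytes, keep_bits: int = 8) -> Tuple[bytes, bytes]:
    """Split a chunk into a mask portion and a data portion.

    Assumption: bit i of the mask is 1 when byte i of the chunk is non-zero.
    Assumption: "truncation" keeps only the top `keep_bits` bits of each value.
    """
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    shift = 8 - keep_bits
    mask = bytearray((len(chunk) + 7) // 8)  # one bit per input byte
    data = bytearray()
    for i, value in enumerate(chunk):
        if value != 0:
            mask[i // 8] |= 1 << (i % 8)   # record the non-zero position
            data.append(value >> shift)    # store the truncated value
    return bytes(mask), bytes(data)


def decompress(mask: bytes, data: bytes, length: int, keep_bits: int = 8) -> bytes:
    """Rebuild a chunk of `length` bytes from its mask and data portions."""
    shift = 8 - keep_bits
    out = bytearray(length)                # zero bytes are implied by the mask
    values = iter(data)
    for i in range(length):
        if mask[i // 8] & (1 << (i % 8)):
            out[i] = next(values) << shift  # dropped low-order bits come back as 0
    return bytes(out)


chunk = bytes([0, 5, 0, 0, 255, 0, 0, 0, 16])
mask, data = compress(chunk)
restored = decompress(mask, data, len(chunk))
```

With `keep_bits=8` the round trip is lossless, and the saving comes entirely from not storing zero bytes, which are common in activation data after functions such as ReLU. With a smaller `keep_bits` the sketch becomes lossy, and a small non-zero value can decompress to zero.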