METHODS AND APPARATUS TO PERFORM LOW OVERHEAD SPARSITY ACCELERATION LOGIC FOR MULTI-PRECISION DATAFLOW IN DEEP NEURAL NETWORK ACCELERATORS
Methods, apparatus, systems, and articles of manufacture to perform low overhead sparsity acceleration logic for multi-precision dataflow in deep neural network accelerators are disclosed. An example apparatus includes a first buffer to store data corresponding to a first precision; a second buffer...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Methods, apparatus, systems, and articles of manufacture to perform low overhead sparsity acceleration logic for multi-precision dataflow in deep neural network accelerators are disclosed. An example apparatus includes a first buffer to store data corresponding to a first precision; a second buffer to store data corresponding to a second precision; and hardware control circuitry to: process a first multibit bitmap to determine an activation precision of an activation value, the first multibit bitmap including values corresponding to different precisions; process a second multibit bitmap to determine a weight precision of a weight value, the second multibit bitmap including values corresponding to different precisions; and store the activation value and the weight value in the second buffer when at least one of the activation precision or the weight precision corresponds to the second precision. |
---|