Operational core and chip based on matrix multiplication combining similar items
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Abstract: | The invention relates to an operation core and a chip for matrix multiplication based on merging similar items (combining like terms). The operation core mainly comprises calculation units with data reuse, a data mapping unit, and a multi-mode accumulator based on an adder tree. In different precision modes, the calculation units of the operation core reuse data to different degrees, which ensures optimal bandwidth utilization. The degree of calculation parallelism is adjusted according to the precision mode, so that calculation resources are used to the fullest extent. Operation modes of multiple precisions are supported, meeting the deep neural network computation requirements of different application scenarios. Moreover, the structure is simple and flexible, and both the hardware design complexity and the hardware overhead are low. |
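The abstract does not spell out the arithmetic behind "merging similar items", so the following is only a software sketch of one common reading, not the patented circuit. It assumes low-precision weights that take few distinct values: activations sharing the same weight value are summed first, and each distinct weight is then multiplied once. The function names (`adder_tree_sum`, `dot_by_merging_like_terms`, `matvec_by_merging_like_terms`) are hypothetical, and the pairwise reduction only imitates the shape of an adder-tree accumulator.

```python
from collections import defaultdict


def adder_tree_sum(values):
    """Sum values level by level in pairs, mimicking an adder tree."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd levels with a neutral element
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def dot_by_merging_like_terms(weights, activations):
    """Dot product that groups activations by equal weight value first.

    Assumption: weights are low-precision, so few distinct values occur
    and one multiplication per distinct value is enough.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have equal length")
    buckets = defaultdict(int)
    for w, x in zip(weights, activations):
        buckets[w] += x  # additions only while merging like terms
    partial_products = [w * s for w, s in buckets.items()]
    return adder_tree_sum(partial_products)


def matvec_by_merging_like_terms(matrix, vector):
    """Matrix-vector product built from the merged dot product."""
    return [dot_by_merging_like_terms(row, vector) for row in matrix]
```

With six weights drawn from only three distinct values, the merged form needs three multiplications instead of six. The saving grows as the weight precision drops, which is one plausible reason the parallelism would be tuned per precision mode.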