HALO TRANSFER FOR CONVOLUTION WORKLOAD PARTITION
A DNN accelerator includes multiple compute tiles for sharing a workload of running a convolution. A halo pipeline in a compute tile can facilitate replications of halo data from the compute tile where the halo data is generated into another compute tile. The halo pipeline may receive a memory trans...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng ; fre ; ger |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | A DNN accelerator includes multiple compute tiles for sharing a workload of running a convolution. A halo pipeline in a compute tile can facilitate replications of halo data from the compute tile where the halo data is generated into another compute tile. The halo pipeline may receive a memory transaction for writing a data block. The halo pipeline may determine that the data block falls into a halo region in an input tensor of the convolution. The halo pipeline may generate a remote address for storing the data block in a memory of the other compute tile, e.g., based on a local address of the data block in a memory of the compute tile. The halo pipeline may adjust the remote address, e.g., based on a difference in dimensions of a tensor to be used by the compute tile and a tensor to be used by the other compute tile. |
---|