Broadcasting machine learning data
Main authors: | , , |
---|---|
Format: | Patent |
Language: | English |
Keywords: | |
Abstract: | A processor 800 comprises several processing circuits 810 and storage circuitry, such as a level 2 cache 887. The processor fetches machine learning data from main memory and stores it in the storage circuitry. Data from the storage circuitry is broadcast to multiple processing circuits simultaneously. The data may be stored in local storage in the processing circuits, such as tile buffers 895. The processor may comprise compression circuitry 815, which decompresses data fetched from main memory and compresses data written to main memory. The processing circuits may be shader cores in a graphics processor. The machine learning data may comprise kernels and feature maps. The processor may include snoop circuitry to maintain the coherency of the broadcast data. The broadcasting of data may be managed by a job manager 835. |
---|
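
The data flow described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class and function names (`ProcessingCircuit`, `Processor`, `fetch`, `broadcast`, `demo`) and the sample data are hypothetical, the shared cache and tile buffers are modelled as plain dictionaries, and the loop delivers data to the cores one after another, whereas the abstract describes a simultaneous broadcast. Compression, snooping and the job manager are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingCircuit:
    """Hypothetical stand-in for one processing circuit (e.g. a shader core)."""
    core_id: int
    tile_buffer: dict = field(default_factory=dict)  # models local storage


@dataclass
class Processor:
    """Toy model: main memory, one shared cache, several processing circuits."""
    main_memory: dict
    cores: list
    shared_cache: dict = field(default_factory=dict)  # models e.g. a level 2 cache
    memory_reads: int = 0  # number of fetches that went to main memory

    def fetch(self, key):
        """Copy data from main memory into the shared cache if it is not cached yet."""
        if key not in self.shared_cache:
            self.shared_cache[key] = self.main_memory[key]
            self.memory_reads += 1
        return self.shared_cache[key]

    def broadcast(self, key):
        """Fetch once, then deliver the same data to every core's local storage."""
        data = self.fetch(key)
        for core in self.cores:  # sequential here; simultaneous in the abstract
            core.tile_buffer[key] = data


def demo():
    # Made-up machine learning data: one kernel and one feature map.
    memory = {"kernel0": [1, 0, -1], "feature_map0": [4, 5, 6, 7]}
    cores = [ProcessingCircuit(i) for i in range(4)]
    proc = Processor(main_memory=memory, cores=cores)
    proc.broadcast("kernel0")
    return proc.memory_reads, [c.tile_buffer["kernel0"] for c in cores]
```

In this model, calling `demo()` broadcasts one kernel to four cores with a single main-memory read. If each core fetched the kernel separately without a shared cache, there would be one read per core.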