Parallel matrix multiplication technique optimized for memory fetches
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | A matrix multiplication circuit comprises a memory storage device, processing circuitry, a parallel multiply circuit, and buffer circuits. The parallel multiply circuit simultaneously performs a count of multiplies in a parallel multiplication operation. The buffer circuits include prefetch buffer circuits, each having a storage array dimension corresponding to the count of multiplies in the parallel multiplication operation. The processing circuitry loads a first prefetch buffer circuit with values from a first matrix; fetches a value of a second matrix and, in parallel with the fetch, preloads a second prefetch buffer circuit with another value from the first matrix; initiates a parallel multiply of the fetched value of the second matrix and the values in the first prefetch buffer circuit; and stores partial product results of the parallel multiply, adding each current partial product result to a previously stored partial product result. |
---|
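The sequence of operations in the summary can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the patented circuit: the function name `matmul_prefetch`, the parameter `lanes` (standing in for the count of multiplies per parallel operation), and the choice to hold a column segment of the first matrix in each prefetch buffer are all hypothetical. The record does not give these details. The "in parallel" preload is modelled sequentially, one value of the first matrix per fetch of the second matrix.

```python
def matmul_prefetch(a, b, lanes):
    """Software model of a double-buffered multiply: returns a (m x k) times b (k x n).

    Assumption: each prefetch buffer holds up to `lanes` values taken from one
    column of the first matrix, and `lanes` multiplies happen per parallel step.
    """
    m, inner, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]  # stored partial product results

    for row0 in range(0, m, lanes):
        height = min(lanes, m - row0)
        # Load the first prefetch buffer with values from the first matrix.
        active = [a[row0 + i][0] for i in range(height)]

        for k in range(inner):
            standby = []  # second prefetch buffer, filled while fetching
            for j in range(n):
                scalar = b[k][j]  # fetch one value of the second matrix
                # Modelled as concurrent with the fetch: preload one more
                # value of the first matrix into the second buffer.
                if k + 1 < inner and len(standby) < height:
                    standby.append(a[row0 + len(standby)][k + 1])
                # Parallel multiply: one fetched value times every buffered value.
                products = [scalar * v for v in active]
                # Add each current partial product to the stored result.
                for i, p in enumerate(products):
                    c[row0 + i][j] += p
            # If there were fewer fetches than lanes, finish filling the buffer.
            while k + 1 < inner and len(standby) < height:
                standby.append(a[row0 + len(standby)][k + 1])
            active = standby  # swap buffers for the next step

    return c


if __name__ == "__main__":
    print(matmul_prefetch([[1, 2], [3, 4]], [[5, 6], [7, 8]], lanes=2))
```

In this model the second matrix is read one value at a time and each value is reused across all `lanes` multiplies, which is one plausible reading of how the scheme reduces memory fetches; the actual hardware scheduling may differ.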