A case study of streaming storage format for sparse matrices
Main authors: | , |
---|---|
Format: | Conference proceeding |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | The Field-Programmable Gate Array is an excellent match for the sparse matrix-vector multiply operation because of its enormous computational capacity and its ability to build a custom memory hierarchy that matches the memory access patterns of the operation. This paper describes a streaming sparse matrix format and a custom memory subsystem that decodes it on the fly for multiple compute units. Results show that the proposed design is very efficient and allows the sparse matrix to be streamed sequentially from off-chip RAM. On a Xilinx Virtex-4 device, this results in an average of 58% memory-bandwidth efficiency for one compute unit and 70% memory-bandwidth efficiency for two compute units. The design is capable of achieving an average of 47% of theoretical peak floating-point performance for each compute unit. |
ISSN: | 2325-6532 2640-0472 |
DOI: | 10.1109/ReConFig.2012.6416788 |