On Key–Value Sort With Active Compute Memory

Bibliographic details
Published in:	IEEE Transactions on Computers, 2024-05, Vol. 73 (5), p. 1341-1356
Main authors:	Esmaili-Dokht, Pouya; Guiot, Miquel; Radojković, Petar; Martorell, Xavier; Ayguadé, Eduard; Labarta, Jesus; Adlard, Jason; Amato, Paolo; Sforzin, Marco
Format:	Article
Language:	eng
Subjects:	Buffers; Computer architecture; Network latency; Parallel processing; Silicon; Sorting algorithms
Online access:	Full text

Description
Abstract:	We propose the Active Compute Memory (ACM), a near-memory-processing architecture capable of performing key–value sort directly in the DRAM. In the ACM architecture, sort is merely the writing of data into memory with one addressing protocol (perspective) and reading it back with a different perspective. The first perspective is conventional, based on the data address; the second perspective is the sorted order. The ACM requires additional tables to store the metadata and moderate control logic enhancements that can be implemented directly in the DRAM silicon. With these modest enhancements to DRAM, the ACM exploits the parallelism inherently available in the row buffer to enable sort with [Formula Omitted] complexity. This leads to an order-of-magnitude improvement in performance and energy compared to conventional [Formula Omitted] CPU-centric sort algorithms. The ACM also outperforms other near-memory sort accelerators, because its processing is done near the row buffer, where it benefits from much lower memory access latency, higher bandwidth, and wider parallel processing. The sort operation covered in this paper is just one example of an address management operation that can be efficiently implemented directly in the DRAM silicon. We release the simulation infrastructure for ACM performance and energy modeling as open source. We encourage the community to use it, adapt it to other PIM proposals, and share their own evaluations.
ISSN:	0018-9340 1557-9956
DOI:	10.1109/TC.2024.3371773
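
The abstract's idea of writing data under one perspective (the address) and reading it back under another (the sorted order) can be illustrated with a small software analogy. The sketch below is not the ACM hardware design or the authors' simulator: the class name `TwoPerspectiveStore`, its methods, and the sorted list standing in for the metadata tables are all hypothetical, and it makes no attempt to reproduce the complexity the paper reports. It only shows that once ordering metadata is maintained at write time, "sorting" reduces to reading through a different index.

```python
import bisect


class TwoPerspectiveStore:
    """Toy model (hypothetical): write by address, read by address or by sorted rank."""

    def __init__(self):
        self._cells = {}   # address -> (key, value): the conventional perspective
        self._order = []   # sorted (key, address) pairs: stand-in for the metadata tables

    def write(self, address, key, value):
        # Overwriting an address first drops its stale metadata entry.
        if address in self._cells:
            old_key, _ = self._cells[address]
            idx = bisect.bisect_left(self._order, (old_key, address))
            del self._order[idx]
        self._cells[address] = (key, value)
        # Keep the metadata ordered by key (ties broken by address).
        bisect.insort(self._order, (key, address))

    def read_by_address(self, address):
        """First perspective: conventional, address-based access."""
        return self._cells[address]

    def read_by_rank(self, rank):
        """Second perspective: the rank-th pair in sorted key order."""
        _, address = self._order[rank]
        return self._cells[address]

    def sorted_items(self):
        """Read every pair back through the sorted-order perspective."""
        return [self._cells[address] for _, address in self._order]


store = TwoPerspectiveStore()
pairs = [(42, "d"), (7, "a"), (19, "c"), (7, "b")]
for address, (key, value) in enumerate(pairs):
    store.write(address, key, value)

print(store.read_by_address(0))  # the pair written at address 0
print(store.read_by_rank(0))     # the pair with the smallest key
print(store.sorted_items())      # all pairs in key order
```

In this model the data cells are never moved; only the metadata changes on a write, which is the aspect of the abstract the sketch is meant to convey.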