Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM
Main authors: | , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | Bitwise operations are an important component of modern-day programming. Many
widely-used data structures (e.g., bitmap indices in databases) rely on fast
bitwise operations on large bit vectors to achieve high performance.
Unfortunately, in existing systems, regardless of the underlying architecture
(e.g., CPU, GPU, FPGA), the throughput of such bulk bitwise operations is
limited by the available memory bandwidth.
We propose Buddy, a new mechanism that exploits the analog operation of DRAM
to perform bulk bitwise operations completely inside the DRAM chip. Buddy
consists of two components. First, simultaneous activation of three DRAM rows
that are connected to the same set of sense amplifiers enables us to perform
bitwise AND and OR operations. Second, the inverters present in each sense
amplifier enable us to perform bitwise NOT operations, with modest changes to
the DRAM array. These two components make Buddy functionally complete. Our
implementation of Buddy largely exploits the existing DRAM structure and
interface, and incurs low overhead (1% of DRAM chip area).
Our evaluations based on SPICE simulations show that, across seven
commonly-used bitwise operations, Buddy provides a 10.9X to 25.6X
improvement in raw throughput and a 25.1X to 59.5X reduction in energy
consumption. We evaluate three real-world data-intensive applications that
exploit bitwise operations: 1) bitmap indices, 2) BitWeaving, and 3)
bitvector-based implementation of sets. Our evaluations show that Buddy
significantly outperforms the state of the art. |
---|---|
DOI: | 10.48550/arxiv.1611.09988 |
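
The two components named in the summary can be illustrated with a small software model. This is only a sketch under a stated assumption: that simultaneously activating three rows leaves each bitline at the majority value of the three cells, so that a third "control" row of all zeros yields AND and a row of all ones yields OR. The summary itself does not spell this out, and the function names (`majority`, `buddy_and`, `buddy_or`, `buddy_not`, `buddy_xor`) are hypothetical, chosen for illustration. Rows are modeled as Python integers of a fixed bit width.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority of three rows (assumed model of triple-row activation)."""
    return (a & b) | (b & c) | (a & c)


def buddy_and(a: int, b: int) -> int:
    """AND: the control row is all zeros, so a bit is 1 only if both inputs are 1."""
    return majority(a, b, 0)


def buddy_or(a: int, b: int, width: int) -> int:
    """OR: the control row is all ones, so a bit is 1 if either input is 1."""
    all_ones = (1 << width) - 1
    return majority(a, b, all_ones)


def buddy_not(a: int, width: int) -> int:
    """NOT: models the sense-amplifier inverter, masked to the row width."""
    return ~a & ((1 << width) - 1)


def buddy_xor(a: int, b: int, width: int) -> int:
    """XOR built only from the primitives above, illustrating functional completeness."""
    return buddy_and(buddy_or(a, b, width), buddy_not(buddy_and(a, b), width))


if __name__ == "__main__":
    WIDTH = 4
    a, b = 0b1100, 0b1010
    print(format(buddy_and(a, b), "04b"))         # 1000
    print(format(buddy_or(a, b, WIDTH), "04b"))   # 1110
    print(format(buddy_not(a, WIDTH), "04b"))     # 0011
    print(format(buddy_xor(a, b, WIDTH), "04b"))  # 0110
```

Because AND, OR, and NOT are all available, any other bitwise operation can be composed from them, as `buddy_xor` does here. The model says nothing about timing, energy, or the analog behavior that the SPICE simulations in the paper evaluate.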