CLSA-CIM: A Cross-Layer Scheduling Approach for Computing-in-Memory Architectures
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | The demand for efficient machine learning (ML) accelerators is growing
rapidly, driving the development of novel computing concepts such as resistive
random access memory (RRAM)-based tiled computing-in-memory (CIM)
architectures. CIM enables computation within the memory unit, resulting in
faster data processing and reduced power consumption. Efficient compiler
algorithms are essential to exploit the potential of tiled CIM architectures.
While conventional ML compilers focus on code generation for CPUs, GPUs, and
other von Neumann architectures, adaptations are needed to cover CIM
architectures. Cross-layer scheduling is a promising approach, as it enhances
the utilization of CIM cores, thereby accelerating computations. Although
similar concepts are implicitly used in previous work, there is a lack of clear
and quantifiable algorithmic definitions for cross-layer scheduling for tiled
CIM architectures. To close this gap, we present CLSA-CIM, a cross-layer
scheduling algorithm for tiled CIM architectures. We integrate CLSA-CIM with
existing weight-mapping strategies and compare performance against
state-of-the-art (SOTA) scheduling algorithms. CLSA-CIM improves the
utilization by up to 17.9x, resulting in an overall speedup increase of up to
29.2x compared to SOTA. |
---|---|
DOI: | 10.48550/arxiv.2401.07671 |
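
The abstract's claim that cross-layer scheduling raises CIM core utilization can be illustrated with a toy model. The sketch below is not the CLSA-CIM algorithm; it is a minimal illustration under assumptions that are not in the record: each layer of a linear network is mapped to its own core, each layer produces the same number of output chunks, and chunk `k` of a layer needs only chunk `k` of the previous layer. All function and parameter names are hypothetical.

```python
def layer_by_layer_makespan(chunk_times, num_chunks):
    """Each layer starts only after the previous layer has finished completely."""
    return sum(t * num_chunks for t in chunk_times)


def cross_layer_makespan(chunk_times, num_chunks):
    """A layer starts chunk k as soon as its input chunk k exists and its core is free."""
    # finish[k]: time at which chunk k of the previous layer is available
    # (the network input is assumed to be available at time 0).
    finish = [0.0] * num_chunks
    for t in chunk_times:
        core_free = 0.0  # each layer has its own core (assumption)
        new_finish = []
        for k in range(num_chunks):
            start = max(core_free, finish[k])
            core_free = start + t
            new_finish.append(core_free)
        finish = new_finish
    return finish[-1]


def utilization(chunk_times, num_chunks, makespan):
    """Fraction of total core time spent computing, averaged over all cores."""
    busy = sum(t * num_chunks for t in chunk_times)
    return busy / (len(chunk_times) * makespan)


if __name__ == "__main__":
    times = [1, 1, 1]  # time per chunk for three layers (made-up values)
    chunks = 4
    for name, fn in (("layer-by-layer", layer_by_layer_makespan),
                     ("cross-layer", cross_layer_makespan)):
        span = fn(times, chunks)
        print(name, "makespan:", span,
              "utilization:", round(utilization(times, chunks, span), 3))
```

In this toy setting, overlapping consecutive layers shortens the schedule because cores no longer sit idle while earlier layers finish. The real dependency structure of convolutional layers and the weight-mapping strategies mentioned in the abstract are not modeled here.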