SongC: A Compiler for Hybrid Near-Memory and In-Memory Many-Core Architecture

Bibliographic details
Published in: IEEE Transactions on Computers, 2024-10, Vol. 73 (10), p. 2420-2433
Main authors: Lin, Junfeng; Qu, Huanyu; Ma, Songchen; Ji, Xinglong; Li, Hongyi; Li, Xiaochuan; Song, Chenhang; Zhang, Weihao
Format: Article
Language: English
Description
Summary: Building hybrid systems that incorporate various processing-in-memory (PIM) devices and processing-near-memory (PNM) technologies can offer complementary advantages in both efficiency and flexibility, while many-core architectures show great potential in deploying data-centric parallel applications with high performance. Compilers for the hybrid PN/IM architecture are critical for enabling such computing systems to be put into practical use. However, most of the existing neural network compilers for PIM or PNM are optimized from the perspective of an operator, and cannot effectively take advantage of a decentralized core-level dataflow with large on-chip memory access bandwidth. Here, we propose a full-stack System-on-graph Compiler (SongC) framework for many-core architecture, which optimizes the efficiency of the PIM devices and leverages the flexibility of the PNM architectures. SongC establishes multi-level graph abstractions to clarify the critical deployment challenges at different levels and generalizes the standard optimizations, decoupling versatile algorithms and diverse types of hardware. To handle the complexity of many-core resource utilization, we also establish a simulation-compilation interaction flow, including a just-in-time evaluator to boost the scheduling search and an extended Roofline model, referred to as the Palace model, to guide the search. Experiments demonstrate the various optimizations and overall performance of SongC and reveal the capability of strategy exploration.
ISSN: 0018-9340, 1557-9956
DOI: 10.1109/TC.2023.3311948
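
The summary describes the Palace model as an extension of the Roofline model but does not say what the extension is. For orientation only, the following is a minimal sketch of the classic Roofline bound, in which attainable throughput is the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. It is not the Palace model. The function names and the hardware figures in the usage note are hypothetical and are not taken from the article.

```python
def roofline_attainable(peak_flops: float,
                        mem_bandwidth: float,
                        arithmetic_intensity: float) -> float:
    """Classic Roofline bound (not the Palace model from the article).

    peak_flops            peak compute rate, in FLOP/s
    mem_bandwidth         peak memory bandwidth, in bytes/s
    arithmetic_intensity  work per byte moved, in FLOP/byte
    Returns the attainable throughput in FLOP/s.
    """
    if peak_flops <= 0 or mem_bandwidth <= 0:
        raise ValueError("peak_flops and mem_bandwidth must be positive")
    if arithmetic_intensity < 0:
        raise ValueError("arithmetic_intensity must be non-negative")
    # Below the ridge point the kernel is memory-bound; above it, compute-bound.
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def ridge_point(peak_flops: float, mem_bandwidth: float) -> float:
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    if peak_flops <= 0 or mem_bandwidth <= 0:
        raise ValueError("peak_flops and mem_bandwidth must be positive")
    return peak_flops / mem_bandwidth
```

As a usage example with made-up figures, a device with a peak of 1e12 FLOP/s and 1e11 bytes/s of bandwidth has a ridge point of 10 FLOP/byte. A kernel at 2 FLOP/byte is then memory-bound, and `roofline_attainable(1e12, 1e11, 2.0)` returns 2e11 FLOP/s, while a kernel at 50 FLOP/byte is capped at the 1e12 FLOP/s compute roof.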