ONNXim: A Fast, Cycle-Level Multi-Core NPU Simulator
| Published in | IEEE Computer Architecture Letters, July 2024, Vol. 23 (2), pp. 219-222 |
|---|---|
| Main authors | , , , , , , , |
| Format | Article |
| Language | English |
| Subjects | |
| Online access | Order full text |
Abstract: As DNNs (Deep Neural Networks) demand increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) has become more important. However, existing architectural NPU simulators lack support for high-speed simulation, multi-core modeling, multi-tenant scenarios, detailed DRAM/NoC modeling, and/or different deep learning frameworks. To address these limitations, this work proposes ONNXim, a fast cycle-level simulator for multi-core NPUs in DNN serving systems. For ease of simulation, it takes DNN models in the ONNX graph format generated from various deep learning frameworks. In addition, based on the observation that typical NPU cores process tensor tiles from SRAM with deterministic compute latency, we model computation accurately with an event-driven approach, avoiding the overhead of modeling cycle-level activities. ONNXim also preserves dependencies between compute and tile DMAs. Meanwhile, the DRAM and NoC are modeled at the cycle level to properly capture contention among multiple cores that can execute different DNN models for multi-tenancy. Consequently, ONNXim is significantly faster than existing simulators (e.g., by up to 365× over Accel-sim) and enables various case studies, such as multi-tenant NPUs, that were previously impractical due to slow simulation speed and/or lack of functionality.
ISSN: 1556-6056, 1556-6064

DOI: 10.1109/LCA.2024.3484648
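
The abstract's key performance idea is that tile compute from SRAM has deterministic latency, so compute can be modeled by scheduling completion events rather than ticking every cycle, while DRAM/NoC contention is left to a separate cycle-level model. The following is a minimal, self-contained C++ sketch of that event-driven idea only; it is not ONNXim's implementation, and the latency constants (kTileLatency, kDmaLatency) and the four-tile workload are hypothetical placeholders.

```cpp
// Illustrative sketch (not ONNXim source): model deterministic tile-compute
// latency with completion events instead of simulating every compute cycle.
#include <cstdint>
#include <functional>
#include <iostream>
#include <queue>
#include <vector>

struct Event {
    uint64_t finish_cycle;  // cycle at which the tile's compute completes
    int tile_id;
    bool operator>(const Event& other) const {
        return finish_cycle > other.finish_cycle;
    }
};

int main() {
    // Hypothetical parameters: a fixed per-tile compute latency once operands
    // are in SRAM, and a constant stand-in for the DMA that fills the SRAM.
    const uint64_t kTileLatency = 100;  // cycles per tile (deterministic)
    const uint64_t kDmaLatency  = 40;   // placeholder for a DRAM/NoC model
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> events;

    uint64_t cycle = 0;
    for (int tile = 0; tile < 4; ++tile) {
        // Dependency: compute may start only after the tile's DMA completes.
        uint64_t dma_done = cycle + kDmaLatency;
        events.push({dma_done + kTileLatency, tile});
        cycle = dma_done;  // next tile's DMA issues afterwards (simplified)
    }

    // Jump directly from one completion event to the next.
    while (!events.empty()) {
        Event e = events.top();
        events.pop();
        std::cout << "tile " << e.tile_id
                  << " finished at cycle " << e.finish_cycle << "\n";
    }
    return 0;
}
```

In a full simulator along the lines the abstract describes, the DMA completion time would come from a cycle-level DRAM/NoC model shared by all cores (capturing multi-tenant contention) rather than from a constant as in this sketch.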