IPOCIM: Artificial Intelligent Architecture Design Space Exploration With Scalable Ping-Pong Computing-in-Memory Macro
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2024-02, Vol. 32 (2), p. 256-268
Main authors: , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Computing-in-memory (CIM) architecture has become a possible solution for designing an energy-efficient artificial intelligence processor. Various CIM demonstrators have shown the computing efficiency of CIM macros and CIM-based processors. However, previous studies mainly focus on macro optimization at low CIM capacity, without considering the weight update strategy of the CIM architecture. An AI processor with a CIM engine raises practical issues, including updating memory data and supporting different operators. For instance, AI-oriented applications usually contain large numbers of weight parameters. The weights stored in the CIM architecture must be reloaded because of the considerable gap between CIM capacity and the growing number of weight parameters, and the resulting weight-update stalls reduce the computation efficiency of the CIM architecture. In addition, the natural parallelism of CIM leads to a mismatch with the various convolution kernel sizes across networks and layers, which reduces hardware utilization efficiency. In this work, we develop a CIM engine with a ping-pong computing strategy as an alternative to the typical CIM macro plus weight buffer, hiding the data update latency and improving the data reuse ratio. Based on the ping-pong engine, we propose a flexible CIM architecture that adapts to neural networks of different sizes, namely, intelligent pong computing-in-memory (IPOCIM), with a fine-grained data flow mapping strategy. In our evaluation, IPOCIM achieves a 1.27x-6.27x performance and 2.34x-5.30x energy efficiency improvement over state-of-the-art works.
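The ping-pong strategy summarized in the abstract is, at its core, a double-buffering scheme: while one weight bank computes on the current tile, the other bank reloads the next tile, so reload latency is hidden behind computation. The sketch below illustrates only that general cycle-count argument; the function names and cycle costs are illustrative assumptions, not values from the paper.

```python
# Illustrative cycle-count model of double-buffered ("ping-pong") weight
# updates versus serial reload-then-compute. All parameters are assumed
# for the example; they do not come from the IPOCIM paper.

def serial_cycles(n_tiles: int, compute_c: int, reload_c: int) -> int:
    # Baseline: each weight tile is reloaded into the single CIM bank,
    # then computed, strictly in sequence.
    return n_tiles * (reload_c + compute_c)

def ping_pong_cycles(n_tiles: int, compute_c: int, reload_c: int) -> int:
    # Ping-pong: while one bank computes tile i, the other bank reloads
    # tile i+1, so only the very first reload is exposed; afterward each
    # tile costs the slower of the two overlapped stages.
    per_tile = max(compute_c, reload_c)
    return reload_c + n_tiles * per_tile

if __name__ == "__main__":
    n, comp, load = 8, 100, 60
    print(serial_cycles(n, comp, load))     # 8 * (60 + 100) = 1280
    print(ping_pong_cycles(n, comp, load))  # 60 + 8 * 100   = 860
```

With compute-bound tiles (compute_c >= reload_c), the overlapped schedule approaches pure compute time, which is the latency-hiding effect the abstract attributes to the ping-pong engine.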
ISSN: 1063-8210, 1557-9999
DOI: 10.1109/TVLSI.2023.3330648