1.63 pJ/SOP Neuromorphic Processor With Integrated Partial Sum Routers for In-Network Computing

Bibliographic details
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2024-11, Vol. 32 (11), p. 2085-2092
Main authors: Li, Dongrui; Wong, Ming Ming; Chong, Yi Sheng; Zhou, Jun; Upadhyay, Mohit; Balaji, Ananta; Mani, Aarthy; Wong, Weng Fai; Peh, Li Shiuan; Do, Anh Tuan; Wang, Bo
Format: Article
Language: English
Description
Abstract: Neuromorphic computing promises unprecedented energy efficiency by emulating the mechanisms of the human brain. Conventional neuromorphic accelerators employ a split-and-merge method to map spiking neural network inputs that exceed the fan-in capability of a single neuron core. However, this approach risks compromising accuracy and consumes extra cores for the merging process. Moreover, because it aggregates spikes generated from partial sums rather than total sums gathered from different cores, it requires excessive data movement and clock cycles, with substantial power and energy overhead. This work presents a novel approach to addressing the challenges imposed by the split-and-merge method. We propose an energy-efficient, reconfigurable neuromorphic processor that leverages several key techniques to mitigate these issues. First, we introduce partial sum router circuitry that enables in-network computing (INC), eliminating the need for extra merge cores. Second, we adopt software-defined Networks-on-Chip (NoCs) that use predefined, efficient routing, eliminating power-hungry routing computation. Finally, we incorporate fine-grained power gating and clock gating techniques for further power reduction. Experimental results from our test chip demonstrate lossless mapping of the algorithm and exceptional energy efficiency, with an energy consumption of 1.63 pJ/SOP at 0.48 V. This represents a 22.4% improvement over state-of-the-art results. The proposed neuromorphic processor provides an efficient and flexible solution for neural network processing, mitigating the limitations of the traditional split-and-merge approach while delivering superior energy efficiency.
ISSN: 1063-8210, 1557-9999
DOI: 10.1109/TVLSI.2024.3409652
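
The abstract's contrast between merging spikes derived from partial sums and accumulating the partial sums themselves can be illustrated with a toy model. The following is a minimal sketch and not the authors' circuit or mapping flow: it assumes a single time step, integer weights, no leak, one shared firing threshold, and a merge core that fires if any split core fired. All function names are hypothetical and chosen only for illustration.

```python
from typing import List


def partial_sums(spikes: List[int], weights: List[int], fan_in: int) -> List[int]:
    """Split the weighted inputs into chunks of at most `fan_in` (one chunk
    per core) and return each core's partial sum."""
    return [
        sum(s * w for s, w in zip(spikes[i:i + fan_in], weights[i:i + fan_in]))
        for i in range(0, len(spikes), fan_in)
    ]


def reference_fires(spikes: List[int], weights: List[int], threshold: int) -> bool:
    """Ideal neuron with unlimited fan-in: threshold the total sum."""
    return sum(s * w for s, w in zip(spikes, weights)) >= threshold


def split_and_merge_fires(spikes: List[int], weights: List[int],
                          fan_in: int, threshold: int) -> bool:
    """Toy split-and-merge (assumed scheme): every core thresholds its own
    partial sum into a binary spike, and an extra merge core fires if any
    of those spikes arrived. The partial-sum magnitudes are lost."""
    core_spikes = [p >= threshold for p in partial_sums(spikes, weights, fan_in)]
    return any(core_spikes)


def in_network_fires(spikes: List[int], weights: List[int],
                     fan_in: int, threshold: int) -> bool:
    """Toy in-network computing: partial sums are added hop by hop along the
    route, and the threshold is applied once to the accumulated total."""
    total = 0
    for p in partial_sums(spikes, weights, fan_in):
        total += p  # stands in for the accumulation done at a router
    return total >= threshold


if __name__ == "__main__":
    spikes = [1, 1, 1, 1]
    for weights in ([1, 1, 1, 1], [2, 1, -2, -1]):
        print(
            weights,
            "reference:", reference_fires(spikes, weights, 3),
            "split-and-merge:", split_and_merge_fires(spikes, weights, 2, 3),
            "in-network:", in_network_fires(spikes, weights, 2, 3),
        )
```

In both sample cases the in-network result matches the unlimited fan-in reference, while the toy split-and-merge result does not: it misses a spike when no single core reaches the threshold on its own, and it fires spuriously when one core's positive partial sum would have been cancelled by another core's negative one. This toy model also needs no separate merge core for the in-network path, which mirrors the abstract's claim only qualitatively.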