TE-Spikformer: Temporal-enhanced spiking neural network with transformer


Detailed description

Bibliographic details
Published in: Neurocomputing (Amsterdam), 2024-10, Vol. 602, p. 128268, Article 128268
Authors: Gao, ShouWei; Fan, XiangYu; Deng, XingYang; Hong, ZiChao; Zhou, Hao; Zhu, ZiHao
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Abstract: The integration of Spiking Neural Networks (Maass, 1997) and Transformers (Vaswani et al., 2017) has significantly improved performance while preserving energy efficiency, and this combination remains a promising direction for exploration. This paper aims to enhance the network's capacity to extract temporal information from neuromorphic datasets. Building on the Spikformer (Zhou et al., 2023b) architecture, we introduce a novel network named TE-Spikformer. Through careful analysis, we identified an imbalance in the Average Firing Rates (AFR) of neurons in the layer preceding the classification head across time steps. We trace the root cause of this issue to the limitations of the network's Batch Normalization (BN) (Ioffe and Szegedy, 2015) layer. To address this, we propose a Batch Group Normalization (BGN) layer. While maintaining the stability of temporal characteristics in the data, we also introduce a Spike Spatio-Temporal Attention (SSTA) module to strengthen the network's ability to capture temporal information. To validate the effectiveness of our approach, we conducted experiments on the neuromorphic datasets DVS-CIFAR10, DVS128 Gesture, and N-Caltech101. The results show that our algorithm consistently outperforms baseline methods, achieving accuracy rates of 99.30%, 89.60%, and 87.42%, respectively, attaining state-of-the-art performance in the field.
• Identified an imbalanced average firing rate across time steps.
• Analyzed the inadequacy of the BN layer and proposed a BGN layer.
• Introduced a Spike Spatio-Temporal Attention (SSTA) module.
• Achieved SOTA results on three popular neuromorphic datasets.
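The abstract's diagnosis hinges on two quantities: the per-time-step Average Firing Rate (AFR), whose imbalance motivates the work, and a normalization that keeps statistics separate across groups of time steps rather than pooling them as plain BN does. The paper's exact BGN formulation is not given in this record; the sketch below is a minimal illustration under the assumption that BGN computes batch statistics per group of time steps and per channel. The tensor layout, group count, and function names are hypothetical.

```python
import numpy as np

# Hypothetical binary spike tensor from an SNN layer:
# (T time steps, B batch, C channels, N neurons per channel).
rng = np.random.default_rng(0)
T, B, C, N = 4, 8, 16, 64
spikes = (rng.random((T, B, C, N)) < 0.2).astype(np.float32)

def average_firing_rate_per_step(s):
    """AFR at each time step: fraction of spiking activations,
    averaged over batch, channels, and neurons."""
    return s.reshape(s.shape[0], -1).mean(axis=1)

afr = average_firing_rate_per_step(spikes)
# A large spread of afr across the T entries is the imbalance
# the paper attributes to sharing one set of BN statistics.

def batch_group_normalize(x, num_groups=2, eps=1e-5):
    """Hypothetical BGN sketch: split the T time steps into groups
    and normalize each group with its own batch statistics
    (computed per channel), instead of one shared BN statistic."""
    out = np.empty_like(x)
    for g in np.array_split(np.arange(x.shape[0]), num_groups):
        grp = x[g]  # (len(g), B, C, N)
        mean = grp.mean(axis=(0, 1, 3), keepdims=True)
        var = grp.var(axis=(0, 1, 3), keepdims=True)
        out[g] = (grp - mean) / np.sqrt(var + eps)
    return out

y = batch_group_normalize(spikes)
```

With `num_groups = T`, each time step gets fully independent statistics; with `num_groups = 1`, the sketch reduces to ordinary batch normalization over all steps, so the group count interpolates between the two regimes the abstract contrasts.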
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2024.128268