Evaluation of Convolution Primitives for Embedded Neural Networks on 32-bit Microcontrollers
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Deploying neural networks on constrained hardware platforms such as 32-bit
microcontrollers is a challenging task because of the large memory, computing
and energy requirements of their inference process. To tackle these issues,
several convolution primitives have been proposed to make the standard
convolution more computationally efficient. However, few of these primitives
have actually been implemented for 32-bit microcontrollers. In this work, we collect
different state-of-the-art convolutional primitives and propose an
implementation for the ARM Cortex-M processor family with an open source deployment
platform (NNoM). Then, we carry out experimental characterization tests on
these implementations. Our benchmark reveals a linear relationship between
theoretical MACs and energy consumption, showing the advantages of using
computationally efficient primitives like shift convolution. We discuss
the significant reduction in latency and energy consumption due to the use of
SIMD instructions and highlight the importance of data reuse in those
performance gains. For reproducibility and further experiments, the code
and experiments are publicly available. |
---|---|
DOI: | 10.48550/arxiv.2303.10702 |
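
The summary ties energy consumption to theoretical MACs (multiply-accumulate operations). As a rough illustration of what "theoretical MACs" means for a few primitives, the sketch below counts them for a standard convolution, a depthwise separable convolution, and a shift convolution. It is not taken from the paper or from NNoM: the function names and layer sizes are hypothetical, and it assumes the common formulation in which a shift convolution is a zero-MAC per-channel shift followed by a 1x1 (pointwise) convolution. Bias additions are ignored.

```python
def conv_macs(h_out, w_out, c_in, c_out, k, groups=1):
    """Theoretical MACs of a k x k convolution with `groups` channel groups."""
    # Each output element needs (c_in / groups) * k * k multiply-accumulates.
    return h_out * w_out * c_out * (c_in // groups) * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    depthwise = conv_macs(h_out, w_out, c_in, c_in, k, groups=c_in)
    pointwise = conv_macs(h_out, w_out, c_in, c_out, 1)
    return depthwise + pointwise


def shift_conv_macs(h_out, w_out, c_in, c_out):
    """Shift convolution under the assumption stated above.

    The shift step only moves data, so only the 1x1 convolution costs MACs.
    """
    return conv_macs(h_out, w_out, c_in, c_out, 1)


if __name__ == "__main__":
    # Hypothetical layer: 32x32 output, 16 input and 16 output channels, 3x3 kernel.
    h, w, c_in, c_out, k = 32, 32, 16, 16, 3
    print("standard:           ", conv_macs(h, w, c_in, c_out, k))
    print("depthwise separable:", depthwise_separable_macs(h, w, c_in, c_out, k))
    print("shift:              ", shift_conv_macs(h, w, c_in, c_out))
```

If energy does scale linearly with MACs as the summary reports, counts like these give a first-order ranking of primitives before any measurement on hardware; memory traffic for the shift step is not captured here.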