Gradient-Free Neural Network Training on the Edge
Abstract: Training neural networks is computationally heavy and energy-intensive. Many methodologies have been developed to reduce computational requirements and energy use by lowering the precision of network weights at inference time, introducing techniques such as rounding, stochastic rounding, and quantization. However, most of these techniques still require full gradient precision at training time, which makes training such models prohibitive on edge devices. This work presents a novel technique for training neural networks without needing gradients. This enables a training process where all the weights are one or two bits, without any hidden full-precision computations. We show that it is possible to train models without gradient-based optimization techniques by identifying the erroneous contributions of each neuron towards the expected classification and flipping the relevant bits using logical operations. We tested our method on several standard datasets and achieved performance comparable to corresponding gradient-based baselines with a fraction of the compute power.
DOI: 10.48550/arxiv.2410.09734
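
The abstract only outlines the bit-flipping idea, so the following is a minimal, hypothetical sketch of what such an update could look like: a single-layer classifier with ±1 weights stored as bits, where, for a misclassified example, the weights whose contribution pushes the score toward the wrong class are flipped with XOR. Everything here (the toy data, the `w_bits` representation, the per-weight error test, and the flip-all-offenders update rule) is an assumption for illustration, not the paper's actual algorithm.

```python
import numpy as np

# Illustrative sketch only: a hypothetical bit-flip update for a single
# ±1-weight classifier, not the actual method from arXiv:2410.09734.

rng = np.random.default_rng(0)

# Toy data that is realizable by a ±1-weight linear classifier.
d, n = 16, 200
X = rng.integers(0, 2, size=(n, d))          # binary inputs
w_true = 2 * rng.integers(0, 2, size=d) - 1  # hidden "teacher" weights in {-1, +1}
y = (X @ w_true > 0).astype(int)             # 0/1 labels

# Model weights stored as bits: bit 1 -> weight +1, bit 0 -> weight -1.
w_bits = rng.integers(0, 2, size=d)

def predict(x_batch, w_bits):
    w = 2 * w_bits - 1                       # map bits to {-1, +1} weights
    return (x_batch @ w > 0).astype(int)

for epoch in range(50):
    errors = np.flatnonzero(predict(X, w_bits) != y)
    if errors.size == 0:
        break
    # Take one misclassified example and mark the weights whose signed
    # contribution x_i * w_i points away from the desired class.
    i = errors[0]
    x, target_sign = X[i], 2 * y[i] - 1
    w = 2 * w_bits - 1
    wrong_direction = (x * w) != (target_sign * x)       # per-weight "erroneous contribution"
    flip_mask = (wrong_direction & x.astype(bool)).astype(w_bits.dtype)
    # Flip the offending bits with XOR -- no gradients, no full-precision weights.
    w_bits = w_bits ^ flip_mask

print("train accuracy:", (predict(X, w_bits) == y).mean())
```

A multi-layer network would additionally need a per-neuron credit-assignment rule and a more selective flipping policy than "flip every offending bit"; per the abstract, that kind of per-neuron error attribution with logical bit flips is what the paper develops.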