Momentum Auxiliary Network for Supervised Local Learning
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Online Access: | Order full text |
Abstract: | Deep neural networks conventionally employ end-to-end backpropagation for
their training process, which lacks biological credibility and triggers a
locking dilemma during network parameter updates, leading to significant GPU
memory use. Supervised local learning segments the network into multiple
local blocks updated by independent auxiliary networks. However, these methods
cannot replace end-to-end training due to lower accuracy, as gradients only
propagate within their local block, creating a lack of information exchange
between blocks. To address this issue and establish information transfer across
blocks, we propose a Momentum Auxiliary Network (MAN) that establishes a
dynamic interaction mechanism. The MAN leverages an exponential moving average
(EMA) of the parameters from adjacent local blocks to enhance information flow.
This auxiliary network, updated through EMA, helps bridge the informational gap
between blocks. Nevertheless, we observe that directly applying EMA parameters
has certain limitations due to feature discrepancies among local blocks. To
overcome this, we introduce learnable biases, further boosting performance. We
have validated our method on four image classification datasets (CIFAR-10,
STL-10, SVHN, ImageNet), attaining superior performance and substantial memory
savings. Notably, our method can reduce GPU memory usage by more than 45% on
the ImageNet dataset compared to end-to-end training, while achieving higher
performance. The Momentum Auxiliary Network thus offers a new perspective for
supervised local learning. Our code is available at:
https://github.com/JunhaoSu0/MAN. |
---|---|
DOI: | 10.48550/arxiv.2407.05623 |
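
The abstract describes an auxiliary network whose parameters follow an exponential moving average (EMA) of an adjacent local block, corrected by learnable biases. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the helper names `ema_update` and `aux_forward`, the momentum value, the element-wise "layer", and the point at which the bias is added are all assumptions made for illustration. The actual method is in the repository linked in the abstract.

```python
# Illustrative sketch only; names and details are hypothetical, not from the paper.

def ema_update(aux_weights, next_block_weights, momentum=0.99):
    """Move the auxiliary weights toward the adjacent block's weights via EMA."""
    return [
        momentum * a + (1.0 - momentum) * w
        for a, w in zip(aux_weights, next_block_weights)
    ]


def aux_forward(features, aux_weights, bias):
    """Toy auxiliary pass: element-wise scaling stands in for a real layer.

    The bias is a learnable offset (assumed here to be added to the output)
    meant to absorb the feature discrepancy between blocks.
    """
    return [w * x + b for w, x, b in zip(aux_weights, features, bias)]


# Auxiliary weights start at zero and track the next block's weights.
aux = [0.0, 0.0]
next_block = [1.0, -2.0]
for _ in range(3):
    aux = ema_update(aux, next_block, momentum=0.5)

# The bias would be trained by the local loss; fixed values are used here.
out = aux_forward([2.0, 1.0], aux, bias=[0.5, 0.25])
print(aux, out)
```

In this sketch the EMA weights receive no gradient of their own; only the bias would be trained locally, which is one plausible reading of how information from the adjacent block reaches the current one.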