Improving equilibrium propagation without weight symmetry through Jacobian homeostasis
Main authors: Axel Laborieux, Friedemann Zenke
Format: Article
Language: English
Online access: Order full text
Abstract: Equilibrium propagation (EP) is a compelling alternative to the
backpropagation of error algorithm (BP) for computing gradients of neural
networks on biological or analog neuromorphic substrates. Still, the algorithm
requires weight symmetry and infinitesimal equilibrium perturbations, i.e.,
nudges, to estimate unbiased gradients efficiently. Both requirements are
challenging to implement in physical systems. Yet, whether and how weight
asymmetry affects its applicability is unknown because, in practice, it may be
masked by biases introduced through the finite nudge. To address this question,
we study generalized EP, which can be formulated without weight symmetry, and
analytically isolate the two sources of bias. For complex-differentiable
non-symmetric networks, we show that the finite nudge does not pose a problem,
as exact derivatives can still be estimated via a Cauchy integral. In contrast,
weight asymmetry introduces bias resulting in low task performance due to poor
alignment of EP's neuronal error vectors compared to BP. To mitigate this
issue, we present a new homeostatic objective that directly penalizes
functional asymmetries of the Jacobian at the network's fixed point. This
homeostatic objective dramatically improves the network's ability to solve
complex tasks such as ImageNet 32x32. Our results lay the theoretical
groundwork for studying and mitigating the adverse effects of imperfections of
physical networks on learning algorithms that rely on the substrate's
relaxation dynamics.
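The abstract's claim that a finite nudge poses no problem for complex-differentiable networks rests on Cauchy's integral formula: for a holomorphic map g(beta) of the nudge strength beta, the derivative at beta = 0 can be recovered from evaluations at finite |beta| on a circle in the complex plane. Below is a minimal NumPy sketch of that estimator on a toy scalar function; in the paper's setting g(beta) would be the beta-nudged equilibrium of the network, and the function name and parameters here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cauchy_derivative_at_zero(g, radius=0.5, n_points=8):
    """Estimate g'(0) of a holomorphic function g from finite-size
    evaluations on a circle of the given radius, via Cauchy's formula
    g'(0) = (1 / (2*pi*i)) * contour_integral( g(beta) / beta**2 dbeta )."""
    thetas = 2.0 * np.pi * np.arange(n_points) / n_points
    betas = radius * np.exp(1j * thetas)  # finite "nudges" on a circle
    # Parametrizing beta = r * exp(i*theta) turns the contour integral
    # into a plain average over equally spaced angles.
    return np.mean(g(betas) * np.exp(-1j * thetas)) / radius

# Toy check: d/dbeta exp(2*beta) at beta = 0 is exactly 2.
print(cauchy_derivative_at_zero(lambda beta: np.exp(2.0 * beta)))
```

For entire functions, this trapezoidal rule on the circle converges spectrally in the number of sample points, which is why a handful of finite-size nudges already yields an essentially unbiased derivative estimate.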
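The homeostatic objective penalizes functional asymmetries of the state Jacobian J = df/ds evaluated at the network's fixed point. The sketch below assumes one plausible instantiation, the squared Frobenius norm of the antisymmetric part J - J^T, computed by finite differences on a toy non-symmetric recurrent network; the paper's actual estimator may differ, and all names here are hypothetical.

```python
import numpy as np

def fixed_point(f, x, s0, n_iters=200):
    """Relax the state by iterating s <- f(s, x) to (approximate) equilibrium."""
    s = s0
    for _ in range(n_iters):
        s = f(s, x)
    return s

def jacobian_asymmetry_penalty(f, x, s_star, eps=1e-5):
    """Hypothetical homeostatic loss: squared Frobenius norm of the
    antisymmetric part of the state Jacobian J = df/ds at the fixed point,
    with J formed by central finite differences for illustration."""
    n = s_star.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (f(s_star + e, x) - f(s_star - e, x)) / (2.0 * eps)
    asym = J - J.T  # functional asymmetry at equilibrium
    return 0.5 * np.sum(asym ** 2)

# Toy non-symmetric recurrent network: s* = tanh(W @ s* + U @ x).
rng = np.random.default_rng(0)
n = 8
W = 0.1 * rng.standard_normal((n, n))  # deliberately non-symmetric weights
U = 0.1 * rng.standard_normal((n, n))

def f(s, x):
    return np.tanh(W @ s + U @ x)

x = rng.standard_normal(n)
s_star = fixed_point(f, x, np.zeros(n))
print(jacobian_asymmetry_penalty(f, x, s_star))  # > 0 unless J is symmetric
```

In training, such a penalty would be minimized alongside the task loss, so that the relaxed network's Jacobian becomes functionally symmetric even though the weight matrices themselves are not.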
DOI: 10.48550/arxiv.2309.02214