On the computation of the gradient in implicit neural networks


Bibliographic Details
Published in: The Journal of Supercomputing 2024, Vol. 80 (12), p. 17247-17268
Main authors: Szekeres, Béla J., Izsák, Ferenc
Format: Article
Language: English
Description
Summary: Implicit neural networks and the related deep equilibrium models are investigated. To train these networks, the gradient of the corresponding loss function must be computed. Bypassing the implicit function theorem, we develop an explicit representation of this quantity, which leads to an easily accessible computational algorithm. The theoretical findings are also supported by numerical simulations.
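To illustrate the kind of computation the abstract describes, the following is a minimal sketch of a toy implicit (deep-equilibrium-style) layer whose output solves a fixed-point equation, with the loss gradient obtained explicitly by backpropagating through the unrolled iteration rather than via the implicit function theorem. All names (W, U, n_iter) and the unrolling strategy are illustrative assumptions, not the paper's actual algorithm; the result is checked against a finite difference.

```python
import numpy as np

# Toy implicit layer: z* solves z = tanh(W z + U x).
# Hypothetical setup for illustration only.
rng = np.random.default_rng(0)
d, p = 3, 2
W = 0.3 * rng.standard_normal((d, d))   # small weights -> contraction mapping
U = rng.standard_normal((d, p))
x = rng.standard_normal(p)
y_target = rng.standard_normal(d)

def forward(W, U, x, n_iter=50):
    """Fixed-point iteration z_{k+1} = tanh(W z_k + U x), storing all iterates."""
    z = np.zeros(d)
    zs = [z]
    for _ in range(n_iter):
        z = np.tanh(W @ z + U @ x)
        zs.append(z)
    return zs

def loss_and_grad(W, U, x, y):
    """Explicit gradient of L = 0.5 * ||z_K - y||^2 with respect to W,
    computed by backpropagating through the unrolled iteration
    (no implicit function theorem)."""
    zs = forward(W, U, x)
    zK = zs[-1]
    loss = 0.5 * np.sum((zK - y) ** 2)
    g = zK - y                                # dL/dz_K
    gW = np.zeros_like(W)
    for k in range(len(zs) - 1, 0, -1):
        pre = W @ zs[k - 1] + U @ x           # pre-activation at step k
        dz = g * (1.0 - np.tanh(pre) ** 2)    # backprop through tanh
        gW += np.outer(dz, zs[k - 1])         # accumulate dL/dW
        g = W.T @ dz                          # propagate to z_{k-1}
    return loss, gW

loss, gW = loss_and_grad(W, U, x, y_target)

# Finite-difference check of one entry of dL/dW.
eps = 1e-6
Wp = W.copy()
Wp[0, 1] += eps
loss_p = 0.5 * np.sum((forward(Wp, U, x)[-1] - y_target) ** 2)
fd = (loss_p - loss) / eps
print(abs(fd - gW[0, 1]))
```

Because the finite difference uses the same truncated iteration as the backward pass, it verifies the unrolled gradient exactly, independently of how close the final iterate is to the true fixed point.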
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-024-06117-6