Tight conditions for when the NTK approximation is valid

Bibliographic Details
Published in: arXiv.org 2023-11
Main Authors: Boix-Adsera, Enric; Littwin, Etai
Format: Article
Language: English
Subjects:
Online Access: Full text
Description
Summary: We study when the neural tangent kernel (NTK) approximation is valid for training a model with the square loss. In the lazy training setting of Chizat et al. 2019, we show that rescaling the model by a factor of \(\alpha = O(T)\) suffices for the NTK approximation to be valid until training time \(T\). Our bound is tight and improves on the previous bound of Chizat et al. 2019, which required a larger rescaling factor of \(\alpha = O(T^2)\).
ISSN: 2331-8422
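
Illustration: The role of the rescaling factor \(\alpha\) described in the summary can be seen in a small numerical sketch. The example below is not taken from the paper. It assumes a hypothetical one-parameter model \(h(w) = w + w^2/2\) initialized at \(w_0 = 0\), a target value of 1, the square loss, and the lazy-training objective \(\frac{1}{\alpha^2} R(\alpha h(w))\), integrated with a plain Euler scheme. It compares the rescaled model's output with that of its linearization around \(w_0\) at a fixed time \(T\). All names (`simulate`, `y_star`, `dt`) are illustrative assumptions.

```python
# Toy sketch of lazy training; the model, target and step size are assumptions
# chosen for illustration, not the setting analyzed in the paper.

def simulate(alpha, T, dt=1e-3, y_star=1.0):
    """Euler-integrate gradient flow on (1/alpha^2) * R(alpha * h(w)),
    with R(y) = 0.5 * (y - y_star)^2, for the model and its linearization.
    Returns |alpha*h(w(T)) - alpha*h_lin(w_lin(T))|."""
    h = lambda w: w + 0.5 * w * w   # hypothetical nonlinear model, h(0) = 0
    dh = lambda w: 1.0 + w          # its derivative
    w0 = 0.0
    w = w0        # parameter of the rescaled nonlinear model
    w_lin = w0    # parameter of the linearized (NTK-style) model
    steps = int(round(T / dt))
    for _ in range(steps):
        # Gradient of (1/alpha^2) * 0.5 * (alpha*h(w) - y_star)^2 w.r.t. w
        w -= dt * dh(w) * (alpha * h(w) - y_star) / alpha
        # Same flow for the first-order Taylor expansion of h around w0
        h_lin = h(w0) + dh(w0) * (w_lin - w0)
        w_lin -= dt * dh(w0) * (alpha * h_lin - y_star) / alpha
    h_lin = h(w0) + dh(w0) * (w_lin - w0)
    return abs(alpha * h(w) - alpha * h_lin)

T = 2.0
alphas = [1.0, 10.0, 100.0]
gaps = [simulate(a, T) for a in alphas]
for a, g in zip(alphas, gaps):
    print(f"alpha={a:g}  output gap at T={T:g}: {g:.4f}")
```

In this toy setting the gap between the two outputs at time \(T\) shrinks as \(\alpha\) grows. That matches the qualitative picture in the summary, but it is only a sanity check on one scalar example and says nothing about the tightness of the \(\alpha = O(T)\) bound.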