Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
Main authors: | , , |
---|---|
Format: | Web Resource |
Language: | eng |
Summary: | We study the optimization dynamics and generalization properties of one-hidden-layer neural networks with quadratic activation functions in the overparametrized regime, where the layer width m is larger than the input dimension d.
We consider a teacher-student scenario in which the teacher has the same structure as the student but a hidden layer of smaller width m* |
---|
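
The setup described in the summary can be illustrated with a minimal NumPy sketch. The record is truncated, so several details here are assumptions rather than the paper's own choices: the 1/m output normalization, Gaussian inputs and weights, the squared loss, and all names (`quadratic_net`, `empirical_loss`, `gradient`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def quadratic_net(W, X):
    """One-hidden-layer network with quadratic activation.

    W has shape (width, d), X has shape (n, d). The output for each input x
    is the average over hidden units of (w_k . x)^2. The 1/width
    normalization is an assumption of this sketch.
    """
    return np.mean((X @ W.T) ** 2, axis=1)

rng = np.random.default_rng(0)
d, m_star, m, n = 5, 2, 8, 200               # m_star < m and m > d (overparametrized)
W_teacher = rng.standard_normal((m_star, d))  # teacher: narrower hidden layer
W_student = rng.standard_normal((m, d))       # student: wider hidden layer
X = rng.standard_normal((n, d))               # assumed Gaussian inputs
y = quadratic_net(W_teacher, X)               # labels produced by the teacher

def empirical_loss(W):
    """Half the mean squared error between the network and the teacher labels."""
    return 0.5 * np.mean((quadratic_net(W, X) - y) ** 2)

def gradient(W):
    """Gradient of empirical_loss with respect to W, shape (width, d)."""
    pre = X @ W.T                              # pre-activations, shape (n, width)
    residual = np.mean(pre ** 2, axis=1) - y   # prediction error per sample, shape (n,)
    # d/dw_k of mean_k (w_k . x)^2 is (2 / width) * (w_k . x) * x
    return (2.0 / W.shape[0]) * ((residual[:, None] * pre).T @ X) / n
```

A plain gradient-descent step on the student would then read `W_student -= lr * gradient(W_student)` for some small, hand-chosen step size `lr`. Note that with a quadratic activation the output depends on `W` only through the d×d matrix `W.T @ W`, which is why comparing the width m with the input dimension d is natural in this model.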