Provable Guarantees for Neural Networks via Gradient Feature Learning
Saved in:
Main author(s):
Format: Article
Language: eng
Subjects:
Online access: Order full text
Abstract: Neural networks have achieved remarkable empirical performance, yet current theoretical analysis is not adequate for understanding their success: the Neural Tangent Kernel approach, for example, fails to capture their key feature learning ability, while recent analyses of feature learning are typically problem-specific. This work proposes a unified analysis framework for two-layer networks trained by gradient descent. The framework is centered around the principle of feature learning from gradients, and its effectiveness is demonstrated by applications to several prototypical problems, such as mixtures of Gaussians and parity functions. The framework also sheds light on interesting network learning phenomena such as feature learning beyond kernels and the lottery ticket hypothesis.
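To make the abstract's central idea concrete, the sketch below gives one illustrative reading of "feature learning from gradients" (it is not the paper's formal construction): for a two-layer ReLU network on a symmetric two-component Gaussian mixture, the very first gradient step on the hidden weights already points along the ground-truth feature direction. All names and settings here (`d`, `n`, `m`, the noise scale, `mu`) are assumptions chosen for illustration.

```python
# Illustrative sketch only, not the paper's formal construction: on a
# symmetric two-component Gaussian mixture x = y*mu + noise, the first
# gradient of the loss with respect to the hidden weights of a randomly
# initialized two-layer ReLU network already aligns with the direction mu.
# All settings (d, n, m, noise scale) are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 50, 5000, 100                  # input dim, samples, hidden width

mu = np.zeros(d)
mu[0] = 1.0                              # ground-truth feature direction
y = rng.choice([-1.0, 1.0], size=n)      # balanced labels
X = y[:, None] * mu + 0.5 * rng.normal(size=(n, d))  # mixture of Gaussians

W = rng.normal(scale=1.0 / np.sqrt(d), size=(m, d))  # random initialization
a = rng.choice([-1.0, 1.0], size=m) / m              # fixed output weights

H = X @ W.T                              # pre-activations, shape (n, m)
f = np.maximum(H, 0.0) @ a               # network output
r = f - y                                # residual of the squared loss

# Gradient of 0.5 * mean((f - y)^2) w.r.t. W, backpropagated through ReLU
gW = (np.outer(r, a) * (H > 0)).T @ X / n

# Alignment of each neuron's descent direction -grad with mu (||mu|| = 1)
g = -gW
cos = g @ mu / (np.linalg.norm(g, axis=1) + 1e-12)
print(f"mean |cos(-grad_j, mu)|: {np.abs(cos).mean():.3f}  "
      f"(a random direction in R^{d} would give about {1/np.sqrt(d):.3f})")
```

With these illustrative settings the measured alignment should sit well above the random-direction baseline, which is the sense in which early gradients already carry useful features that subsequent training can amplify.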
DOI: 10.48550/arxiv.2310.12408