The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
Format: Article
Language: English
Abstract: Many of the recent remarkable advances in computer vision and language models can be attributed to the success of transfer learning via the pre-training of large foundation models. However, a theoretical framework that explains this empirical success remains incomplete and is an active area of research. Flatness of the loss surface and neural collapse have recently emerged as useful pre-training metrics that shed light on the implicit biases underlying pre-training. In this paper, we explore the geometric complexity of a model's learned representations as a fundamental mechanism that relates these two concepts. We show through experiments and theory that mechanisms which affect the geometric complexity of the pre-trained network also influence its neural collapse. Furthermore, we show that this effect of geometric complexity generalizes to the neural collapse of new classes as well, thus encouraging better performance on downstream tasks, particularly in the few-shot setting.
DOI: 10.48550/arxiv.2405.15706
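
To make the abstract's two central quantities concrete, below is a minimal PyTorch sketch, not the paper's implementation, using standard definitions from the literature: geometric complexity as the mean squared Frobenius norm of the model's input-output Jacobian over a batch (a discrete Dirichlet energy, as in Dherin et al., 2022), and an NC1-style neural collapse metric as the ratio of within-class to between-class variability of the learned features (as in Papyan et al., 2020). The `model`, `x`, `labels`, and penultimate-layer slicing here are hypothetical placeholders.

```python
import torch

def geometric_complexity(model, x):
    # Mean squared Frobenius norm of the input-output Jacobian over the
    # batch: a discrete Dirichlet energy of the learned function.
    x = x.clone().requires_grad_(True)
    out = model(x)  # shape (batch, num_outputs)
    sq_norms = torch.zeros(x.shape[0], device=x.device)
    for j in range(out.shape[1]):  # one backward pass per output coordinate
        grads = torch.autograd.grad(out[:, j].sum(), x, retain_graph=True)[0]
        sq_norms += grads.flatten(1).pow(2).sum(dim=1)
    return sq_norms.mean()

def nc1_variability(features, labels):
    # NC1-style collapse metric: trace of the within-class scatter divided
    # by the trace of the between-class scatter of the features. Smaller
    # values indicate stronger within-class (neural) collapse.
    global_mean = features.mean(dim=0)
    n = features.shape[0]
    sw = features.new_zeros(())
    sb = features.new_zeros(())
    for c in labels.unique():
        fc = features[labels == c]
        mu_c = fc.mean(dim=0)
        sw = sw + (fc - mu_c).pow(2).sum()
        sb = sb + fc.shape[0] * (mu_c - global_mean).pow(2).sum()
    return (sw / n) / (sb / n)

# Hypothetical usage: a small classifier and a random batch.
model = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 10))
x = torch.randn(128, 32)
labels = torch.randint(0, 10, (128,))
print("geometric complexity:", geometric_complexity(model, x).item())
with torch.no_grad():
    feats = model[:-1](x)  # penultimate-layer representations
print("NC1 variability:", nc1_variability(feats, labels).item())
```

The abstract's claim can then be read as: interventions that lower the first quantity during pre-training also tend to lower the second, both on the pre-training classes and on new classes encountered downstream.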