Few-Shot Learning by Dimensionality Reduction in Gradient Space
Saved in:
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Order full text
Summary: Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:1043-1064 (2022). We introduce SubGD, a novel few-shot learning method based on the recent finding that stochastic gradient descent updates tend to live in a low-dimensional parameter subspace. In experimental and theoretical analyses, we show that models confined to a suitable predefined subspace generalize well for few-shot learning. A suitable subspace fulfills three criteria across the given tasks: it (a) allows the training error to be reduced by gradient flow, (b) leads to models that generalize well, and (c) can be identified by stochastic gradient descent. SubGD identifies these subspaces from an eigendecomposition of the auto-correlation matrix of update directions across different tasks. Demonstrably, we can identify low-dimensional suitable subspaces for few-shot learning of dynamical systems, which have varying properties described by one or a few parameters of the analytical system description. Such systems are ubiquitous among real-world applications in science and engineering. We experimentally corroborate the advantages of SubGD on three distinct dynamical-systems problem settings, significantly outperforming popular few-shot learning methods in terms of both sample efficiency and performance.
DOI: 10.48550/arxiv.2206.03483
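The abstract outlines the core computation: collect stochastic gradient descent update directions across training tasks, take an eigendecomposition of their auto-correlation matrix, and confine later gradient updates to the span of the dominant eigenvectors. The snippet below is a minimal, illustrative sketch of that idea only; the function names, the toy data, and the plain orthogonal projection are assumptions made here for illustration and are not the authors' released implementation.

```python
import numpy as np

def subspace_basis(update_dirs, k):
    """Top-k eigenvectors of the auto-correlation matrix of update directions.

    update_dirs: (n_updates, n_params) array, one flattened SGD update
    direction per row, collected while training on the given tasks.
    Returns an orthonormal (n_params, k) basis of the dominant subspace.
    """
    U = np.asarray(update_dirs, dtype=float)
    corr = U.T @ U / U.shape[0]              # auto-correlation matrix of updates
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    return eigvecs[:, -k:]                   # columns spanning the dominant subspace

def project(vec, basis):
    """Orthogonal projection of a flattened gradient onto the subspace."""
    return basis @ (basis.T @ vec)

# Toy usage: restrict gradient steps on a new few-shot task to the subspace.
rng = np.random.default_rng(0)
updates = rng.normal(size=(200, 50))         # stand-in for recorded update directions
basis = subspace_basis(updates, k=5)
theta = rng.normal(size=50)                  # stand-in for flattened model parameters
grad = rng.normal(size=50)                   # stand-in for a task-loss gradient
theta = theta - 0.1 * project(grad, basis)   # update confined to the subspace
```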