On the Variance of the Fisher Information for Deep Learning
Format: | Article |
Language: | English |
Abstract: | In the realm of deep learning, the Fisher information matrix (FIM) gives novel insights and useful tools to characterize the loss landscape, perform second-order optimization, and build geometric learning theories. The exact FIM is either unavailable in closed form or too expensive to compute. In practice, it is almost always estimated from empirical samples. We investigate two such estimators based on two equivalent representations of the FIM -- both unbiased and consistent. Their estimation quality is naturally gauged by their variance, given in closed form. We analyze how the parametric structure of a deep neural network can affect the variance. The meaning of this variance measure and its upper bounds are then discussed in the context of deep learning. |
DOI: | 10.48550/arxiv.2107.04205 |
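The two equivalent representations mentioned in the abstract are the expected outer product of the score and the expected negative Hessian of the log-likelihood. As a minimal sketch (not taken from the paper), the corresponding empirical estimators can be compared on a one-dimensional Gaussian model N(theta, 1), whose true Fisher information is exactly 1; the model choice and sample size here are illustrative assumptions:

```python
import numpy as np

# Toy model: x ~ N(theta, 1), so log p(x; theta) = -(x - theta)^2 / 2 + const.
# Score:   d/dtheta log p   = x - theta
# Hessian: d^2/dtheta^2 log p = -1  (constant for this model)
# True Fisher information: F = 1.

rng = np.random.default_rng(0)
theta = 0.0
x = rng.normal(theta, 1.0, size=10_000)

score = x - theta

# Estimator 1: empirical mean of squared scores (outer-product form).
fim_score = np.mean(score ** 2)

# Estimator 2: empirical mean of the negative Hessian (here constant 1).
fim_hess = np.mean(np.ones_like(x))

print(fim_score, fim_hess)
```

Both estimators are unbiased and consistent for F = 1, but their variances differ: the score-based estimate fluctuates with the sample, while the Hessian-based one is exact for this model because the Hessian does not depend on the data. This gap in estimator variance is the kind of quantity the paper characterizes in closed form for deep networks.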