Towards Task and Architecture-Independent Generalization Gap Predictors
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Can we use deep learning to predict when deep learning works? Our results
suggest the affirmative. We created a dataset by training 13,500 neural
networks with different architectures, on different variations of spiral
datasets, and using different optimization parameters. We used this dataset to
train task-independent and architecture-independent generalization gap
predictors for those neural networks. We extend Jiang et al. (2018) to also use
DNNs and RNNs and show that they outperform the linear model, obtaining
$R^2=0.965$. We also show results for architecture-independent,
task-independent, and out-of-distribution generalization gap prediction tasks.
Both DNNs and RNNs consistently and significantly outperform linear models,
with RNNs obtaining $R^2=0.584$. |
---|---|
DOI: | 10.48550/arxiv.1906.01550 |
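
The summary reports predictor quality as $R^2$ and uses a linear model as the baseline. As a rough illustration of that metric only, the sketch below fits a least-squares linear predictor to generalization gaps and scores it on held-out rows. Everything in it is an assumption made for illustration: the features and gaps are synthetic, and the names `r_squared`, `fit_linear`, and `predict_linear` are hypothetical helpers, not the paper's dataset, features, or code.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_linear(features, gaps):
    """Least-squares fit of gap ~ features @ w + b (bias as last coefficient)."""
    X = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(X, gaps, rcond=None)
    return coef

def predict_linear(coef, features):
    """Apply the fitted linear predictor to new feature rows."""
    X = np.column_stack([features, np.ones(len(features))])
    return X @ coef

# Synthetic stand-in data (assumption): each row describes one trained network,
# and the target is its generalization gap (train accuracy minus test accuracy).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
true_w = np.array([0.05, -0.02, 0.01])
gaps = features @ true_w + 0.1 + rng.normal(scale=0.001, size=200)

# Fit on the first 150 networks, evaluate R^2 on the remaining 50.
coef = fit_linear(features[:150], gaps[:150])
score = r_squared(gaps[150:], predict_linear(coef, features[150:]))
print(f"held-out R^2 = {score:.3f}")
```

An $R^2$ of 1 means the predictor explains all of the variance in the gaps, while 0 means it does no better than always predicting the mean gap.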