SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling
Saved in:
Main authors: | , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | In this paper, we present SwiftLearn, a data-efficient approach to accelerate
training of deep learning models using a subset of data samples selected during
the warm-up stages of training. This subset is selected based on an importance
criterion measured over the entire dataset during the warm-up stages, aiming to
preserve the model performance with fewer examples during the rest of training.
The proposed importance measure can be updated periodically during training, so
that every data sample has a chance to return to the training loop if its
importance increases. The model architecture is unchanged, but since the number
of data samples controls the number of forward and backward passes, training
time can be reduced by using fewer training samples in each epoch. Experimental
results on a variety of CV and NLP models, during both pretraining and
finetuning, show that model performance can be preserved while achieving a
significant training speed-up. More specifically, BERT finetuning on the GLUE
benchmark shows that almost 90% of the data can be dropped, achieving an
end-to-end average speedup of 3.36x while keeping the average accuracy drop
below 0.92%. |
---|---|
DOI: | 10.48550/arxiv.2311.15134 |
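
The abstract sketches the core procedure: score every sample during the warm-up
epochs, keep only the highest-importance fraction for subsequent epochs, and
periodically re-score the full dataset so dropped samples can re-enter training.
The snippet below is a minimal, hypothetical PyTorch sketch of that idea, not the
authors' implementation; using per-sample loss as the importance score and the
`keep_fraction` and `rescore_every` settings are illustrative assumptions.

```python
# Minimal sketch of importance-based subset selection for training, loosely
# following the idea described in the abstract. The importance score
# (per-sample loss), the keep fraction, and the re-scoring interval are
# illustrative assumptions, not values from the paper.
import torch
from torch import nn
from torch.utils.data import DataLoader, Subset, TensorDataset

torch.manual_seed(0)

# Toy dataset and model stand in for the CV/NLP workloads in the paper.
X = torch.randn(1000, 20)
y = (X.sum(dim=1) > 0).long()
full_dataset = TensorDataset(X, y)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(reduction="none")  # keep per-sample losses


def score_dataset(model, dataset, batch_size=128):
    """Assumed importance score: per-sample loss over the entire dataset."""
    model.eval()
    scores = []
    with torch.no_grad():
        for xb, yb in DataLoader(dataset, batch_size=batch_size):
            scores.append(loss_fn(model(xb), yb))
    return torch.cat(scores)


def train_one_epoch(model, dataset, batch_size=128):
    model.train()
    for xb, yb in DataLoader(dataset, batch_size=batch_size, shuffle=True):
        optimizer.zero_grad()
        loss_fn(model(xb), yb).mean().backward()
        optimizer.step()


warmup_epochs, total_epochs = 2, 10
keep_fraction, rescore_every = 0.1, 3  # illustrative choices only

# Warm-up: train on the full dataset so the importance scores are meaningful.
for _ in range(warmup_epochs):
    train_one_epoch(model, full_dataset)

subset = None
for epoch in range(warmup_epochs, total_epochs):
    # Periodically re-score every sample so dropped examples can return.
    if subset is None or (epoch - warmup_epochs) % rescore_every == 0:
        scores = score_dataset(model, full_dataset)
        k = max(1, int(keep_fraction * len(full_dataset)))
        keep_idx = torch.topk(scores, k).indices.tolist()
        subset = Subset(full_dataset, keep_idx)
    train_one_epoch(model, subset)  # fewer samples -> fewer fwd/bwd passes
```

Because only the retained subset is iterated over after warm-up, each epoch
performs proportionally fewer forward and backward passes, which is where the
speedup reported in the abstract comes from.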