TiDAL: Learning Training Dynamics for Active Learning
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Active learning (AL) aims to select the most useful samples from an
unlabeled data pool and annotate them to expand the labeled dataset under a
limited budget. In particular, uncertainty-based methods select the most
uncertain samples, which are known to be effective for improving model
performance. However, the AL literature often overlooks training dynamics (TD),
defined as the ever-changing model behavior during optimization via stochastic
gradient descent, even though other areas of the literature have shown
empirically that TD provides important clues for measuring sample uncertainty.
In this paper, we propose a novel AL method, Training Dynamics for Active
Learning (TiDAL), which leverages TD to quantify the uncertainty of unlabeled
data. Since tracking the TD of all large-scale unlabeled data is impractical,
TiDAL uses an additional prediction module that learns the TD of labeled data.
To further justify the design of TiDAL, we provide theoretical and empirical
evidence for the usefulness of leveraging TD for AL. Experimental results show
that TiDAL achieves better or comparable performance on both balanced and
imbalanced benchmark datasets compared to state-of-the-art AL methods, which
estimate data uncertainty using only static information after model
training. |
---|---|
DOI: | 10.48550/arxiv.2210.06788 |
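
The contrast the summary draws between static, post-training uncertainty and uncertainty derived from training dynamics can be illustrated with a small sketch. The summary does not say how TD is summarized or scored, so the choices here are assumptions for illustration only: TD is taken to be the per-epoch class probabilities of a sample, averaged over epochs, and uncertainty is the Shannon entropy of that average. The function names and the toy numbers are hypothetical. The sketch also tracks probabilities directly, which the summary notes is impractical for large unlabeled pools; TiDAL instead uses a prediction module trained on labeled data, which is not modeled here.

```python
import math

def average_dynamics(prob_history):
    """Average per-epoch class probabilities for one sample.

    prob_history: list of probability vectors, one per training epoch.
    Averaging is an assumed way to summarize training dynamics.
    """
    n_epochs = len(prob_history)
    n_classes = len(prob_history[0])
    return [sum(epoch[c] for epoch in prob_history) / n_epochs
            for c in range(n_classes)]

def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(histories, budget):
    """Return indices of the `budget` samples whose epoch-averaged
    predictions have the highest entropy."""
    scores = [entropy(average_dynamics(h)) for h in histories]
    ranked = sorted(range(len(histories)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]

# Toy pool (made-up numbers): 3 samples, 3 epochs, 2 classes.
histories = [
    [[0.9, 0.1], [0.95, 0.05], [0.99, 0.01]],  # confident throughout
    [[0.5, 0.5], [0.3, 0.7], [0.9, 0.1]],      # prediction flips during training
    [[0.6, 0.4], [0.7, 0.3], [0.8, 0.2]],      # converges slowly
]

# Dynamics-based choice: uses all epochs.
picked = select_most_uncertain(histories, budget=1)

# Static choice: uses only the final-epoch prediction of each sample.
picked_static = select_most_uncertain([h[-1:] for h in histories], budget=1)
```

On this toy pool the two criteria disagree: the final-epoch entropy favors the slowly converging sample, while the epoch-averaged entropy favors the sample whose prediction flipped during training, which is the kind of signal a static score cannot see.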