Online Passive-Aggressive Active Learning for Trapezoidal Data Streams

Bibliographic details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2023-10, Vol. PP (10), p. 1-15
Main authors: Liu, Yanfang; Fan, Xiaocong; Li, Wenbin; Gao, Yang
Format: Article
Language: English
Description
Summary: The idea of combining an active query strategy with the passive-aggressive (PA) update strategy in online learning is credited to the PA active (PAA) algorithm, which has proven effective in learning linear classifiers from datasets with a fixed feature space. We propose a novel family of online active learning algorithms, named PAA learning for trapezoidal data streams (PAA$_{\text{TS}}$) and multiclass PAA$_{\text{TS}}$ (MPAA$_{\text{TS}}$), together with their variants, for binary and multiclass online classification tasks on trapezoidal data streams, where the feature space may expand over time. In the context of an ever-changing feature space, we provide a theoretical analysis of the mistake bounds for both PAA$_{\text{TS}}$ and MPAA$_{\text{TS}}$. Our experiments on a wide variety of benchmark datasets confirm that combining the instance-regulated active query strategy with the PA update strategy is much more effective in learning from trapezoidal data streams. We have also compared PAA$_{\text{TS}}$ with online learning with streaming features (OL$_{\text{SF}}$), the state-of-the-art approach to learning linear classifiers from trapezoidal data streams. PAA$_{\text{TS}}$ achieved much better classification accuracy, especially on large-scale real-world data streams.
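The summary names three ingredients: a growing feature space, an active query decision, and a PA update. The sketch below is only a rough illustration of how such pieces can fit together for the binary case; it is not the authors' PAA$_{\text{TS}}$ algorithm. It assumes zero-padding of the weight vector when new features appear, a margin-based Bernoulli query rule with probability delta / (delta + |p|) as a stand-in for the paper's instance-regulated strategy, and the standard PA-I step size min(C, loss / ||x||^2). The function name `pa_active_trapezoidal` and the parameters `delta`, `C`, and `seed` are hypothetical.

```python
import numpy as np


def pa_active_trapezoidal(stream, delta=1.0, C=1.0, seed=0):
    """Hypothetical sketch, not the published PAA_TS algorithm.

    stream: iterable of (x, y) pairs with y in {-1, +1}; the length of x
    is assumed to be non-decreasing over time (trapezoidal stream).
    delta must be > 0; larger values mean more label queries.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(0)            # weights start empty and grow with the features
    queries = 0
    mistakes = 0
    for x, y in stream:
        x = np.asarray(x, dtype=float)
        if x.size > w.size:
            # Assumption: newly arrived features start with zero weight.
            w = np.concatenate([w, np.zeros(x.size - w.size)])
        p = float(w @ x)       # signed margin of the current prediction
        y_hat = 1 if p >= 0 else -1
        # Mistakes are counted with the true label for evaluation only;
        # the learner itself sees y only when it decides to query.
        mistakes += int(y_hat != y)
        # Assumed query rule: ask for the label more often when |p| is small.
        if rng.random() < delta / (delta + abs(p)):
            queries += 1
            loss = max(0.0, 1.0 - y * p)       # hinge loss
            norm_sq = float(x @ x)
            if loss > 0.0 and norm_sq > 0.0:
                tau = min(C, loss / norm_sq)   # PA-I step size
                w = w + tau * y * x
    return w, queries, mistakes


# Toy stream whose instances gain one feature at a time.
toy_stream = [([1.0], 1), ([1.0, 2.0], -1), ([0.5, 1.0, 3.0], 1)]
weights, n_queries, n_mistakes = pa_active_trapezoidal(toy_stream)
```

Because the weights for unseen features are zero, a confident prediction on the old features leads to fewer queries in this sketch, which is one simple way to trade labeling cost against accuracy; the paper's own query rule and its mistake-bound analysis may differ.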
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2022.3178880