A conditional variational autoencoder based self-transferred algorithm for imbalanced classification

Bibliographic details
Published in: Knowledge-based systems, 2021-04, Vol. 218, p. 106756, Article 106756
Main authors: Zhao, Yudi; Hao, Kuangrong; Tang, Xue-song; Chen, Lei; Wei, Bing
Format: Article
Language: English
Description
Abstract: In this paper, we propose a conditional variational autoencoder-based self-transferred (CVAE_SeTred) algorithm to solve the highly imbalanced classification problem, where training instances of the minority classes are rare. Our method is an over-sampling technique that uses variational autoencoders (VAEs) to generate training samples for the minority classes. Whereas traditional over-sampling methods rely mainly on the minority classes themselves, our approach exploits information from both the majority and minority classes and aims to transfer instructional knowledge from the majority classes to the minority classes, with the majority and minority classes treated as the self-transferred (SeTred) source and target domains, respectively. Specifically, our model comprises two encoders, one decoder, and one domain classifier, and can simultaneously conduct distribution learning, SeTred learning, image generation, and dataset rebalancing in a joint and unified framework. The proposed method can not only learn domain-invariant, multivariate Gaussian-distributed latent variables but also generate discriminative samples for the minority class according to designated labels. We verify the effectiveness of the CVAE_SeTred model on both imbalanced datasets constructed from benchmark datasets and a more challenging real-world industrial application, namely imbalanced classification of fabric defects. Experimental results indicate that our method outperforms the compared methods and can generate samples with better diversity.
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2021.106756
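
Note on the abstract: the abstract mentions Gaussian-distributed latent variables, generation according to designated labels, and dataset rebalancing. The sketch below illustrates only the generic, textbook building blocks behind those terms: the VAE reparameterization step, the KL term against a standard normal, one-hot label conditioning, and a per-class count of synthetic samples needed to balance a dataset. It is not the authors' CVAE_SeTred implementation; the two encoders, the decoder, and the domain classifier are not modeled, and all function names (`samples_needed`, `reparameterize`, `kl_to_standard_normal`, `conditional_latent`) are hypothetical.

```python
import math
import random
from collections import Counter


def samples_needed(labels):
    """Per class, how many synthetic samples would match the largest class.

    Assumption: "rebalancing" means topping every class up to the majority count.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (standard VAE sampling)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), the usual VAE regularizer."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def conditional_latent(z, label, num_classes):
    """Append a one-hot label to z so a conditional decoder knows the class."""
    one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]
    return list(z) + one_hot


if __name__ == "__main__":
    # Toy imbalanced label set: one majority class, two minority classes.
    labels = [0] * 50 + [1] * 5 + [2] * 2
    print(samples_needed(labels))

    rng = random.Random(0)
    z = reparameterize([0.0, 0.0], [0.0, 0.0], rng)
    # Decoder input for a sample that should belong to minority class 2.
    print(conditional_latent(z, 2, 3))
    print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
```

In a full over-sampling pipeline, the conditioned latent vectors would be passed to a trained decoder to produce the number of synthetic minority samples reported by `samples_needed`; that decoder is outside the scope of this sketch.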