Imbalanced Classification via Feature Dictionary-Based Minority Oversampling

Bibliographic Details
Published in:	IEEE Access, 2022, Vol. 10, pp. 34236-34245
Main authors:	Park, Minho; Song, Hwa Jeon; Kang, Dong-Oh
Format:	Article
Language:	English
Subjects:	Ablation; Computer vision; Datasets; Deep learning; Dictionaries; Feature extraction; Generative adversarial network; Generators; Image classification; Imbalanced classification; Oversampling; Predictive models; Shape; Training
Online access:	Full text

Description
Summary:	Image classification is a continuously studied field in computer vision, and related work remains active. However, prediction performance on real-world datasets is limited by the data imbalance between classes. Data augmentation through artificial sample generation for minority classes is one way to overcome this limitation. We propose an oversampling method built on a feature dictionary-based generative model. Feature dictionaries are built with a pretrained feature extractor, and the proposed generative model synthesizes artificial samples based on the dictionary. The synthesized samples serve as additional data for the minority classes, so the classifier can be fine-tuned with class-balanced training. We apply the proposed framework to fashion datasets, which have an extreme class imbalance. The experimental results demonstrate that the proposed model achieved the highest top-1 performance on various public fashion datasets. In addition, we analyze the number of samples in the dictionary and test the effectiveness of the components of the proposed model through ablation studies.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3161510
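
Illustrative sketch (not part of the catalog record): the summary describes building a per-class feature dictionary and then generating extra minority-class samples so that training is class-balanced. The snippet below is only a minimal sketch of that idea under loud assumptions. The function names `build_feature_dictionary` and `oversample_minority` are hypothetical, the toy vectors stand in for the output of a pretrained feature extractor, and Gaussian jitter around stored entries is a simple stand-in for the learned generative model that the paper proposes. This record does not specify the paper's architecture, so none of this should be read as the authors' implementation.

```python
import random
from collections import defaultdict


def build_feature_dictionary(features, labels):
    """Group feature vectors by class label: {label: [vector, ...]}."""
    dictionary = defaultdict(list)
    for vec, label in zip(features, labels):
        dictionary[label].append(list(vec))
    return dict(dictionary)


def oversample_minority(dictionary, noise_std=0.05, seed=0):
    """Pad every class up to the size of the largest class.

    Stand-in generator (assumption): pick a stored entry of the class at
    random and add Gaussian noise. The paper uses a learned generative
    model at this step instead.
    """
    rng = random.Random(seed)
    target = max(len(vectors) for vectors in dictionary.values())
    balanced = {}
    for label, vectors in dictionary.items():
        synthetic = []
        for _ in range(target - len(vectors)):
            base = rng.choice(vectors)
            synthetic.append([x + rng.gauss(0.0, noise_std) for x in base])
        # Original entries first, synthetic ones appended after them.
        balanced[label] = vectors + synthetic
    return balanced


# Toy features standing in for the output of a pretrained extractor.
features = [[0.0, 1.0], [0.1, 0.9], [0.2, 1.1], [0.0, 0.8], [5.0, 5.0]]
labels = ["majority", "majority", "majority", "majority", "minority"]

dictionary = build_feature_dictionary(features, labels)
balanced = oversample_minority(dictionary)
class_counts = {label: len(vectors) for label, vectors in balanced.items()}
print(class_counts)
```

In a fuller pipeline, the balanced set would then be used to fine-tune the classifier, which is the step the summary calls class-to-class balanced training.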