Dual Autoencoders Generative Adversarial Network for Imbalanced Classification Problem


Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, pp. 91265-91275
Main Authors: Wu, Ensen; Cui, Hongyan; Welsch, Roy E.
Format: Article
Language: English
Subjects:
Online Access: Full Text
Description
Abstract: The imbalanced classification problem has become one of the greatest issues in many fields, especially fraud detection. In fraud detection, the transaction datasets available for training are extremely imbalanced, with fraudulent transaction logs considerably underrepresented. Meanwhile, the feature information of the fraud samples with the best classification capabilities cannot be mined directly by feature learning methods because there are too few fraud samples. These factors significantly reduce the effectiveness of fraud detection models. In this paper, we propose a Dual Autoencoders Generative Adversarial Network, which can balance the majority and minority classes and learn feature representations of normal and fraudulent transactions to improve the accuracy of fraud detection. The new model first trains a Generative Adversarial Network to output sufficient mimicked fraudulent transactions for autoencoder training. Then, two autoencoders are trained on the normal and fraudulent datasets, respectively. Each sample is encoded by both autoencoders to obtain two sets of features, which are combined to form the dual autoencoding features. Finally, the model detects fraudulent transactions with a neural network trained on the augmented training set. Experimental results show that the model outperforms a set of well-known classification methods, with particularly effective improvements in sensitivity and precision.
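The dual-autoencoding step in the abstract can be sketched as follows. This is an illustrative toy only, not the authors' implementation: it uses tiny linear autoencoders trained with plain gradient descent in place of the paper's neural autoencoders, and the function names (`train_linear_autoencoder`, `dual_autoencoding_features`) are hypothetical. The key idea shown is that each sample is encoded by both the "normal" and the "fraud" autoencoder, and the two codes are concatenated into one feature vector.

```python
import numpy as np

def train_linear_autoencoder(X, n_hidden, lr=0.01, epochs=200, seed=0):
    """Fit a minimal linear autoencoder X -> X @ W -> (X @ W) @ V by
    gradient descent on reconstruction error; return the encoder W.
    (Stand-in for the paper's neural autoencoders.)"""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W = rng.normal(scale=0.1, size=(n_features, n_hidden))  # encoder
    V = rng.normal(scale=0.1, size=(n_hidden, n_features))  # decoder
    for _ in range(epochs):
        H = X @ W                      # codes
        err = H @ V - X                # reconstruction error
        grad_V = H.T @ err / len(X)
        grad_W = X.T @ (err @ V.T) / len(X)
        V -= lr * grad_V
        W -= lr * grad_W
    return W

def dual_autoencoding_features(X, W_normal, W_fraud):
    """Encode every sample with both encoders and concatenate the codes,
    forming the 'dual autoencoding features' fed to the classifier."""
    return np.hstack([X @ W_normal, X @ W_fraud])

# Toy usage: one encoder per class (in the paper, the fraud set would
# first be augmented with GAN-generated samples).
rng = np.random.default_rng(1)
X_normal = rng.normal(size=(50, 8))
X_fraud = rng.normal(loc=2.0, size=(50, 8))
W_normal = train_linear_autoencoder(X_normal, n_hidden=3)
W_fraud = train_linear_autoencoder(X_fraud, n_hidden=3, seed=1)
features = dual_autoencoding_features(X_normal, W_normal, W_fraud)
print(features.shape)  # 3 + 3 concatenated code dimensions per sample
```

In the full method, the GAN-augmented fraud set supplies enough samples to train the fraud-side autoencoder, and a downstream neural network is trained on these concatenated features.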
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2994327