Imbalanced credit card fraud detection data: A solution based on hybrid neural network and clustering-based undersampling technique
Published in: | Applied soft computing 2024-03, Vol.154, p.111368, Article 111368 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | With the rapid development of the economy, the credit card business has enjoyed sustained growth, which has also led to frequent fraud. In recent years, intelligent technologies have been applied to fraud detection, but their reliability still leaves considerable room for improvement. Most existing studies design models based only on transaction information; however, a user's background information and economic status may help identify abnormal behavior. In view of this, we extract valuable features from individual and transaction information that reflect personal background and economic status. Meanwhile, to address both fraud detection and class imbalance, we construct a novel fraud detection framework that learns user features and transaction features, using a hybrid neural network with a clustering-based undersampling technique on identity and transaction features (HNN-CUHIT). To test the performance of HNN-CUHIT in credit card fraud detection, we conduct experiments on a real dataset from a city bank, collected during the SARS-CoV-2 pandemic in 2020. For the class imbalance problem, the experimental results indicate that model performance is optimal when the ratio of normal to fraud samples is 1:1, where the F1-score is 0.0572 for HNN-CUHIT and 0.0454 for CNN with ROS. In the fraud detection experiment, HNN-CUHIT achieves the best performance with an F1-score of 0.0416, compared with 0.0360, 0.0284 and 0.0396 for LR, RF and CNN, respectively. According to the experimental results, HNN-CUHIT performs better than the other machine learning models in both handling class imbalance and detecting fraud. Our work provides a new approach to detecting credit card fraud in the finance field.
• Construct novel credit card user features from identity background and transaction information.
• Propose a hybrid neural network for fraud detection using identity and transaction features.
• Design a clustering-based undersampling method for continuous and discrete variables to solve the class imbalance problem.
• Conduct experiments on a real-world credit card dataset from a Chinese city bank, collected in 2020 during SARS-CoV-2.
• Experimental results indicate that our method is more efficient. |
---|---|
ISSN: | 1568-4946, 1872-9681 |
DOI: | 10.1016/j.asoc.2024.111368 |
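
The abstract describes clustering-based undersampling of the majority (normal) class down to a 1:1 ratio and reports results as F1-scores. The following is a minimal sketch of that general idea, not the authors' HNN-CUHIT implementation. It assumes continuous features only (the paper's method also covers discrete variables), uses a plain k-means written in NumPy as a stand-in clustering step, and all function names (`kmeans`, `cluster_undersample`, `f1_score`) and the synthetic data are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means (assumed stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct samples; requires k <= len(X).
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers

def cluster_undersample(X, y, majority=0, seed=0):
    """Keep one majority sample per cluster so the class ratio becomes 1:1."""
    maj_idx = np.flatnonzero(y == majority)
    min_idx = np.flatnonzero(y != majority)
    k = len(min_idx)  # one cluster per minority sample
    centers = kmeans(X[maj_idx], k, seed=seed)
    dist = np.linalg.norm(X[maj_idx][:, None, :] - centers[None, :, :], axis=2)
    used = np.zeros(len(maj_idx), dtype=bool)
    chosen = []
    for j in range(k):
        # Nearest majority sample to center j that has not been picked yet.
        col = np.where(used, np.inf, dist[:, j])
        i = int(col.argmin())
        used[i] = True
        chosen.append(maj_idx[i])
    keep = np.sort(np.concatenate([np.array(chosen, dtype=int), min_idx]))
    return X[keep], y[keep]

def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive (fraud) class."""
    tp = np.sum((y_true == positive) & (y_pred == positive))
    fp = np.sum((y_true != positive) & (y_pred == positive))
    fn = np.sum((y_true == positive) & (y_pred != positive))
    return 0.0 if tp == 0 else float(2 * tp / (2 * tp + fp + fn))

# Synthetic illustration only: 200 "normal" and 10 "fraud" points.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 200 + [1] * 10)
X_bal, y_bal = cluster_undersample(X, y)
print(np.bincount(y_bal))  # class counts after undersampling
```

Picking the real sample nearest to each cluster center, rather than the center itself, keeps every retained row an actual transaction; whether the paper selects representatives this way is not stated in the abstract.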