Application research of credit fraud detection based on distributed rotation deep forest

Credit fraud is a common financial crime that causes significant economic losses to financial institutions. To address this issue, researchers have proposed various fraud detection methods. Recently, research on deep forests has opened up a new path for exploring deep models beyond neural networks....

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Intelligent data analysis 2024-01, Vol.28 (4), p.1067-1091
Hauptverfasser: Chen, Hongwei, Shi, Dewei, Zhou, Xun, Zhang, Man, Liu, Luanxuan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Credit fraud is a common financial crime that causes significant economic losses to financial institutions. To address this issue, researchers have proposed various fraud detection methods. Recently, research on deep forests has opened up a new path for exploring deep models beyond neural networks. It combines the features of neural networks and ensemble learning, and has achieved good results in various fields. This paper mainly studies the application of deep forests to the field of fraud detection and proposes a distributed dense rotation deep forest algorithm (DRDF-spark) based on the improved RotBoost. The model has three main characteristics: firstly, it solves the problem of multi-granularity scanning due to the lack of spatial correlation in the data by introducing RotBoost. Secondly, Spark is used for parallel construction to improve the processing speed and efficiency of data. Thirdly, a pre-aggregation mechanism is added to the distributed algorithm to locally aggregate the statistical results of sub-forests in the same node in advance to improve communication efficiency. The experiments show that DRDF-spark performs better than deep forests and some mainstream ensemble learning algorithms on the fraud dataset in this paper, and the training speed is up to 3.53 times faster. Furthermore, if the number of nodes is further increased, the speedup ratio will continue to increase.
ISSN:1088-467X
1571-4128
DOI:10.3233/IDA-230193