Improved Aggregating and Accelerating Training Methods for Spatial Graph Neural Networks on Fraud Detection

Bibliographic details
Published in: arXiv.org 2022-02
Main authors: Zeng, Yufan; Tang, Jiashan
Format: Article
Language: English
Online access: Full text
Description
Summary: Graph neural networks (GNNs) have been widely applied in numerous fields. A recent work combines a layered structure with residual connections to extend the CAmouflage-REsistant GNN (CARE-GNN) to a deep model named Residual Layered CARE-GNN (RLC-GNN), which forms a self-correcting and incremental learning mechanism and achieves significant performance improvements on the fraud detection task. However, we identify three issues with RLC-GNN: its use of neighboring information reaches a limit, training is difficult (a problem inherent to deep models), and node features and external patterns are not considered comprehensively. In this work, we propose three approaches, one for each problem. First, we suggest measuring similarity via cosine distance so that both local features and external patterns are taken into account. Then, we combine the similarity measure module and the idea of adjacency-wise normalization with node-wise and batch-wise normalization, and propose partial neighborhood normalization methods that ease training while mitigating the excessive noise caused by high graph density. Finally, we put forward an intermediate information supplement to address the information limitation. Experiments are conducted on the Yelp and Amazon datasets, and the results show that the proposed methods effectively solve the three problems. After applying the three methods, we achieve improvements of 4.81%, 6.62% and 6.81% in recall, AUC and Macro-F1, respectively, on the Yelp dataset, and improvements of 1.65% and 0.29% in recall and AUC, respectively, on the Amazon dataset.
ISSN:2331-8422
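
Illustrative note: the summary mentions a similarity measure based on cosine distance between a node and its neighbors. The record gives no implementation details, so the following is only a minimal sketch of the generic cosine similarity and distance computation, assuming NumPy and plain feature vectors. The function names `cosine_similarity` and `cosine_distance` and the toy vectors are hypothetical and are not taken from the paper, whose actual module may differ.

```python
import numpy as np


def cosine_similarity(center, neighbors, eps=1e-12):
    """Cosine similarity between one center vector and each row of `neighbors`.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    center = np.asarray(center, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    dots = neighbors @ center  # one dot product per neighbor
    norms = np.linalg.norm(neighbors, axis=1) * np.linalg.norm(center)
    # Guard against division by zero for all-zero vectors.
    return dots / np.maximum(norms, eps)


def cosine_distance(center, neighbors):
    """Cosine distance, defined here as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(center, neighbors)


# Toy example: one center node and three neighbors in a 2-D feature space.
center = np.array([1.0, 0.0])
neighbors = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
sims = cosine_similarity(center, neighbors)
dists = cosine_distance(center, neighbors)
```

In this toy case the first neighbor points in the same direction as the center (distance 0), the second is orthogonal (distance 1), and the third points the opposite way (distance 2). Vector magnitude does not affect the result.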