SHINE: A Scalable Heterogeneous Inductive Graph Neural Network for Large Imbalanced Datasets
Published in: | IEEE Transactions on Knowledge and Data Engineering, 2024-09, Vol. 36 (9), pp. 4904-4915 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Abstract: | Research interest in machine learning (ML) for graphs has skyrocketed in recent years. However, non-Euclidean graph structures inhibit the application of traditional ML algorithms. Consequently, scholars introduced graph learning algorithms tailored to network data, such as graph neural networks (GNNs). Most GNNs are designed for homogeneous and homophilous graphs and are evaluated on small, static, and balanced datasets, which deviates from real-world conditions and industry applications. This paper introduces SHINE, a scalable heterogeneous inductive GNN for large imbalanced datasets. SHINE addresses four key challenges: scalability, network heterogeneity, inductive learning on dynamic graphs, and imbalanced node classification. SHINE comprises three core components: 1) a sampler based on nearest-neighbor (NN) search, 2) a heterogeneous GNN (HGNN) layer with a novel relationship aggregator, and 3) aggregator functions tailored to skewed class distributions. The components of SHINE are evaluated on benchmark datasets, while the integrated benefits of SHINE are demonstrated on two fraud detection datasets. |
ISSN: | 1041-4347, 1558-2191 |
DOI: | 10.1109/TKDE.2024.3381240 |
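
The abstract names a sampler based on nearest-neighbor (NN) search but does not describe how it is built. The following is only a minimal sketch of the general idea of picking a node's neighborhood by feature similarity instead of uniformly at random. It is not SHINE's sampler: the function names `cosine_similarity` and `sample_neighbors`, the use of cosine similarity, the brute-force search, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sample_neighbors(target, candidates, features, k):
    """Return up to k candidate node ids whose features are most similar to target's.

    Hypothetical helper: a brute-force nearest-neighbor search over the
    candidate list, with ties broken by node id so the result is deterministic.
    """
    scored = [
        (cosine_similarity(features[target], features[c]), c)
        for c in candidates
        if c != target  # a node is never sampled as its own neighbor
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node for _, node in scored[:k]]


# Toy node features (assumed data, not taken from the paper).
features = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.7, 0.7],
}
neighbors = sample_neighbors("a", ["b", "c", "d"], features, k=2)
print(neighbors)
```

A brute-force search like this costs time linear in the number of candidates per query; a scalable system would more plausibly rely on an approximate nearest-neighbor index, which this sketch does not attempt to model.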