SHINE: A Scalable Heterogeneous Inductive Graph Neural Network for Large Imbalanced Datasets

Bibliographic details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2024-09, Vol. 36 (9), p. 4904-4915
Main authors: Van Belle, Rafael; De Weerdt, Jochen
Format: Article
Language: English
Description
Abstract: Research interest in machine learning (ML) for graphs has skyrocketed in recent years. However, non-Euclidean graph structures inhibit the application of traditional ML algorithms. Consequently, scholars introduced graph learning algorithms tailored to network data, such as graph neural networks (GNNs). Most GNNs are designed for homogeneous and homophilous graphs and are evaluated on small, static, and balanced datasets, deviating from real-world conditions and industry applications. This paper introduces SHINE, a scalable heterogeneous inductive GNN for large imbalanced datasets. SHINE addresses four key challenges: scalability, network heterogeneity, inductive learning on dynamic graphs, and imbalanced node classification. SHINE comprises three core components: 1) a sampler based on nearest-neighbor (NN) search, 2) a heterogeneous GNN (HGNN) layer with a novel relationship aggregator, and 3) aggregator functions tailored to skewed class distributions. The components of SHINE are evaluated on benchmark datasets, while the integrated benefits of SHINE are demonstrated on two fraud detection datasets.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2024.3381240