LeapGNN: Boosting Distributed GNN Training Efficiency via Feature-Centric Model Migration
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Distributed training of graph neural networks (GNNs) has become a crucial technique for processing large graphs. Prevalent GNN frameworks are model-centric, necessitating the transfer of massive graph vertex features to GNN models, which leads to a significant communication bottleneck. Recognizing that the model size is often significantly smaller than the feature size, we propose LeapGNN, a feature-centric framework that reverses this paradigm by bringing GNN models to vertex features. To make it truly effective, we first propose a micrograph-based training strategy that trains the model using a refined structure with superior locality to reduce remote feature retrieval. Then, we devise a feature pre-gathering approach that merges multiple fetch operations into a single one to eliminate redundant feature transmissions. Finally, we employ a micrograph-based merging method that adjusts the number of micrographs for each worker to minimize kernel switches and synchronization overhead. Our experimental results demonstrate that LeapGNN achieves a performance speedup of up to 4.2x compared to the state-of-the-art method, namely P3.
DOI: 10.48550/arxiv.2409.00657
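The abstract's feature pre-gathering step, which merges multiple fetch operations into a single one to avoid redundant feature transmissions, can be illustrated with a minimal sketch. This is not the LeapGNN implementation; the helper names (`fetch_remote_features`, `pre_gather`) and the toy in-memory feature store are assumptions introduced only to show the idea of deduplicating overlapping vertex requests before a single remote round trip.

```python
import numpy as np

def fetch_remote_features(vertex_ids, feature_store):
    """Hypothetical remote fetch: one network round trip per call."""
    return {v: feature_store[v] for v in vertex_ids}

def pre_gather(fetch_requests, feature_store):
    """Merge several fetch requests into one deduplicated fetch,
    then answer each original request from the shared result."""
    # Union of all requested vertex IDs -> one round trip instead of many.
    needed = sorted({v for req in fetch_requests for v in req})
    cache = fetch_remote_features(needed, feature_store)
    # Each requester gets its slice from the locally cached features.
    return [{v: cache[v] for v in req} for req in fetch_requests]

# Toy feature store: 8 vertices with 4-dimensional features.
store = {v: np.random.rand(4) for v in range(8)}
requests = [[0, 1, 2], [1, 2, 5], [2, 5, 7]]  # overlapping vertex sets
answers = pre_gather(requests, store)
print(len(answers), [sorted(a) for a in answers])
```

In this sketch the three overlapping requests trigger one fetch of six distinct vertices rather than nine individual transfers, which is the effect the abstract attributes to pre-gathering.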