Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation
Main authors: , , ,
Format: Article
Language: eng
Subjects:
Online access: Order full text
Abstract: Deep graph neural networks (GNNs) have been shown to be expressive for
modeling graph-structured data. Nevertheless, the over-stacked architecture of
deep graph models makes it difficult to deploy and rapidly test on mobile or
embedded systems. To compress over-stacked GNNs, knowledge distillation via a
teacher-student architecture turns out to be an effective technique, where the
key step is to measure the discrepancy between teacher and student networks
with predefined distance functions. However, a single predefined distance may be
ill-suited to graphs of various structures, and the optimal distance formulation is
hard to determine. To tackle these problems, we propose a novel Adversarial
Knowledge Distillation framework for graph models named GraphAKD, which
adversarially trains a discriminator and a generator to adaptively detect and
decrease the discrepancy. Specifically, noticing that the well-captured
inter-node and inter-class correlations favor the success of deep GNNs, we
propose to criticize the inherited knowledge from node-level and class-level
views with a trainable discriminator. The discriminator distinguishes between
teacher knowledge and what the student inherits, while the student GNN works as
a generator and aims to fool the discriminator. To the best of our knowledge, GraphAKD
is the first to introduce adversarial training to knowledge distillation in
graph domains. Experiments on node-level and graph-level classification
benchmarks demonstrate that GraphAKD improves the student performance by a
large margin. The results imply that GraphAKD can precisely transfer knowledge
from a complicated teacher GNN to a compact student GNN.
DOI: 10.48550/arxiv.2205.11678
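
As a rough illustration of the adversarial distillation scheme described in the abstract, the sketch below trains a compact student GCN against a node-level discriminator that tries to tell teacher logits from student logits. This is a hypothetical, minimal re-implementation in plain PyTorch, not the authors' GraphAKD code: the synthetic toy graph, the SimpleGCN architecture, the logit-level critic, and all hyper-parameters are illustrative assumptions, and the class-level criticism mentioned in the abstract is omitted.

```python
# Minimal, assumed sketch of adversarial knowledge distillation for GNNs.
# NOT the authors' GraphAKD implementation; everything here is illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Synthetic toy graph: 100 nodes, 16 features, 4 classes (placeholder data).
N, F_IN, N_CLS = 100, 16, 4
feats = torch.randn(N, F_IN)
labels = torch.randint(0, N_CLS, (N,))
adj = (torch.rand(N, N) < 0.05).float()
adj = ((adj + adj.t() + torch.eye(N)) > 0).float()          # symmetrize + self-loops
d = adj.sum(1).pow(-0.5)
adj = d.unsqueeze(1) * adj * d.unsqueeze(0)                  # D^{-1/2} A D^{-1/2}


class SimpleGCN(nn.Module):
    """Toy GCN stack: x <- relu(A_hat @ (x W)) per layer; the last layer emits logits."""
    def __init__(self, in_dim, hid_dim, out_dim, depth):
        super().__init__()
        dims = [in_dim] + [hid_dim] * (depth - 1) + [out_dim]
        self.layers = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims[:-1], dims[1:]))

    def forward(self, adj, x):
        for i, layer in enumerate(self.layers):
            x = adj @ layer(x)
            if i < len(self.layers) - 1:
                x = F.relu(x)
        return x


teacher = SimpleGCN(F_IN, 64, N_CLS, depth=8)   # over-stacked teacher (stand-in for a pre-trained model)
student = SimpleGCN(F_IN, 16, N_CLS, depth=2)   # compact student, playing the generator
disc = nn.Sequential(nn.Linear(N_CLS, 64), nn.ReLU(), nn.Linear(64, 1))  # node-level critic

opt_s = torch.optim.Adam(student.parameters(), lr=1e-2)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-2)

for step in range(200):
    with torch.no_grad():
        t_logits = teacher(adj, feats)          # teacher knowledge (kept frozen)
    s_logits = student(adj, feats)              # student's imitation of that knowledge

    # Discriminator step: teacher outputs labeled "real", student outputs "fake".
    d_real = disc(t_logits).squeeze(-1)
    d_fake = disc(s_logits.detach()).squeeze(-1)
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Student (generator) step: fit the labels while fooling the discriminator.
    d_fake = disc(s_logits).squeeze(-1)
    loss_adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    loss_s = F.cross_entropy(s_logits, labels) + loss_adv
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```

The two optimizers alternate each step, so the learned discriminator replaces a fixed, hand-picked distance: as it gets better at separating teacher from student outputs, the adversarial term pushes the student toward whatever discrepancy remains.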