Rethinking Reconstruction-based Graph-Level Anomaly Detection: Limitations and a Simple Remedy
Format: Article
Language: English
Abstract: Graph autoencoders (Graph-AEs) learn representations of given graphs by aiming to reconstruct them accurately. A notable application of Graph-AEs is graph-level anomaly detection (GLAD), whose objective is to identify graphs with anomalous topological structures and/or node features compared to the majority of the graph population. Graph-AEs for GLAD regard a graph with a high mean reconstruction error (i.e., the mean of the errors over all node pairs and/or nodes) as anomalous. That is, these methods rest on the assumption that Graph-AEs reconstruct graphs whose characteristics resemble the majority more accurately. We, however, report non-trivial counterexamples, a phenomenon we call reconstruction flip, and highlight the limitations of existing Graph-AE-based GLAD methods. Specifically, we empirically and theoretically investigate when this assumption holds and when it fails. Through our analyses, we further argue that, while the reconstruction errors for a given graph are effective features for GLAD, leveraging multifaceted summaries of these errors, beyond just the mean, can further strengthen the features. We thus propose a novel and simple GLAD method, named MUSE. The key innovation of MUSE is taking multifaceted summaries of reconstruction errors as graph features for GLAD. This surprisingly simple method achieves state-of-the-art (SOTA) performance in GLAD, performing best overall among 14 methods across 10 datasets.
DOI: 10.48550/arxiv.2410.20366
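
The abstract states that MUSE replaces the single mean-error anomaly score with multifaceted summaries of a graph's reconstruction errors. Below is a minimal Python sketch of that idea; the specific set of summary statistics, the function name `error_summary_features`, and the downstream detector are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def error_summary_features(recon_errors: np.ndarray) -> np.ndarray:
    """Summarize per-node-pair (or per-node) reconstruction errors of a
    graph autoencoder into a fixed-length feature vector for one graph.

    Hypothetical sketch: the abstract only says MUSE uses "multifaceted
    summaries of reconstruction errors, beyond just mean"; the exact
    statistics chosen here are an assumption.
    """
    e = recon_errors.ravel()
    return np.array([
        e.mean(),       # the single summary used by prior Graph-AE GLAD methods
        e.std(),        # spread of errors (assumed additional summary)
        e.min(),        # best-reconstructed entry
        e.max(),        # worst-reconstructed entry
        np.median(e),   # robust central tendency
    ])

# Usage: build one feature vector per graph, then score graphs with any
# downstream anomaly detector fit on these features.
errors_per_graph = [np.random.rand(30), np.random.rand(50)]  # toy stand-ins
features = np.stack([error_summary_features(e) for e in errors_per_graph])
print(features.shape)  # (num_graphs, 5)
```

The design point the abstract makes is that two graphs can share the same mean error while differing sharply in, e.g., error spread or extremes, so a multi-statistic feature vector can separate anomalies that the mean alone conflates.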