EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations
Main Authors: | , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Equivariant graph neural network force fields (EGraFFs) have shown great
promise in modelling complex interactions in atomic systems by exploiting the
graphs' inherent symmetries. Recent works have led to a surge in the
development of novel architectures that incorporate equivariance-based
inductive biases alongside architectural innovations like graph transformers
and message passing to model atomic interactions. However, a thorough evaluation
of these EGraFFs when deployed for the downstream task of real-world atomistic
simulations is lacking. To this end, here we perform a systematic benchmarking
of 6 EGraFF algorithms (NequIP, Allegro, BOTNet, MACE, Equiformer, TorchMDNet),
with the aim of understanding their capabilities and limitations for realistic
atomistic simulations. In addition to our thorough evaluation and analysis on
eight existing datasets based on the benchmarking literature, we release two
new benchmark datasets and propose four new metrics and three challenging tasks.
The new datasets and tasks evaluate the performance of EGraFFs on
out-of-distribution data, in terms of different crystal structures,
temperatures, and new molecules. Interestingly, evaluation of the EGraFF models
based on dynamic simulations reveals that having a lower error on energy or
force does not guarantee stable or reliable simulation or faithful replication
of the atomic structures. Moreover, we find that no model clearly outperforms
other models on all datasets and tasks. Importantly, we show that the
performance of all the models on out-of-distribution datasets is unreliable,
pointing to the need for the development of a foundation model for force fields
that can be used in real-world simulations. In summary, this work establishes a
rigorous framework for evaluating machine learning force fields in the context
of atomic simulations and points to open research challenges within this
domain. |
---|---|
DOI: | 10.48550/arxiv.2310.02428 |