Taming Multi-Domain, -Fidelity Data: Towards Foundation Models for Atomistic Scale Simulations
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Machine learning interatomic potentials (MLIPs) are changing atomistic
simulations in chemistry and materials science. Yet, building a single,
universal MLIP -- capable of accurately modeling both molecular and crystalline
systems -- remains challenging. A central obstacle lies in integrating the
diverse datasets generated under different computational conditions. This
difficulty creates an accessibility barrier, allowing only institutions with
substantial computational resources -- those able to perform costly
recalculations to standardize data -- to contribute meaningfully to the
advancement of universal MLIPs. Here, we present Total Energy Alignment (TEA),
an approach that enables the seamless integration of heterogeneous quantum
chemical datasets with almost no redundant calculations. Using TEA, we have
trained MACE-Osaka24, the first open-source neural network potential model
based on a unified dataset covering both molecular and crystalline systems,
utilizing the MACE architecture developed by Batatia et al. This universal
model shows strong performance across diverse chemical systems, exhibiting
comparable or improved accuracy in predicting organic reaction barriers
compared to specialized models, while effectively maintaining state-of-the-art
accuracy for inorganic systems. Our method democratizes the development of
universal MLIPs, enabling researchers across academia and industry to
contribute to and benefit from high-accuracy potential energy surface models,
regardless of their computational resources. This advancement paves the way for
accelerated discovery in chemistry and materials science through genuine
foundation models for chemistry. |
---|---|
DOI: | 10.48550/arxiv.2412.13088 |