Data-efficient multi-fidelity training for high-fidelity machine learning interatomic potentials

Machine learning interatomic potentials (MLIPs) are used to estimate potential energy surfaces (PES) from ab initio calculations, providing near quantum-level accuracy with reduced computational costs. However, the high cost of assembling high-fidelity databases hampers the application of MLIPs to systems that require high chemical accuracy.

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Kim, Jaesun; Kim, Jisu; Kim, Jaehoon; Lee, Jiho; Park, Yutack; Kang, Youngho; Han, Seungwu
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Kim, Jaesun; Kim, Jisu; Kim, Jaehoon; Lee, Jiho; Park, Yutack; Kang, Youngho; Han, Seungwu
description Machine learning interatomic potentials (MLIPs) are used to estimate potential energy surfaces (PES) from ab initio calculations, providing near quantum-level accuracy with reduced computational costs. However, the high cost of assembling high-fidelity databases hampers the application of MLIPs to systems that require high chemical accuracy. Utilizing an equivariant graph neural network, we present an MLIP framework that trains on multi-fidelity databases simultaneously. This approach enables the accurate learning of high-fidelity PES with minimal high-fidelity data. We test this framework on the Li$_6$PS$_5$Cl and In$_x$Ga$_{1-x}$N systems. The computational results indicate that geometric and compositional spaces not covered by the high-fidelity meta-gradient generalized approximation (meta-GGA) database can be effectively inferred from low-fidelity GGA data, thus enhancing accuracy and molecular dynamics stability. We also develop a general-purpose MLIP that utilizes both GGA and meta-GGA data from the Materials Project, significantly enhancing MLIP performance for high-accuracy tasks such as predicting energies above hull for crystals in general. Furthermore, we demonstrate that the present multi-fidelity learning is more effective than transfer learning or $\Delta$-learning and that it can also be applied to learn higher fidelity up to the coupled-cluster level. We believe this methodology holds promise for creating highly accurate bespoke or universal MLIPs by effectively expanding the high-fidelity dataset.
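The core idea in the abstract — training one model on low- and high-fidelity labels simultaneously, so that abundant low-fidelity data teaches the shape of the PES while scarce high-fidelity data pins down the accurate level — can be illustrated with a deliberately minimal sketch. This is not the paper's equivariant graph neural network: the linear "descriptor to energy" model, the per-fidelity bias columns, and all variable names below are illustrative assumptions chosen to make the joint-fitting mechanism visible in a few lines.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared "descriptor -> energy" mapping (a stand-in for the GNN backbone).
n_feat = 5
w_true = rng.normal(size=n_feat)

# Low fidelity (GGA-like): many samples, with a systematic offset of -0.3.
X_lo = rng.normal(size=(200, n_feat))
y_lo = X_lo @ w_true - 0.3 + 0.01 * rng.normal(size=200)

# High fidelity (meta-GGA-like): only a handful of samples, no offset.
X_hi = rng.normal(size=(10, n_feat))
y_hi = X_hi @ w_true + 0.01 * rng.normal(size=10)

# Joint design matrix: shared weight columns plus one-hot fidelity biases,
# so a single fit trains on both databases simultaneously.
X = np.vstack([
    np.hstack([X_lo, np.ones((200, 1)), np.zeros((200, 1))]),
    np.hstack([X_hi, np.zeros((10, 1)), np.ones((10, 1))]),
])
y = np.concatenate([y_lo, y_hi])

theta, *_ = np.linalg.lstsq(X, y, rcond=None)
w_shared, b_lo, b_hi = theta[:n_feat], theta[n_feat], theta[n_feat + 1]

# High-fidelity predictions use the shared weights + high-fidelity bias:
# the PES shape was learned mostly from the 200 low-fidelity points.
pred_hi = X_hi @ w_shared + b_hi
rmse_hi = float(np.sqrt(np.mean((pred_hi - y_hi) ** 2)))
print(b_lo, rmse_hi)
```

The recovered low-fidelity bias lands near the injected -0.3 shift, and the high-fidelity error stays small despite only 10 high-fidelity samples, which is the data-efficiency argument the abstract makes; the paper's actual method conditions a GNN on the fidelity channel rather than using a linear bias.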
doi_str_mv 10.48550/arxiv.2409.07947
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2409.07947
language eng
recordid cdi_arxiv_primary_2409_07947
source arXiv.org
subjects Physics - Materials Science
title Data-efficient multi-fidelity training for high-fidelity machine learning interatomic potentials
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-17T18%3A30%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data-efficient%20multi-fidelity%20training%20for%20high-fidelity%20machine%20learning%20interatomic%20potentials&rft.au=Kim,%20Jaesun&rft.date=2024-09-12&rft_id=info:doi/10.48550/arxiv.2409.07947&rft_dat=%3Carxiv_GOX%3E2409_07947%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true