Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials

Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of...

Full description

Bibliographic details
Published in: The Journal of chemical physics 2020-10, Vol.153 (14), p.144106-144106
Main authors: Onat, Berk; Ortner, Christoph; Kermode, James R.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 144106
container_issue 14
container_start_page 144106
container_title The Journal of chemical physics
container_volume 153
creator Onat, Berk
Ortner, Christoph
Kermode, James R.
description Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations and over a range of material datasets. Representations investigated include atom centered symmetry functions, Chebyshev Polynomial Symmetry Functions (CHSF), smooth overlap of atomic positions, many-body tensor representation, and atomic cluster expansion. In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations and that for CHSF, there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision and, further, that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.
doi_str_mv 10.1063/5.0016005
format Article
fullrecord <record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_scitation_primary_10_1063_5_0016005</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2450664880</sourcerecordid><originalsourceid>FETCH-LOGICAL-c395t-88a9df2315925e8f1598fa9747c2256b14e6bc5d3e6261a07994143d018600743</originalsourceid><addsrcrecordid>eNp90U1LxDAQBuAgCq6rB_9BwIsKXSdpkyZHWfwCwYN6Ltk21SxtUpN0Yf-9WXdRUPA0YebJe5hB6JTAjADPr9gMgHAAtocmBITMSi5hH00AKMkkB36IjkJYQlIlLSZofNY2mGhWJq6xsg1uTL_pOKu6Tcu1WEXXmxpruzLe2TSN2OvB65BeKiYZ8Bh0g1vnca_qd2M17rTy1tg3bGzUfpcwuJi-GNWFY3TQpqJPdnWKXm9vXub32ePT3cP8-jGrc8liJoSSTUtzwiRlWrSpilbJsihrShlfkELzRc2aXHPKiYJSyoIUeQNEpBWURT5F59vcwbuPUYdY9SbUuuuU1W4MFS1YzoWghCZ69osu3ejTFr4UcF4IAUldbFXtXQhet9XgTa_8uiJQbQ5QsWp3gGQvtzbUZruob7xy_gdWQ9P-h_8mfwIKI5Tn</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2450664880</pqid></control><display><type>article</type><title>Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials</title><source>AIP Journals Complete</source><source>Alma/SFX Local Collection</source><creator>Onat, Berk ; Ortner, Christoph ; Kermode, James R.</creator><creatorcontrib>Onat, Berk ; Ortner, Christoph ; Kermode, James R.</creatorcontrib><description>Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations and over a range of material datasets. Representations investigated include atom centered symmetry functions, Chebyshev Polynomial Symmetry Functions (CHSF), smooth overlap of atomic positions, many-body tensor representation, and atomic cluster expansion. 
In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations and that for CHSF, there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision and, further, that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.</description><identifier>ISSN: 0021-9606</identifier><identifier>EISSN: 1089-7690</identifier><identifier>DOI: 10.1063/5.0016005</identifier><identifier>CODEN: JCPSA6</identifier><language>eng</language><publisher>Melville: American Institute of Physics</publisher><subject>Chebyshev approximation ; Datasets ; Functions (mathematics) ; Machine learning ; Model accuracy ; Perturbation ; Polynomials ; Regression models ; Representations ; Sensitivity ; Symmetry ; Tensors</subject><ispartof>The Journal of chemical physics, 2020-10, Vol.153 (14), p.144106-144106</ispartof><rights>Author(s)</rights><rights>2020 Author(s). 
All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c395t-88a9df2315925e8f1598fa9747c2256b14e6bc5d3e6261a07994143d018600743</citedby><cites>FETCH-LOGICAL-c395t-88a9df2315925e8f1598fa9747c2256b14e6bc5d3e6261a07994143d018600743</cites><orcidid>0000-0001-6755-6271 ; 0000-0002-5580-1978</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/jcp/article-lookup/doi/10.1063/5.0016005$$EHTML$$P50$$Gscitation$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,790,4498,27901,27902,76127</link.rule.ids></links><search><creatorcontrib>Onat, Berk</creatorcontrib><creatorcontrib>Ortner, Christoph</creatorcontrib><creatorcontrib>Kermode, James R.</creatorcontrib><title>Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials</title><title>The Journal of chemical physics</title><description>Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations and over a range of material datasets. Representations investigated include atom centered symmetry functions, Chebyshev Polynomial Symmetry Functions (CHSF), smooth overlap of atomic positions, many-body tensor representation, and atomic cluster expansion. 
In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations and that for CHSF, there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision and, further, that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.</description><subject>Chebyshev approximation</subject><subject>Datasets</subject><subject>Functions (mathematics)</subject><subject>Machine learning</subject><subject>Model accuracy</subject><subject>Perturbation</subject><subject>Polynomials</subject><subject>Regression models</subject><subject>Representations</subject><subject>Sensitivity</subject><subject>Symmetry</subject><subject>Tensors</subject><issn>0021-9606</issn><issn>1089-7690</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp90U1LxDAQBuAgCq6rB_9BwIsKXSdpkyZHWfwCwYN6Ltk21SxtUpN0Yf-9WXdRUPA0YebJe5hB6JTAjADPr9gMgHAAtocmBITMSi5hH00AKMkkB36IjkJYQlIlLSZofNY2mGhWJq6xsg1uTL_pOKu6Tcu1WEXXmxpruzLe2TSN2OvB65BeKiYZ8Bh0g1vnca_qd2M17rTy1tg3bGzUfpcwuJi-GNWFY3TQpqJPdnWKXm9vXub32ePT3cP8-jGrc8liJoSSTUtzwiRlWrSpilbJsihrShlfkELzRc2aXHPKiYJSyoIUeQNEpBWURT5F59vcwbuPUYdY9SbUuuuU1W4MFS1YzoWghCZ69osu3ejTFr4UcF4IAUldbFXtXQhet9XgTa_8uiJQbQ5QsWp3gGQvtzbUZruob7xy_gdWQ9P-h_8mfwIKI5Tn</recordid><startdate>20201014</startdate><enddate>20201014</enddate><creator>Onat, Berk</creator><creator>Ortner, Christoph</creator><creator>Kermode, James R.</creator><general>American Institute of 
Physics</general><scope>AJDQP</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8FD</scope><scope>H8D</scope><scope>L7M</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-6755-6271</orcidid><orcidid>https://orcid.org/0000-0002-5580-1978</orcidid></search><sort><creationdate>20201014</creationdate><title>Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials</title><author>Onat, Berk ; Ortner, Christoph ; Kermode, James R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c395t-88a9df2315925e8f1598fa9747c2256b14e6bc5d3e6261a07994143d018600743</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Chebyshev approximation</topic><topic>Datasets</topic><topic>Functions (mathematics)</topic><topic>Machine learning</topic><topic>Model accuracy</topic><topic>Perturbation</topic><topic>Polynomials</topic><topic>Regression models</topic><topic>Representations</topic><topic>Sensitivity</topic><topic>Symmetry</topic><topic>Tensors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Onat, Berk</creatorcontrib><creatorcontrib>Ortner, Christoph</creatorcontrib><creatorcontrib>Kermode, James R.</creatorcontrib><collection>AIP Open Access Journals</collection><collection>CrossRef</collection><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>MEDLINE - Academic</collection><jtitle>The Journal of chemical physics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Onat, Berk</au><au>Ortner, Christoph</au><au>Kermode, James R.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Sensitivity and dimensionality of atomic environment 
representations used for machine learning interatomic potentials</atitle><jtitle>The Journal of chemical physics</jtitle><date>2020-10-14</date><risdate>2020</risdate><volume>153</volume><issue>14</issue><spage>144106</spage><epage>144106</epage><pages>144106-144106</pages><issn>0021-9606</issn><eissn>1089-7690</eissn><coden>JCPSA6</coden><abstract>Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations and over a range of material datasets. Representations investigated include atom centered symmetry functions, Chebyshev Polynomial Symmetry Functions (CHSF), smooth overlap of atomic positions, many-body tensor representation, and atomic cluster expansion. In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations and that for CHSF, there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision and, further, that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.</abstract><cop>Melville</cop><pub>American Institute of Physics</pub><doi>10.1063/5.0016005</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0001-6755-6271</orcidid><orcidid>https://orcid.org/0000-0002-5580-1978</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0021-9606
ispartof The Journal of chemical physics, 2020-10, Vol.153 (14), p.144106-144106
issn 0021-9606
1089-7690
language eng
recordid cdi_scitation_primary_10_1063_5_0016005
source AIP Journals Complete; Alma/SFX Local Collection
subjects Chebyshev approximation
Datasets
Functions (mathematics)
Machine learning
Model accuracy
Perturbation
Polynomials
Regression models
Representations
Sensitivity
Symmetry
Tensors
title Sensitivity and dimensionality of atomic environment representations used for machine learning interatomic potentials
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T17%3A17%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sensitivity%20and%20dimensionality%20of%20atomic%20environment%20representations%20used%20for%20machine%20learning%20interatomic%20potentials&rft.jtitle=The%20Journal%20of%20chemical%20physics&rft.au=Onat,%20Berk&rft.date=2020-10-14&rft.volume=153&rft.issue=14&rft.spage=144106&rft.epage=144106&rft.pages=144106-144106&rft.issn=0021-9606&rft.eissn=1089-7690&rft.coden=JCPSA6&rft_id=info:doi/10.1063/5.0016005&rft_dat=%3Cproquest_scita%3E2450664880%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2450664880&rft_id=info:pmid/&rfr_iscdi=true