Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies
Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models curren...
Published in: | Chemical science (Cambridge) 2021-01, Vol.12 (3), p.1163-1175 |
---|---|
Main authors: | Jorner, Kjell; Brinck, Tore; Norrby, Per-Ola; Buttar, David |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1175 |
---|---|
container_issue | 3 |
container_start_page | 1163 |
container_title | Chemical science (Cambridge) |
container_volume | 12 |
creator | Jorner, Kjell; Brinck, Tore; Norrby, Per-Ola; Buttar, David |
description | Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
Hybrid reactivity models, combining mechanistic calculations and machine learning with descriptors, are used to predict barriers for nucleophilic aromatic substitution. |
doi_str_mv | 10.1039/d0sc04896h |
format | Article |
fullrecord | <record><control><sourceid>proquest_swepu</sourceid><recordid>TN_cdi_proquest_miscellaneous_2729530123</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2729530123</sourcerecordid><originalsourceid>FETCH-LOGICAL-c507t-7b8d117ecbd33545ab3ffe998ac41f87d4f6511cd52e64dcf9546fe9bf4d87c3</originalsourceid><addsrcrecordid>eNpdks9rFDEUx4Motqy9eFcGvIgwmt8zuQhlq1aoeLB4DZnkZSd1NlmTmar_vWm3rtYcksD38x5f3vch9JTg1wQz9cbhYjHvlRwfoGOKOWmlYOrh4U_xETop5QrXwxgRtHuMjpikSslOHqPxk7FjiNBMYHIMcdNsAeZSbzuaGMocbLNNDqbpRvMpN8baJZsZml0GF-wcUmySb-DnDnLYQpzNVJk5XJtbCSLkTYDyBD3yZipwcveu0OX7d5fr8_bi84eP69OL1grczW039I6QDuzgGBNcmIF5D0r1xnLi-85xLwUh1gkKkjvrleCyAoPnru8sW6F237b8gN0y6F31ZPIvnUzQZ-HrqU55o7_No6aKKMkr_3bPV3gLzlb_2Uz3yu4rMYx6k661ErTvawIr9PKuQU7fFyiz3oZi67xMhLQUTTuqBMOEsoq--A-9SkuOdRqa8p5ITBnuKvVqT9mcSsngD2YI1jeR6zP8ZX0b-XmFn_9r_4D-CbgCz_ZALvag_t0Z9htijrPt</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2481602307</pqid></control><display><type>article</type><title>Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies</title><source>DOAJ Directory of Open Access Journals</source><source>SWEPUB Freely available online</source><source>PubMed Central Open Access</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Jorner, Kjell ; Brinck, Tore ; Norrby, Per-Ola ; Buttar, David</creator><creatorcontrib>Jorner, Kjell ; Brinck, Tore ; Norrby, Per-Ola ; Buttar, David</creatorcontrib><description>Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. 
Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
Hybrid reactivity models, combining mechanistic calculations and machine learning with descriptors, are used to predict barriers for nucleophilic aromatic substitution.</description><identifier>ISSN: 2041-6520</identifier><identifier>ISSN: 2041-6539</identifier><identifier>EISSN: 2041-6539</identifier><identifier>DOI: 10.1039/d0sc04896h</identifier><identifier>PMID: 36299676</identifier><language>eng</language><publisher>England: Royal Society of Chemistry</publisher><subject>Chemical reactions ; Chemistry ; Density functional theory ; Gaussian process ; Machine learning ; Model testing ; Rate constants ; Regression models ; Risk assessment ; Substitution reactions ; Workflow</subject><ispartof>Chemical science (Cambridge), 2021-01, Vol.12 (3), p.1163-1175</ispartof><rights>This journal is © The Royal Society of Chemistry.</rights><rights>Copyright Royal Society of Chemistry 2021</rights><rights>This journal is © The Royal Society of Chemistry 2021 The Royal Society of Chemistry</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c507t-7b8d117ecbd33545ab3ffe998ac41f87d4f6511cd52e64dcf9546fe9bf4d87c3</citedby><cites>FETCH-LOGICAL-c507t-7b8d117ecbd33545ab3ffe998ac41f87d4f6511cd52e64dcf9546fe9bf4d87c3</cites><orcidid>0000-0002-2419-0705 ; 0000-0003-2673-075X ; 0000-0001-5466-023X ; 0000-0002-4191-6790</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9528810/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9528810/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,552,727,780,784,864,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36299676$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291964$$DView record from Swedish Publication Index$$Hfree_for_read</backlink></links><search><creatorcontrib>Jorner, Kjell</creatorcontrib><creatorcontrib>Brinck, Tore</creatorcontrib><creatorcontrib>Norrby, Per-Ola</creatorcontrib><creatorcontrib>Buttar, David</creatorcontrib><title>Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies</title><title>Chemical science (Cambridge)</title><addtitle>Chem Sci</addtitle><description>Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
Hybrid reactivity models, combining mechanistic calculations and machine learning with descriptors, are used to predict barriers for nucleophilic aromatic substitution.</description><subject>Chemical reactions</subject><subject>Chemistry</subject><subject>Density functional theory</subject><subject>Gaussian process</subject><subject>Machine learning</subject><subject>Model testing</subject><subject>Rate constants</subject><subject>Regression models</subject><subject>Risk assessment</subject><subject>Substitution reactions</subject><subject>Workflow</subject><issn>2041-6520</issn><issn>2041-6539</issn><issn>2041-6539</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>D8T</sourceid><recordid>eNpdks9rFDEUx4Motqy9eFcGvIgwmt8zuQhlq1aoeLB4DZnkZSd1NlmTmar_vWm3rtYcksD38x5f3vch9JTg1wQz9cbhYjHvlRwfoGOKOWmlYOrh4U_xETop5QrXwxgRtHuMjpikSslOHqPxk7FjiNBMYHIMcdNsAeZSbzuaGMocbLNNDqbpRvMpN8baJZsZml0GF-wcUmySb-DnDnLYQpzNVJk5XJtbCSLkTYDyBD3yZipwcveu0OX7d5fr8_bi84eP69OL1grczW039I6QDuzgGBNcmIF5D0r1xnLi-85xLwUh1gkKkjvrleCyAoPnru8sW6F237b8gN0y6F31ZPIvnUzQZ-HrqU55o7_No6aKKMkr_3bPV3gLzlb_2Uz3yu4rMYx6k661ErTvawIr9PKuQU7fFyiz3oZi67xMhLQUTTuqBMOEsoq--A-9SkuOdRqa8p5ITBnuKvVqT9mcSsngD2YI1jeR6zP8ZX0b-XmFn_9r_4D-CbgCz_ZALvag_t0Z9htijrPt</recordid><startdate>20210121</startdate><enddate>20210121</enddate><creator>Jorner, Kjell</creator><creator>Brinck, Tore</creator><creator>Norrby, Per-Ola</creator><creator>Buttar, David</creator><general>Royal Society of Chemistry</general><general>The Royal Society of 
Chemistry</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>7X8</scope><scope>5PM</scope><scope>ADTPV</scope><scope>AFDQA</scope><scope>AOWAS</scope><scope>D8T</scope><scope>D8V</scope><scope>ZZAVC</scope><orcidid>https://orcid.org/0000-0002-2419-0705</orcidid><orcidid>https://orcid.org/0000-0003-2673-075X</orcidid><orcidid>https://orcid.org/0000-0001-5466-023X</orcidid><orcidid>https://orcid.org/0000-0002-4191-6790</orcidid></search><sort><creationdate>20210121</creationdate><title>Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies</title><author>Jorner, Kjell ; Brinck, Tore ; Norrby, Per-Ola ; Buttar, David</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c507t-7b8d117ecbd33545ab3ffe998ac41f87d4f6511cd52e64dcf9546fe9bf4d87c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Chemical reactions</topic><topic>Chemistry</topic><topic>Density functional theory</topic><topic>Gaussian process</topic><topic>Machine learning</topic><topic>Model testing</topic><topic>Rate constants</topic><topic>Regression models</topic><topic>Risk assessment</topic><topic>Substitution reactions</topic><topic>Workflow</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jorner, Kjell</creatorcontrib><creatorcontrib>Brinck, Tore</creatorcontrib><creatorcontrib>Norrby, Per-Ola</creatorcontrib><creatorcontrib>Buttar, David</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant 
titles)</collection><collection>SwePub</collection><collection>SWEPUB Kungliga Tekniska Högskolan full text</collection><collection>SwePub Articles</collection><collection>SWEPUB Freely available online</collection><collection>SWEPUB Kungliga Tekniska Högskolan</collection><collection>SwePub Articles full text</collection><jtitle>Chemical science (Cambridge)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jorner, Kjell</au><au>Brinck, Tore</au><au>Norrby, Per-Ola</au><au>Buttar, David</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies</atitle><jtitle>Chemical science (Cambridge)</jtitle><addtitle>Chem Sci</addtitle><date>2021-01-21</date><risdate>2021</risdate><volume>12</volume><issue>3</issue><spage>1163</spage><epage>1175</epage><pages>1163-1175</pages><issn>2041-6520</issn><issn>2041-6539</issn><eissn>2041-6539</eissn><abstract>Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
Hybrid reactivity models, combining mechanistic calculations and machine learning with descriptors, are used to predict barriers for nucleophilic aromatic substitution.</abstract><cop>England</cop><pub>Royal Society of Chemistry</pub><pmid>36299676</pmid><doi>10.1039/d0sc04896h</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-2419-0705</orcidid><orcidid>https://orcid.org/0000-0003-2673-075X</orcidid><orcidid>https://orcid.org/0000-0001-5466-023X</orcidid><orcidid>https://orcid.org/0000-0002-4191-6790</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2041-6520 |
ispartof | Chemical science (Cambridge), 2021-01, Vol.12 (3), p.1163-1175 |
issn | 2041-6520 2041-6539 2041-6539 |
language | eng |
recordid | cdi_proquest_miscellaneous_2729530123 |
source | DOAJ Directory of Open Access Journals; SWEPUB Freely available online; PubMed Central Open Access; EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Chemical reactions; Chemistry; Density functional theory; Gaussian process; Machine learning; Model testing; Rate constants; Regression models; Risk assessment; Substitution reactions; Workflow |
title | Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T08%3A06%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_swepu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20learning%20meets%20mechanistic%20modelling%20for%20accurate%20prediction%20of%20experimental%20activation%20energies&rft.jtitle=Chemical%20science%20(Cambridge)&rft.au=Jorner,%20Kjell&rft.date=2021-01-21&rft.volume=12&rft.issue=3&rft.spage=1163&rft.epage=1175&rft.pages=1163-1175&rft.issn=2041-6520&rft.eissn=2041-6539&rft_id=info:doi/10.1039/d0sc04896h&rft_dat=%3Cproquest_swepu%3E2729530123%3C/proquest_swepu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2481602307&rft_id=info:pmid/36299676&rfr_iscdi=true |
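
The description above relates experimental rate constants to activation barriers and reports a mean absolute error of 0.77 kcal mol⁻¹. As a rough illustration of what such an error means for a rate constant, the following minimal sketch uses the standard Eyring equation. It is not taken from the article: the function names, the temperature of 298.15 K, a transmission coefficient of 1, and the example rate constant are all assumptions chosen for illustration.

```python
import math

# Physical constants (exact SI values; gas constant expressed in kcal mol^-1 K^-1)
R_KCAL = 1.987204e-3      # gas constant, kcal mol^-1 K^-1
K_B = 1.380649e-23        # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34 # Planck constant, J s


def barrier_from_rate(k, temperature=298.15):
    """Free energy barrier (kcal/mol) from a rate constant via the Eyring equation.

    Assumes k is given in s^-1 (or M^-1 s^-1 with a 1 M standard state)
    and a transmission coefficient of 1. Hypothetical helper, not from the article.
    """
    prefactor = K_B * temperature / H_PLANCK  # k_B * T / h, in s^-1
    return -R_KCAL * temperature * math.log(k / prefactor)


def rate_factor_from_error(error_kcal, temperature=298.15):
    """Multiplicative uncertainty in the rate constant implied by a barrier error."""
    return math.exp(error_kcal / (R_KCAL * temperature))


# Illustrative values only: the rate constant is made up, the error is the
# mean absolute error quoted in the description.
barrier = barrier_from_rate(1.0e-3)
factor = rate_factor_from_error(0.77)
print(f"Barrier for k = 1e-3 s^-1: {barrier:.2f} kcal/mol")
print(f"Rate-constant factor for a 0.77 kcal/mol error: {factor:.2f}")
```

Under these assumptions, an error below 1 kcal mol⁻¹ in the barrier corresponds to a rate constant that is off by less than one order of magnitude at room temperature.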