ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning
Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. Here we present ReactionPredictor, a machine...
Published in: | Journal of chemical information and modeling 2012-10, Vol.52 (10), p.2526-2540 |
---|---|
Main authors: | Kayala, Matthew A ; Baldi, Pierre |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 2540 |
---|---|
container_issue | 10 |
container_start_page | 2526 |
container_title | Journal of chemical information and modeling |
container_volume | 52 |
creator | Kayala, Matthew A ; Baldi, Pierre |
description | Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. Here we present ReactionPredictor, a machine learning approach to reaction prediction that models elementary, mechanistic reactions as interactions between approximate molecular orbitals (MOs). A training data set of productive reactions known to occur at reasonable rates and yields and verified by inclusion in the literature or textbooks is derived from an existing rule-based system and expanded upon with manual curation from graduate level textbooks. Using this training data set of complex polar, hypervalent, radical, and pericyclic reactions, a two-stage machine learning prediction framework is trained and validated. In the first stage, filtering models trained at the level of individual MOs are used to reduce the space of possible reactions to consider. In the second stage, ranking models over the filtered space of possible reactions are used to order the reactions such that the productive reactions are the top ranked. The resulting model, ReactionPredictor, perfectly ranks polar reactions 78.1% of the time and recovers all productive reactions 95.7% of the time when allowing for small numbers of errors. Pericyclic and radical reactions are perfectly ranked 85.8% and 77.0% of the time, respectively, rising to >93% recovery for both reaction types with a small number of allowed errors. Decisions about which of the polar, pericyclic, or radical reaction type ranking models to use can be made with >99% accuracy. Finally, for multistep reaction pathways, we implement the first mechanistic pathway predictor using constrained tree-search to discover a set of reasonable mechanistic steps from given reactants to given products. 
Webserver implementations of both the single step and pathway versions of ReactionPredictor are available via the chemoinformatics portal http://cdb.ics.uci.edu/. |
doi_str_mv | 10.1021/ci3003039 |
format | Article |
addtitle | J. Chem. Inf. Model |
date | 2012-10-22 |
eissn | 1549-960X |
linktohtml | https://pubs.acs.org/doi/10.1021/ci3003039 |
pmid | 22978639 |
publisher | Washington, DC: American Chemical Society |
rights | Copyright © 2012 American Chemical Society |
tpages | 15 |
fulltext | fulltext |
identifier | ISSN: 1549-9596 |
ispartof | Journal of chemical information and modeling, 2012-10, Vol.52 (10), p.2526-2540 |
issn | 1549-9596 1549-960X |
language | eng |
recordid | cdi_proquest_miscellaneous_1114698211 |
source | MEDLINE; ACS Publications |
subjects | Algorithms ; Applied sciences ; Artificial Intelligence ; Chemical reactions ; Chemistry ; Chemistry, Pharmaceutical ; Computer science; control theory; systems ; Computer Simulation ; Data processing. List processing. Character string processing ; Drug Design ; Exact sciences and technology ; Free Radicals - chemistry ; General and physical chemistry ; General. Nomenclature, chemical documentation, computer chemistry ; Hydrophobic and Hydrophilic Interactions ; Informatics ; Internet ; Memory organisation. Data processing ; Models, Chemical ; Molecules ; Organic chemistry ; Organic Chemistry Phenomena ; Software ; Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry |
title | ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T16%3A51%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ReactionPredictor:%20Prediction%20of%20Complex%20Chemical%20Reactions%20at%20the%20Mechanistic%20Level%20Using%20Machine%20Learning&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Kayala,%20Matthew%20A&rft.date=2012-10-22&rft.volume=52&rft.issue=10&rft.spage=2526&rft.epage=2540&rft.pages=2526-2540&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/ci3003039&rft_dat=%3Cproquest_cross%3E2796899951%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1114814847&rft_id=info:pmid/22978639&rfr_iscdi=true |
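
The two-stage framework described in the abstract (filter at the level of individual orbitals, then rank the remaining candidate reactions) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Orbital` class, the threshold, the orbital names, and both scoring functions are hypothetical stand-ins for the trained filtering and ranking models, and the numbers are invented for illustration only.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Orbital:
    name: str     # hypothetical label for an approximate molecular orbital
    score: float  # toy stand-in for the output of a trained filter model


Reaction = Tuple[Orbital, Orbital]  # (source orbital, sink orbital)


def filter_orbitals(orbitals: List[Orbital], threshold: float) -> List[Orbital]:
    """Stage 1: keep only orbitals the toy filter scores at or above threshold."""
    return [o for o in orbitals if o.score >= threshold]


def rank_reactions(
    sources: List[Orbital],
    sinks: List[Orbital],
    pair_score: Callable[[Orbital, Orbital], float],
) -> List[Reaction]:
    """Stage 2: score every surviving source/sink pairing and sort best first."""
    candidates = list(product(sources, sinks))
    return sorted(candidates, key=lambda r: pair_score(r[0], r[1]), reverse=True)


def predict(
    sources: List[Orbital],
    sinks: List[Orbital],
    threshold: float,
    pair_score: Callable[[Orbital, Orbital], float],
) -> List[Reaction]:
    """Run the filter stage on both orbital sets, then the ranking stage."""
    kept_sources = filter_orbitals(sources, threshold)
    kept_sinks = filter_orbitals(sinks, threshold)
    return rank_reactions(kept_sources, kept_sinks, pair_score)


def toy_pair_score(source: Orbital, sink: Orbital) -> float:
    """Hypothetical ranking function; a real system would use a learned model."""
    return source.score * sink.score


# Invented example data: two candidate sources and three candidate sinks.
sources = [Orbital("amine lone pair", 0.9), Orbital("C-H sigma bond", 0.1)]
sinks = [
    Orbital("C=O pi*", 0.8),
    Orbital("C-Cl sigma*", 0.6),
    Orbital("C-C sigma*", 0.05),
]

ranked = predict(sources, sinks, threshold=0.5, pair_score=toy_pair_score)
for source, sink in ranked:
    print(f"{source.name} -> {sink.name}")
```

Filtering first shrinks the candidate space from six pairings to two in this toy case, which is the motivation the abstract gives for the first stage: the ranking model only has to order the pairings that survive.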