ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning

Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. Here we present ReactionPredictor, a machine...

Bibliographic details
Published in: Journal of chemical information and modeling, 2012-10, Vol. 52 (10), p. 2526-2540
Main authors: Kayala, Matthew A; Baldi, Pierre
Format: Article
Language: English
Online access: Full text
container_end_page 2540
container_issue 10
container_start_page 2526
container_title Journal of chemical information and modeling
container_volume 52
creator Kayala, Matthew A
Baldi, Pierre
description Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. Here we present ReactionPredictor, a machine learning approach to reaction prediction that models elementary, mechanistic reactions as interactions between approximate molecular orbitals (MOs). A training data set of productive reactions known to occur at reasonable rates and yields and verified by inclusion in the literature or textbooks is derived from an existing rule-based system and expanded upon with manual curation from graduate level textbooks. Using this training data set of complex polar, hypervalent, radical, and pericyclic reactions, a two-stage machine learning prediction framework is trained and validated. In the first stage, filtering models trained at the level of individual MOs are used to reduce the space of possible reactions to consider. In the second stage, ranking models over the filtered space of possible reactions are used to order the reactions such that the productive reactions are the top ranked. The resulting model, ReactionPredictor, perfectly ranks polar reactions 78.1% of the time and recovers all productive reactions 95.7% of the time when allowing for small numbers of errors. Pericyclic and radical reactions are perfectly ranked 85.8% and 77.0% of the time, respectively, rising to >93% recovery for both reaction types with a small number of allowed errors. Decisions about which of the polar, pericyclic, or radical reaction type ranking models to use can be made with >99% accuracy. Finally, for multistep reaction pathways, we implement the first mechanistic pathway predictor using constrained tree-search to discover a set of reasonable mechanistic steps from given reactants to given products. 
Webserver implementations of both the single step and pathway versions of ReactionPredictor are available via the chemoinformatics portal http://cdb.ics.uci.edu/.
doi_str_mv 10.1021/ci3003039
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1114698211</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2796899951</sourcerecordid><originalsourceid>FETCH-LOGICAL-a476t-ae0f8a5955098946bfec46c1e98062e49813d68bf1463ef201273016ddc538f73</originalsourceid><addsrcrecordid>eNplkdtKAzEQhoMotlYvfAEJiKAX1WSzm028k-IJWhSx4N2SZic2dQ812Yq-vak9WBQGZvj55p9hBqFDSs4pieiFtowQRpjcQm2axLIrOXnZXtWJ5C205_0kMEzyaBe1okimgjPZRm9PoHRj6-rRQW51U7tLvCyDiGuDe3U5LeAT98ZQWq0KvOrwWDW4GQMegB6ryvrGatyHDyjw0NvqFQ-UHtsKgqZcFYR9tGNU4eFgmTtoeHP93Lvr9h9u73tX_a6KU950FRAjVCKThEghYz4yoGOuKUhBeASxFJTlXIwMjTkDExEapYxQnuc6YcKkrINOF75TV7_PwDdZab2GolAV1DOfURo6pYgoDejxH3RSz1wVtvuhRIh4bni2oLSrvXdgsqmzpXJfGSXZ_APZ-gOBPVo6zkYl5GtydfIAnCwB5cM5jVOVtv6X4wnlNN3glPYbW_0b-A0g6JfN</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1114814847</pqid></control><display><type>article</type><title>ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning</title><source>MEDLINE</source><source>ACS Publications</source><creator>Kayala, Matthew A ; Baldi, Pierre</creator><creatorcontrib>Kayala, Matthew A ; Baldi, Pierre</creatorcontrib><description>Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. Here we present ReactionPredictor, a machine learning approach to reaction prediction that models elementary, mechanistic reactions as interactions between approximate molecular orbitals (MOs). 
A training data set of productive reactions known to occur at reasonable rates and yields and verified by inclusion in the literature or textbooks is derived from an existing rule-based system and expanded upon with manual curation from graduate level textbooks. Using this training data set of complex polar, hypervalent, radical, and pericyclic reactions, a two-stage machine learning prediction framework is trained and validated. In the first stage, filtering models trained at the level of individual MOs are used to reduce the space of possible reactions to consider. In the second stage, ranking models over the filtered space of possible reactions are used to order the reactions such that the productive reactions are the top ranked. The resulting model, ReactionPredictor, perfectly ranks polar reactions 78.1% of the time and recovers all productive reactions 95.7% of the time when allowing for small numbers of errors. Pericyclic and radical reactions are perfectly ranked 85.8% and 77.0% of the time, respectively, rising to &gt;93% recovery for both reaction types with a small number of allowed errors. Decisions about which of the polar, pericyclic, or radical reaction type ranking models to use can be made with &gt;99% accuracy. Finally, for multistep reaction pathways, we implement the first mechanistic pathway predictor using constrained tree-search to discover a set of reasonable mechanistic steps from given reactants to given products. 
Webserver implementations of both the single step and pathway versions of ReactionPredictor are available via the chemoinformatics portal http://cdb.ics.uci.edu/.</description><identifier>ISSN: 1549-9596</identifier><identifier>EISSN: 1549-960X</identifier><identifier>DOI: 10.1021/ci3003039</identifier><identifier>PMID: 22978639</identifier><language>eng</language><publisher>Washington, DC: American Chemical Society</publisher><subject>Algorithms ; Applied sciences ; Artificial Intelligence ; Chemical reactions ; Chemistry ; Chemistry, Pharmaceutical ; Computer science; control theory; systems ; Computer Simulation ; Data processing. List processing. Character string processing ; Drug Design ; Exact sciences and technology ; Free Radicals - chemistry ; General and physical chemistry ; General. Nomenclature, chemical documentation, computer chemistry ; Hydrophobic and Hydrophilic Interactions ; Informatics ; Internet ; Memory organisation. Data processing ; Models, Chemical ; Molecules ; Organic chemistry ; Organic Chemistry Phenomena ; Software ; Theory of reactions, general kinetics. Catalysis. 
Nomenclature, chemical documentation, computer chemistry</subject><ispartof>Journal of chemical information and modeling, 2012-10, Vol.52 (10), p.2526-2540</ispartof><rights>Copyright © 2012 American Chemical Society</rights><rights>2014 INIST-CNRS</rights><rights>Copyright American Chemical Society Oct 22, 2012</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a476t-ae0f8a5955098946bfec46c1e98062e49813d68bf1463ef201273016ddc538f73</citedby><cites>FETCH-LOGICAL-a476t-ae0f8a5955098946bfec46c1e98062e49813d68bf1463ef201273016ddc538f73</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/ci3003039$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/ci3003039$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>314,780,784,2763,27074,27922,27923,56736,56786</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=26516179$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/22978639$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Kayala, Matthew A</creatorcontrib><creatorcontrib>Baldi, Pierre</creatorcontrib><title>ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning</title><title>Journal of chemical information and modeling</title><addtitle>J. Chem. Inf. Model</addtitle><description>Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. 
Here we present ReactionPredictor, a machine learning approach to reaction prediction that models elementary, mechanistic reactions as interactions between approximate molecular orbitals (MOs). A training data set of productive reactions known to occur at reasonable rates and yields and verified by inclusion in the literature or textbooks is derived from an existing rule-based system and expanded upon with manual curation from graduate level textbooks. Using this training data set of complex polar, hypervalent, radical, and pericyclic reactions, a two-stage machine learning prediction framework is trained and validated. In the first stage, filtering models trained at the level of individual MOs are used to reduce the space of possible reactions to consider. In the second stage, ranking models over the filtered space of possible reactions are used to order the reactions such that the productive reactions are the top ranked. The resulting model, ReactionPredictor, perfectly ranks polar reactions 78.1% of the time and recovers all productive reactions 95.7% of the time when allowing for small numbers of errors. Pericyclic and radical reactions are perfectly ranked 85.8% and 77.0% of the time, respectively, rising to &gt;93% recovery for both reaction types with a small number of allowed errors. Decisions about which of the polar, pericyclic, or radical reaction type ranking models to use can be made with &gt;99% accuracy. Finally, for multistep reaction pathways, we implement the first mechanistic pathway predictor using constrained tree-search to discover a set of reasonable mechanistic steps from given reactants to given products. 
Webserver implementations of both the single step and pathway versions of ReactionPredictor are available via the chemoinformatics portal http://cdb.ics.uci.edu/.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Artificial Intelligence</subject><subject>Chemical reactions</subject><subject>Chemistry</subject><subject>Chemistry, Pharmaceutical</subject><subject>Computer science; control theory; systems</subject><subject>Computer Simulation</subject><subject>Data processing. List processing. Character string processing</subject><subject>Drug Design</subject><subject>Exact sciences and technology</subject><subject>Free Radicals - chemistry</subject><subject>General and physical chemistry</subject><subject>General. Nomenclature, chemical documentation, computer chemistry</subject><subject>Hydrophobic and Hydrophilic Interactions</subject><subject>Informatics</subject><subject>Internet</subject><subject>Memory organisation. Data processing</subject><subject>Models, Chemical</subject><subject>Molecules</subject><subject>Organic chemistry</subject><subject>Organic Chemistry Phenomena</subject><subject>Software</subject><subject>Theory of reactions, general kinetics. Catalysis. 
Nomenclature, chemical documentation, computer chemistry</subject><issn>1549-9596</issn><issn>1549-960X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNplkdtKAzEQhoMotlYvfAEJiKAX1WSzm028k-IJWhSx4N2SZic2dQ812Yq-vak9WBQGZvj55p9hBqFDSs4pieiFtowQRpjcQm2axLIrOXnZXtWJ5C205_0kMEzyaBe1okimgjPZRm9PoHRj6-rRQW51U7tLvCyDiGuDe3U5LeAT98ZQWq0KvOrwWDW4GQMegB6ryvrGatyHDyjw0NvqFQ-UHtsKgqZcFYR9tGNU4eFgmTtoeHP93Lvr9h9u73tX_a6KU950FRAjVCKThEghYz4yoGOuKUhBeASxFJTlXIwMjTkDExEapYxQnuc6YcKkrINOF75TV7_PwDdZab2GolAV1DOfURo6pYgoDejxH3RSz1wVtvuhRIh4bni2oLSrvXdgsqmzpXJfGSXZ_APZ-gOBPVo6zkYl5GtydfIAnCwB5cM5jVOVtv6X4wnlNN3glPYbW_0b-A0g6JfN</recordid><startdate>20121022</startdate><enddate>20121022</enddate><creator>Kayala, Matthew A</creator><creator>Baldi, Pierre</creator><general>American Chemical Society</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SR</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20121022</creationdate><title>ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning</title><author>Kayala, Matthew A ; Baldi, Pierre</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a476t-ae0f8a5955098946bfec46c1e98062e49813d68bf1463ef201273016ddc538f73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Artificial Intelligence</topic><topic>Chemical reactions</topic><topic>Chemistry</topic><topic>Chemistry, Pharmaceutical</topic><topic>Computer science; control 
theory; systems</topic><topic>Computer Simulation</topic><topic>Data processing. List processing. Character string processing</topic><topic>Drug Design</topic><topic>Exact sciences and technology</topic><topic>Free Radicals - chemistry</topic><topic>General and physical chemistry</topic><topic>General. Nomenclature, chemical documentation, computer chemistry</topic><topic>Hydrophobic and Hydrophilic Interactions</topic><topic>Informatics</topic><topic>Internet</topic><topic>Memory organisation. Data processing</topic><topic>Models, Chemical</topic><topic>Molecules</topic><topic>Organic chemistry</topic><topic>Organic Chemistry Phenomena</topic><topic>Software</topic><topic>Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kayala, Matthew A</creatorcontrib><creatorcontrib>Baldi, Pierre</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of chemical information and modeling</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kayala, Matthew A</au><au>Baldi, Pierre</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning</atitle><jtitle>Journal of chemical information and modeling</jtitle><addtitle>J. Chem. Inf. Model</addtitle><date>2012-10-22</date><risdate>2012</risdate><volume>52</volume><issue>10</issue><spage>2526</spage><epage>2540</epage><pages>2526-2540</pages><issn>1549-9596</issn><eissn>1549-960X</eissn><abstract>Proposing reasonable mechanisms and predicting the course of chemical reactions is important to the practice of organic chemistry. Approaches to reaction prediction have historically used obfuscating representations and manually encoded patterns or rules. Here we present ReactionPredictor, a machine learning approach to reaction prediction that models elementary, mechanistic reactions as interactions between approximate molecular orbitals (MOs). A training data set of productive reactions known to occur at reasonable rates and yields and verified by inclusion in the literature or textbooks is derived from an existing rule-based system and expanded upon with manual curation from graduate level textbooks. Using this training data set of complex polar, hypervalent, radical, and pericyclic reactions, a two-stage machine learning prediction framework is trained and validated. In the first stage, filtering models trained at the level of individual MOs are used to reduce the space of possible reactions to consider. In the second stage, ranking models over the filtered space of possible reactions are used to order the reactions such that the productive reactions are the top ranked. The resulting model, ReactionPredictor, perfectly ranks polar reactions 78.1% of the time and recovers all productive reactions 95.7% of the time when allowing for small numbers of errors. 
Pericyclic and radical reactions are perfectly ranked 85.8% and 77.0% of the time, respectively, rising to &gt;93% recovery for both reaction types with a small number of allowed errors. Decisions about which of the polar, pericyclic, or radical reaction type ranking models to use can be made with &gt;99% accuracy. Finally, for multistep reaction pathways, we implement the first mechanistic pathway predictor using constrained tree-search to discover a set of reasonable mechanistic steps from given reactants to given products. Webserver implementations of both the single step and pathway versions of ReactionPredictor are available via the chemoinformatics portal http://cdb.ics.uci.edu/.</abstract><cop>Washington, DC</cop><pub>American Chemical Society</pub><pmid>22978639</pmid><doi>10.1021/ci3003039</doi><tpages>15</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1549-9596
ispartof Journal of chemical information and modeling, 2012-10, Vol.52 (10), p.2526-2540
issn 1549-9596
1549-960X
language eng
recordid cdi_proquest_miscellaneous_1114698211
source MEDLINE; ACS Publications
subjects Algorithms
Applied sciences
Artificial Intelligence
Chemical reactions
Chemistry
Chemistry, Pharmaceutical
Computer science; control theory; systems
Computer Simulation
Data processing. List processing. Character string processing
Drug Design
Exact sciences and technology
Free Radicals - chemistry
General and physical chemistry
General. Nomenclature, chemical documentation, computer chemistry
Hydrophobic and Hydrophilic Interactions
Informatics
Internet
Memory organisation. Data processing
Models, Chemical
Molecules
Organic chemistry
Organic Chemistry Phenomena
Software
Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry
title ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T16%3A51%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ReactionPredictor:%20Prediction%20of%20Complex%20Chemical%20Reactions%20at%20the%20Mechanistic%20Level%20Using%20Machine%20Learning&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Kayala,%20Matthew%20A&rft.date=2012-10-22&rft.volume=52&rft.issue=10&rft.spage=2526&rft.epage=2540&rft.pages=2526-2540&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/ci3003039&rft_dat=%3Cproquest_cross%3E2796899951%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1114814847&rft_id=info:pmid/22978639&rfr_iscdi=true
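
The two-stage scheme summarized in the description field (filter at the level of individual MOs, then rank the remaining candidate reactions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `predict_reactions`, the donor/acceptor naming, the threshold, and the toy score tables are all assumptions made for illustration, and the lambdas stand in for trained filtering and ranking models.

```python
from itertools import product

def predict_reactions(donors, acceptors, donor_filter, acceptor_filter,
                      rank_score, threshold=0.5):
    """Filter orbitals individually, then rank the surviving pairs.

    All names here are hypothetical; the score callables stand in for
    trained models and are not taken from the paper.
    """
    # Stage 1: discard orbitals whose individual score is below the threshold.
    kept_donors = [d for d in donors if donor_filter(d) >= threshold]
    kept_acceptors = [a for a in acceptors if acceptor_filter(a) >= threshold]

    # Stage 2: score every remaining pair and sort, best candidate first.
    candidates = product(kept_donors, kept_acceptors)
    return sorted(candidates, key=lambda pair: rank_score(*pair), reverse=True)

# Toy stand-ins for learned scores (invented values, for illustration only).
DONOR_SCORES = {"a": 0.9, "b": 0.2, "c": 0.7}
ACCEPTOR_SCORES = {"x": 0.8, "y": 0.1}

ranked = predict_reactions(
    donors=list(DONOR_SCORES),
    acceptors=list(ACCEPTOR_SCORES),
    donor_filter=DONOR_SCORES.get,
    acceptor_filter=ACCEPTOR_SCORES.get,
    rank_score=lambda d, a: DONOR_SCORES[d] * ACCEPTOR_SCORES[a],
)
print(ranked)
```

Filtering first keeps the number of pairs passed to the ranking step small, which is the stated purpose of the first stage in the abstract.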