Predicting the Feasibility of Copper(I)-Catalyzed Alkyne–Azide Cycloaddition Reactions Using a Recurrent Neural Network with a Self-Attention Mechanism

The copper(I)-catalyzed alkyne–azide cycloaddition (CuAAC) reaction, a major click chemistry reaction, is widely employed in drug discovery and chemical biology. However, the success rate of the CuAAC reaction is not as satisfactory as expected, and in order to improve its performance, we developed a...

Bibliographic details
Published in: Journal of chemical information and modeling 2020-03, Vol.60 (3), p.1165-1174
Main authors: Su, Shimin; Yang, Yuyao; Gan, Hanlin; Zheng, Shuangjia; Gu, Fenglong; Zhao, Chao; Xu, Jun
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1174
container_issue 3
container_start_page 1165
container_title Journal of chemical information and modeling
container_volume 60
creator Su, Shimin
Yang, Yuyao
Gan, Hanlin
Zheng, Shuangjia
Gu, Fenglong
Zhao, Chao
Xu, Jun
description The copper(I)-catalyzed alkyne–azide cycloaddition (CuAAC) reaction, a major click chemistry reaction, is widely employed in drug discovery and chemical biology. However, the success rate of the CuAAC reaction is not as satisfactory as expected, and in order to improve its performance, we developed a recurrent neural network (RNN) model to predict its feasibility. First, we designed and synthesized a structurally diverse library of 700 compounds with the CuAAC reaction to obtain experimental data. Then, using reaction SMILES as input, we generated a bidirectional long short-term memory with a self-attention mechanism (BiLSTM-SA) model. Our best prediction model has a total accuracy of 80%. With the self-attention mechanism, adverse substructures responsible for negative reactions were recognized and derived as quantitative descriptors. Density functional theory investigations were conducted to provide evidence for the correlation between bromo-α-C hybrid types and the success rate of the reaction. Quantitative descriptors combined with RDKit descriptors were fed to three machine learning models (a support vector machine, random forest, and logistic regression) and resulted in improved performance. The BiLSTM-SA model for predicting the feasibility of the CuAAC reaction is superior to other conventional learning methods and advances heuristic chemical rules.
doi_str_mv 10.1021/acs.jcim.9b00929
format Article
addtitle J. Chem. Inf. Model
date 2020-03-23
eissn 1549-960X
orcidid https://orcid.org/0000-0002-4628-2440
https://orcid.org/0000-0002-1075-0337
https://orcid.org/0000-0001-5356-0157
https://orcid.org/0000-0001-9747-4285
pmid 32013419
publisher United States: American Chemical Society
rights Copyright American Chemical Society Mar 23, 2020
tpages 10
fulltext fulltext
identifier ISSN: 1549-9596
ispartof Journal of chemical information and modeling, 2020-03, Vol.60 (3), p.1165-1174
issn 1549-9596
1549-960X
language eng
recordid cdi_proquest_miscellaneous_2350907114
source ACS Publications
subjects Alkynes
Chemical reactions
Chemical synthesis
Copper
Cycloaddition
Density functional theory
Feasibility
Heuristic methods
Machine learning
Model accuracy
Neural networks
Prediction models
Recurrent neural networks
Substructures
Support vector machines
title Predicting the Feasibility of Copper(I)-Catalyzed Alkyne–Azide Cycloaddition Reactions Using a Recurrent Neural Network with a Self-Attention Mechanism
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T01%3A17%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20the%20Feasibility%20of%20Copper(I)-Catalyzed%20Alkyne%E2%80%93Azide%20Cycloaddition%20Reactions%20Using%20a%20Recurrent%20Neural%20Network%20with%20a%20Self-Attention%20Mechanism&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Su,%20Shimin&rft.date=2020-03-23&rft.volume=60&rft.issue=3&rft.spage=1165&rft.epage=1174&rft.pages=1165-1174&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/acs.jcim.9b00929&rft_dat=%3Cproquest_cross%3E2350907114%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2390665750&rft_id=info:pmid/32013419&rfr_iscdi=true
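
The abstract in this record describes feeding reaction SMILES to a BiLSTM with a self-attention mechanism. As a rough illustration only, the following Python sketch shows two ingredients such a pipeline might use: splitting a reaction SMILES string into tokens, and turning per-token scores into attention weights that pool per-token vectors. It is not the authors' code. The regular expression, the function names `tokenize_reaction_smiles` and `attention_pool`, and the example reaction (benzyl azide with phenylacetylene) are assumptions made for this sketch; in a trained model the scores would come from learned parameters rather than from the caller.

```python
import math
import re

# Assumed tokenization pattern (not taken from the paper): bracket atoms,
# two-letter halogens, the reaction arrow, two-digit ring closures,
# single-letter atoms, ring digits, and bond/branch symbols.
_TOKEN_RE = re.compile(
    r"\[[^\]]+\]|Br|Cl|>>|%\d{2}|[A-Za-z]|\d|[=#\-+\\/().@:~*$>]"
)


def tokenize_reaction_smiles(smiles):
    """Split a reaction SMILES string into a list of tokens."""
    tokens = _TOKEN_RE.findall(smiles)
    # Refuse input containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError("unrecognized characters in SMILES: %r" % smiles)
    return tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, scores):
    """Weight each per-token vector by softmax(scores) and sum them.

    hidden_states: one vector (list of floats) per token.
    scores: one unnormalized score per token (hypothetical inputs here).
    Returns (weights, pooled_vector).
    """
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Illustrative reaction, not drawn from the paper's data set.
example = "[N-]=[N+]=NCc1ccccc1.C#Cc1ccccc1>>c1ccc(cc1)Cn1cc(nn1)-c1ccccc1"
example_tokens = tokenize_reaction_smiles(example)
```

Tokens that receive large attention weights are the ones a model of this kind would flag as influential, which is the general idea behind reading substructures off the attention weights.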