Cocrystal virtual screening based on the XGBoost machine learning model

Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of...

Bibliographic details
Published in: Chinese chemical letters 2023-08, Vol.34 (8), p.107964-403, Article 107964
Main authors: Yang, Dezhi, Wang, Li, Yuan, Penghui, An, Qi, Su, Bin, Yu, Mingchao, Chen, Ting, Hu, Kun, Zhang, Li, Lu, Yang, Du, Guanhua
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 403
container_issue 8
container_start_page 107964
container_title Chinese chemical letters
container_volume 34
creator Yang, Dezhi
Wang, Li
Yuan, Penghui
An, Qi
Su, Bin
Yu, Mingchao
Chen, Ting
Hu, Kun
Zhang, Li
Lu, Yang
Du, Guanhua
description Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of developing and applying algorithms hinders its wide application. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model of the scikit-learn package. The simplified molecular input line entry specification (SMILES) information of two compounds is simply inputted to determine whether a co-crystal can be formed. The data set includes the co-crystal records presented in the Cambridge Structural Database (CSD) and the records of no co-crystal formation from extant literature and experiments. RDKit molecular descriptors are adopted as the features of a compound in the data set. The developed model shows excellent performance in the proposed co-crystal training and validation sets with high accuracy, sensitivity, and F1 score. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately.
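The workflow outlined in the description (two SMILES strings in, RDKit descriptors as features, an XGBoost classifier out) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function names `rdkit_descriptors`, `pair_features`, and `train_model`, the simple concatenation of the two descriptor vectors, and the hyperparameter values are all assumptions that do not come from the record.

```python
from typing import Callable, List, Sequence, Tuple

Describe = Callable[[str], List[float]]


def rdkit_descriptors(smiles: str) -> List[float]:
    """Return RDKit's built-in descriptor values for one molecule.

    RDKit is imported lazily so the rest of the sketch can be used
    (and tested) with a different descriptor function.
    """
    from rdkit import Chem
    from rdkit.Chem import Descriptors

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # Descriptors.descList is a list of (name, function) pairs.
    return [float(fn(mol)) for _name, fn in Descriptors.descList]


def pair_features(
    smiles_a: str, smiles_b: str, describe: Describe = rdkit_descriptors
) -> List[float]:
    """Feature vector for a candidate pair.

    Assumption: the two descriptor vectors are simply concatenated.
    """
    return describe(smiles_a) + describe(smiles_b)


def train_model(
    pairs: Sequence[Tuple[str, str]],
    labels: Sequence[int],
    describe: Describe = rdkit_descriptors,
):
    """Fit a binary classifier: 1 = co-crystal reported, 0 = none formed."""
    import numpy as np
    from xgboost import XGBClassifier

    X = np.asarray([pair_features(a, b, describe) for a, b in pairs], dtype=float)
    # Treat infinite descriptor values as missing; XGBoost handles NaN natively.
    X[~np.isfinite(X)] = np.nan
    # Hyperparameters are placeholders, not values taken from the paper.
    model = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
    model.fit(X, np.asarray(labels))
    return model
```

With a fitted model, `model.predict_proba` on the feature row of a new pair would give an estimated probability of co-crystal formation. Because concatenation is order-sensitive, a practical version would probably also need to fix an ordering convention (or train on both orderings) for the two components.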
doi_str_mv 10.1016/j.cclet.2022.107964
format Article
fullrecord <record><control><sourceid>wanfang_jour_cross</sourceid><recordid>TN_cdi_wanfang_journals_zghxkb202308065</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><wanfj_id>zghxkb202308065</wanfj_id><els_id>S1001841722009755</els_id><sourcerecordid>zghxkb202308065</sourcerecordid><originalsourceid>FETCH-LOGICAL-c335t-68c76249e5f848e841b1ea11068d97de797afccece5f97ede135933f6d4df5ea3</originalsourceid><addsrcrecordid>eNp9kL1OwzAUhS0EEqXwBCzZmFLsOLGdgQEqWpAqsYDEZrn2TeuQ2shOC-XpcRtmpnt0dc79-RC6JnhCMGG37UTrDvpJgYsidXjNyhM0IoKLvEr6NGmMSS5Kws_RRYwtxoUQlI3QfOp12MdeddnOhn6batQBwFm3ypYqgsm8y_o1ZO_zB-9jn22UXlsHWQcqHF0bb6C7RGeN6iJc_dUxeps9vk6f8sXL_Hl6v8g1pVWfM6E5K8oaqkaUAtJBSwKKEMyEqbkBXnPVaA06GWoOBgitakobZkrTVKDoGN0Mc7-Ua5RbydZvg0sb5c9q_f2xTAQoFphVyUkHpw4-xgCN_Ax2o8JeEiwP1GQrj9TkgZocqKXU3ZCC9MTOQpBRW3AajA2ge2m8_Tf_CxG6dx0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Cocrystal virtual screening based on the XGBoost machine learning model</title><source>Alma/SFX Local Collection</source><creator>Yang, Dezhi ; Wang, Li ; Yuan, Penghui ; An, Qi ; Su, Bin ; Yu, Mingchao ; Chen, Ting ; Hu, Kun ; Zhang, Li ; Lu, Yang ; Du, Guanhua</creator><creatorcontrib>Yang, Dezhi ; Wang, Li ; Yuan, Penghui ; An, Qi ; Su, Bin ; Yu, Mingchao ; Chen, Ting ; Hu, Kun ; Zhang, Li ; Lu, Yang ; Du, Guanhua</creatorcontrib><description>Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of developing and applying algorithms hinders its wide application. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model of the scikit-learn package. 
The simplified molecular input line entry specification (SMILES) information of two compounds is simply inputted to determine whether a co-crystal can be formed. The data set includes the co-crystal records presented in the Cambridge Structural Database (CSD) and the records of no co-crystal formation from extant literature and experiments. RDKit molecular descriptors are adopted as the features of a compound in the data set. The developed model shows excellent performance in the proposed co-crystal training and validation sets with high accuracy, sensitivity, and F1 score. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model. The simplified molecular input line entry specification information of two compounds is simply inputted to determine whether a co-crystal can be formed. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. [Display omitted]</description><identifier>ISSN: 1001-8417</identifier><identifier>EISSN: 1878-5964</identifier><identifier>DOI: 10.1016/j.cclet.2022.107964</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Cocrystal ; Machine learning ; Molecular descriptor ; Nefiracetam ; Praziquantel ; XGBoost</subject><ispartof>Chinese chemical letters, 2023-08, Vol.34 (8), p.107964-403, Article 107964</ispartof><rights>2023</rights><rights>Copyright © Wanfang Data Co. Ltd. 
All Rights Reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c335t-68c76249e5f848e841b1ea11068d97de797afccece5f97ede135933f6d4df5ea3</citedby><cites>FETCH-LOGICAL-c335t-68c76249e5f848e841b1ea11068d97de797afccece5f97ede135933f6d4df5ea3</cites><orcidid>0000-0002-5677-0253 ; 0000-0002-3159-4126 ; 0000-0002-8833-7374 ; 0000-0001-5676-8551</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.wanfangdata.com.cn/images/PeriodicalImages/zghxkb/zghxkb.jpg</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Yang, Dezhi</creatorcontrib><creatorcontrib>Wang, Li</creatorcontrib><creatorcontrib>Yuan, Penghui</creatorcontrib><creatorcontrib>An, Qi</creatorcontrib><creatorcontrib>Su, Bin</creatorcontrib><creatorcontrib>Yu, Mingchao</creatorcontrib><creatorcontrib>Chen, Ting</creatorcontrib><creatorcontrib>Hu, Kun</creatorcontrib><creatorcontrib>Zhang, Li</creatorcontrib><creatorcontrib>Lu, Yang</creatorcontrib><creatorcontrib>Du, Guanhua</creatorcontrib><title>Cocrystal virtual screening based on the XGBoost machine learning model</title><title>Chinese chemical letters</title><description>Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of developing and applying algorithms hinders its wide application. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model of the scikit-learn package. The simplified molecular input line entry specification (SMILES) information of two compounds is simply inputted to determine whether a co-crystal can be formed. 
The data set includes the co-crystal records presented in the Cambridge Structural Database (CSD) and the records of no co-crystal formation from extant literature and experiments. RDKit molecular descriptors are adopted as the features of a compound in the data set. The developed model shows excellent performance in the proposed co-crystal training and validation sets with high accuracy, sensitivity, and F1 score. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model. The simplified molecular input line entry specification information of two compounds is simply inputted to determine whether a co-crystal can be formed. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. 
[Display omitted]</description><subject>Cocrystal</subject><subject>Machine learning</subject><subject>Molecular descriptor</subject><subject>Nefiracetam</subject><subject>Praziquantel</subject><subject>XGBoost</subject><issn>1001-8417</issn><issn>1878-5964</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9kL1OwzAUhS0EEqXwBCzZmFLsOLGdgQEqWpAqsYDEZrn2TeuQ2shOC-XpcRtmpnt0dc79-RC6JnhCMGG37UTrDvpJgYsidXjNyhM0IoKLvEr6NGmMSS5Kws_RRYwtxoUQlI3QfOp12MdeddnOhn6batQBwFm3ypYqgsm8y_o1ZO_zB-9jn22UXlsHWQcqHF0bb6C7RGeN6iJc_dUxeps9vk6f8sXL_Hl6v8g1pVWfM6E5K8oaqkaUAtJBSwKKEMyEqbkBXnPVaA06GWoOBgitakobZkrTVKDoGN0Mc7-Ua5RbydZvg0sb5c9q_f2xTAQoFphVyUkHpw4-xgCN_Ax2o8JeEiwP1GQrj9TkgZocqKXU3ZCC9MTOQpBRW3AajA2ge2m8_Tf_CxG6dx0</recordid><startdate>20230801</startdate><enddate>20230801</enddate><creator>Yang, Dezhi</creator><creator>Wang, Li</creator><creator>Yuan, Penghui</creator><creator>An, Qi</creator><creator>Su, Bin</creator><creator>Yu, Mingchao</creator><creator>Chen, Ting</creator><creator>Hu, Kun</creator><creator>Zhang, Li</creator><creator>Lu, Yang</creator><creator>Du, Guanhua</creator><general>Elsevier B.V</general><general>Beijing City Key Laboratory of Polymorphic Drugs,Center of Pharmaceutical Polymorphs,Institute of Materia Medica,Chinese Academy of Medical Sciences and Peking Union Medical College,Beijing 100050,China%Shandong Soteria Pharmaceutical Co.,Ltd.,Laiwu 271100,China%Beijing City Key Laboratory of Drug Target and Screening Research,National Center for Pharmaceutical Screening,Institute of Materia Medica,Chinese Academy of Medical Sciences and Peking Union Medical College,Beijing 
100050,China</general><scope>AAYXX</scope><scope>CITATION</scope><scope>2B.</scope><scope>4A8</scope><scope>92I</scope><scope>93N</scope><scope>PSX</scope><scope>TCJ</scope><orcidid>https://orcid.org/0000-0002-5677-0253</orcidid><orcidid>https://orcid.org/0000-0002-3159-4126</orcidid><orcidid>https://orcid.org/0000-0002-8833-7374</orcidid><orcidid>https://orcid.org/0000-0001-5676-8551</orcidid></search><sort><creationdate>20230801</creationdate><title>Cocrystal virtual screening based on the XGBoost machine learning model</title><author>Yang, Dezhi ; Wang, Li ; Yuan, Penghui ; An, Qi ; Su, Bin ; Yu, Mingchao ; Chen, Ting ; Hu, Kun ; Zhang, Li ; Lu, Yang ; Du, Guanhua</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c335t-68c76249e5f848e841b1ea11068d97de797afccece5f97ede135933f6d4df5ea3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Cocrystal</topic><topic>Machine learning</topic><topic>Molecular descriptor</topic><topic>Nefiracetam</topic><topic>Praziquantel</topic><topic>XGBoost</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Dezhi</creatorcontrib><creatorcontrib>Wang, Li</creatorcontrib><creatorcontrib>Yuan, Penghui</creatorcontrib><creatorcontrib>An, Qi</creatorcontrib><creatorcontrib>Su, Bin</creatorcontrib><creatorcontrib>Yu, Mingchao</creatorcontrib><creatorcontrib>Chen, Ting</creatorcontrib><creatorcontrib>Hu, Kun</creatorcontrib><creatorcontrib>Zhang, Li</creatorcontrib><creatorcontrib>Lu, Yang</creatorcontrib><creatorcontrib>Du, Guanhua</creatorcontrib><collection>CrossRef</collection><collection>Wanfang Data Journals - Hong Kong</collection><collection>WANFANG Data Centre</collection><collection>Wanfang Data Journals</collection><collection>万方数据期刊 - 香港版</collection><collection>China Online Journals (COJ)</collection><collection>China Online Journals (COJ)</collection><jtitle>Chinese 
chemical letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Dezhi</au><au>Wang, Li</au><au>Yuan, Penghui</au><au>An, Qi</au><au>Su, Bin</au><au>Yu, Mingchao</au><au>Chen, Ting</au><au>Hu, Kun</au><au>Zhang, Li</au><au>Lu, Yang</au><au>Du, Guanhua</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cocrystal virtual screening based on the XGBoost machine learning model</atitle><jtitle>Chinese chemical letters</jtitle><date>2023-08-01</date><risdate>2023</risdate><volume>34</volume><issue>8</issue><spage>107964</spage><epage>403</epage><pages>107964-403</pages><artnum>107964</artnum><issn>1001-8417</issn><eissn>1878-5964</eissn><abstract>Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of developing and applying algorithms hinders its wide application. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model of the scikit-learn package. The simplified molecular input line entry specification (SMILES) information of two compounds is simply inputted to determine whether a co-crystal can be formed. The data set includes the co-crystal records presented in the Cambridge Structural Database (CSD) and the records of no co-crystal formation from extant literature and experiments. RDKit molecular descriptors are adopted as the features of a compound in the data set. The developed model shows excellent performance in the proposed co-crystal training and validation sets with high accuracy, sensitivity, and F1 score. The prediction success rate of the model exceeds 90%. 
The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model. The simplified molecular input line entry specification information of two compounds is simply inputted to determine whether a co-crystal can be formed. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. [Display omitted]</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.cclet.2022.107964</doi><tpages>6</tpages><orcidid>https://orcid.org/0000-0002-5677-0253</orcidid><orcidid>https://orcid.org/0000-0002-3159-4126</orcidid><orcidid>https://orcid.org/0000-0002-8833-7374</orcidid><orcidid>https://orcid.org/0000-0001-5676-8551</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1001-8417
ispartof Chinese chemical letters, 2023-08, Vol.34 (8), p.107964-403, Article 107964
issn 1001-8417
1878-5964
language eng
recordid cdi_wanfang_journals_zghxkb202308065
source Alma/SFX Local Collection
subjects Cocrystal
Machine learning
Molecular descriptor
Nefiracetam
Praziquantel
XGBoost
title Cocrystal virtual screening based on the XGBoost machine learning model
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T22%3A08%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wanfang_jour_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cocrystal%20virtual%20screening%20based%20on%20the%20XGBoost%20machine%20learning%20model&rft.jtitle=Chinese%20chemical%20letters&rft.au=Yang,%20Dezhi&rft.date=2023-08-01&rft.volume=34&rft.issue=8&rft.spage=107964&rft.epage=403&rft.pages=107964-403&rft.artnum=107964&rft.issn=1001-8417&rft.eissn=1878-5964&rft_id=info:doi/10.1016/j.cclet.2022.107964&rft_dat=%3Cwanfang_jour_cross%3Ezghxkb202308065%3C/wanfang_jour_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_wanfj_id=zghxkb202308065&rft_els_id=S1001841722009755&rfr_iscdi=true