Cocrystal virtual screening based on the XGBoost machine learning model
Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of...
Published in: | Chinese chemical letters 2023-08, Vol.34 (8), p.107964-403, Article 107964 |
---|---|
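Main authors: | Yang, Dezhi ; Wang, Li ; Yuan, Penghui ; An, Qi ; Su, Bin ; Yu, Mingchao ; Chen, Ting ; Hu, Kun ; Zhang, Li ; Lu, Yang ; Du, Guanhua |
Format: | Article |
Language: | eng |
Subjects: | Cocrystal ; Machine learning ; Molecular descriptor ; Nefiracetam ; Praziquantel ; XGBoost |
Online access: | Full text |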
container_end_page | 403 |
---|---|
container_issue | 8 |
container_start_page | 107964 |
container_title | Chinese chemical letters |
container_volume | 34 |
creator | Yang, Dezhi Wang, Li Yuan, Penghui An, Qi Su, Bin Yu, Mingchao Chen, Ting Hu, Kun Zhang, Li Lu, Yang Du, Guanhua |
description | Co-crystal formation can improve the physicochemical properties of a compound, thus enhancing its druggability. Therefore, artificial intelligence-based co-crystal virtual screening in the early stage of drug development has attracted extensive attention from researchers. However, the complexity of developing and applying algorithms hinders its wide application. This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model of the scikit-learn package. The simplified molecular input line entry specification (SMILES) strings of two compounds are entered to determine whether a co-crystal can be formed. The data set includes the co-crystal records in the Cambridge Structural Database (CSD) and records of no co-crystal formation from the literature and experiments. RDKit molecular descriptors are adopted as the features of each compound in the data set. The developed model shows excellent performance on the proposed co-crystal training and validation sets, with high accuracy, sensitivity, and F1 score. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately.
This study presents a data-driven co-crystal prediction method based on the XGBoost machine learning model. The simplified molecular input line entry specification (SMILES) strings of two compounds are entered to determine whether a co-crystal can be formed. The prediction success rate of the model exceeds 90%. The model therefore provides a simple and feasible scheme for designing and screening co-crystal drugs efficiently and accurately. |
doi_str_mv | 10.1016/j.cclet.2022.107964 |
format | Article |
publisher | Elsevier B.V |
rights | 2023 ; Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
affiliations | Beijing City Key Laboratory of Polymorphic Drugs, Center of Pharmaceutical Polymorphs, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China ; Shandong Soteria Pharmaceutical Co., Ltd., Laiwu 271100, China ; Beijing City Key Laboratory of Drug Target and Screening Research, National Center for Pharmaceutical Screening, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China |
orcid | 0000-0002-5677-0253 0000-0002-3159-4126 0000-0002-8833-7374 0000-0001-5676-8551 |
collection | CrossRef ; Wanfang Data Journals - Hong Kong ; WANFANG Data Centre ; Wanfang Data Journals ; 万方数据期刊 - 香港版 ; China Online Journals (COJ) |
toplevel | peer_reviewed online_resources |
tpages | 6 |
fulltext | fulltext |
identifier | ISSN: 1001-8417 |
ispartof | Chinese chemical letters, 2023-08, Vol.34 (8), p.107964-403, Article 107964 |
issn | 1001-8417 1878-5964 |
language | eng |
recordid | cdi_wanfang_journals_zghxkb202308065 |
source | Alma/SFX Local Collection |
subjects | Cocrystal Machine learning Molecular descriptor Nefiracetam Praziquantel XGBoost |
title | Cocrystal virtual screening based on the XGBoost machine learning model |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T22%3A08%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wanfang_jour_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cocrystal%20virtual%20screening%20based%20on%20the%20XGBoost%20machine%20learning%20model&rft.jtitle=Chinese%20chemical%20letters&rft.au=Yang,%20Dezhi&rft.date=2023-08-01&rft.volume=34&rft.issue=8&rft.spage=107964&rft.epage=403&rft.pages=107964-403&rft.artnum=107964&rft.issn=1001-8417&rft.eissn=1878-5964&rft_id=info:doi/10.1016/j.cclet.2022.107964&rft_dat=%3Cwanfang_jour_cross%3Ezghxkb202308065%3C/wanfang_jour_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_wanfj_id=zghxkb202308065&rft_els_id=S1001841722009755&rfr_iscdi=true |
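
The abstract reports accuracy, sensitivity, and F1 score for a binary question (does a pair of compounds form a co-crystal or not), but this record does not define those metrics. The following is a minimal sketch of their standard definitions in plain Python. The function name `binary_metrics` and the eight example labels are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity (recall) and F1 for binary labels.

    Label 1 means "co-crystal forms", label 0 means "no co-crystal".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: fraction of true co-crystals that the model recovers.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: fraction of predicted co-crystals that are real.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Hypothetical labels for eight compound pairs (illustrative, not from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, sensitivity matters because a missed co-crystal (a false negative) is a candidate that never reaches the bench, whereas F1 also penalizes false positives that waste experimental effort.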