XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein–Ligand Scoring and Ranking
Prediction of protein–ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to the prediction of the binding affinity in protein–ligand complexes using machine learning (ML). Due to the remarkable ability of ML metho...
Published in: | ACS omega 2022-06, Vol.7 (25), p.21727-21735 |
---|---|
Main authors: | Dong, Lina ; Qu, Xiaoyang ; Wang, Binju |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | 21735 |
---|---|
container_issue | 25 |
container_start_page | 21727 |
container_title | ACS omega |
container_volume | 7 |
creator | Dong, Lina ; Qu, Xiaoyang ; Wang, Binju |
description | Prediction of protein–ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to the prediction of the binding affinity in protein–ligand complexes using machine learning (ML). Due to the remarkable ability of ML methods in nonlinear fitting, ML-based scoring functions (SFs) can deliver much improved performance on a selected test set, such as the comparative assessment of scoring functions (CASF), when compared to the classical SFs. However, the performance of ML-based SFs heavily relies on the overall similarity of the training set and the test set. To improve the performance and transferability of an SF, we have tried to combine various features including energy terms from X-score and AutoDock Vina, the properties of ligands, and the statistical sequence-related information from either the binding site or the full protein. In conjunction with extreme trees (ET), an ML model, we have developed XLPFE, a new SF. Compared with other tested methods such as X-score, AutoDock Vina, ΔvinaXGB, PSH-ML, or CNN-score, XLPFE achieves consistently better scoring and ranking power for various types of protein–ligand complex structures beyond the CASF, suggesting that XLPFE has superior transferability. In particular, XLPFE performs better with metalloenzymes. With its faster speed, improved accuracy, and better transferability, XLPFE could be usefully applied to a diverse range of protein–ligand complexes. |
doi_str_mv | 10.1021/acsomega.2c01723 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9245135</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2685038535</sourcerecordid><originalsourceid>FETCH-LOGICAL-a410t-4f5b64fbbb5373b84148253188e3132452b3f1939bdaf5a96da5bf57a3b48b473</originalsourceid><addsrcrecordid>eNp1kctKxDAUhoMoKureZZYuHM21TV0IIjMqVBQv4C4knWQm2iZj0grufAff0Cex44yiC1fnh_OdL5AfgF2MDjAi-FBVKTRmog5IhXBO6ArYJCxHA0wZXf2VN8BOSo8IIZwJIki2DjYozwUnebEJpg_l9Wh4BE_grWtmtYHKj-HQWlO17sXAS1VNnTewNCp65yfwtgpxPked74ngoQ0RXsfQGuc_3t5LN5kLvql5vlH-qc_bYM2qOpmd5dwC96Ph3en5oLw6uzg9KQeKYdQOmOU6Y1ZrzWlOtWCYCcIpFsJQTAnjRFOLC1rosbJcFdlYcW15rqhmQrOcboHjhXfW6caMK-PbqGo5i65R8VUG5eTfjXdTOQkvsujlmPJesLcUxPDcmdTKxqXK1LXyJnRJkkxwRAX_QtECrWJIKRr78wxGct6R_O5ILjvqT_YXJ_1GPoYu-v4z_sc_AX9FlOg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2685038535</pqid></control><display><type>article</type><title>XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein–Ligand Scoring and Ranking</title><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>American Chemical Society (ACS) Open Access</source><source>PubMed Central</source><creator>Dong, Lina ; Qu, Xiaoyang ; Wang, Binju</creator><creatorcontrib>Dong, Lina ; Qu, Xiaoyang ; Wang, Binju</creatorcontrib><description>Prediction of protein–ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to the prediction of the binding affinity in protein–ligand complexes using machine learning (ML). 
Due to the remarkable ability of ML methods in nonlinear fitting, ML-based scoring functions (SFs) can deliver much improved performance on a selected test set, such as the comparative assessment of scoring functions (CASF), when compared to the classical SFs. However, the performance of ML-based SFs heavily relies on the overall similarity of the training set and the test set. To improve the performance and transferability of an SF, we have tried to combine various features including energy terms from X-score and AutoDock Vina, the properties of ligands, and the statistical sequence-related information from either the binding site or the full protein. In conjunction with extreme trees (ET), an ML model, we have developed XLPFE, a new SF. Compared with other tested methods such as X-score, AutoDock Vina, ΔvinaXGB, PSH-ML, or CNN-score, XLPFE achieves consistently better scoring and ranking power for various types of protein–ligand complex structures beyond the CASF, suggesting that XLPFE has superior transferability. In particular, XLPFE performs better with metalloenzymes. With its faster speed, improved accuracy, and better transferability, XLPFE could be usefully applied to a diverse range of protein–ligand complexes.</description><identifier>ISSN: 2470-1343</identifier><identifier>EISSN: 2470-1343</identifier><identifier>DOI: 10.1021/acsomega.2c01723</identifier><identifier>PMID: 35785279</identifier><language>eng</language><publisher>American Chemical Society</publisher><ispartof>ACS omega, 2022-06, Vol.7 (25), p.21727-21735</ispartof><rights>2022 The Authors. Published by American Chemical Society</rights><rights>2022 The Authors. 
Published by American Chemical Society 2022 The Authors</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a410t-4f5b64fbbb5373b84148253188e3132452b3f1939bdaf5a96da5bf57a3b48b473</citedby><cites>FETCH-LOGICAL-a410t-4f5b64fbbb5373b84148253188e3132452b3f1939bdaf5a96da5bf57a3b48b473</cites><orcidid>0000-0002-3353-9411</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acsomega.2c01723$$EPDF$$P50$$Gacs$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acsomega.2c01723$$EHTML$$P50$$Gacs$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,27059,27903,27904,53769,53771,56740,56790</link.rule.ids></links><search><creatorcontrib>Dong, Lina</creatorcontrib><creatorcontrib>Qu, Xiaoyang</creatorcontrib><creatorcontrib>Wang, Binju</creatorcontrib><title>XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein–Ligand Scoring and Ranking</title><title>ACS omega</title><addtitle>ACS Omega</addtitle><description>Prediction of protein–ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to the prediction of the binding affinity in protein–ligand complexes using machine learning (ML). Due to the remarkable ability of ML methods in nonlinear fitting, ML-based scoring functions (SFs) can deliver much improved performance on a selected test set, such as the comparative assessment of scoring functions (CASF), when compared to the classical SFs. However, the performance of ML-based SFs heavily relies on the overall similarity of the training set and the test set. 
To improve the performance and transferability of an SF, we have tried to combine various features including energy terms from X-score and AutoDock Vina, the properties of ligands, and the statistical sequence-related information from either the binding site or the full protein. In conjunction with extreme trees (ET), an ML model, we have developed XLPFE, a new SF. Compared with other tested methods such as X-score, AutoDock Vina, ΔvinaXGB, PSH-ML, or CNN-score, XLPFE achieves consistently better scoring and ranking power for various types of protein–ligand complex structures beyond the CASF, suggesting that XLPFE has superior transferability. In particular, XLPFE performs better with metalloenzymes. With its faster speed, improved accuracy, and better transferability, XLPFE could be usefully applied to a diverse range of protein–ligand complexes.</description><issn>2470-1343</issn><issn>2470-1343</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>N~.</sourceid><recordid>eNp1kctKxDAUhoMoKureZZYuHM21TV0IIjMqVBQv4C4knWQm2iZj0grufAff0Cex44yiC1fnh_OdL5AfgF2MDjAi-FBVKTRmog5IhXBO6ArYJCxHA0wZXf2VN8BOSo8IIZwJIki2DjYozwUnebEJpg_l9Wh4BE_grWtmtYHKj-HQWlO17sXAS1VNnTewNCp65yfwtgpxPked74ngoQ0RXsfQGuc_3t5LN5kLvql5vlH-qc_bYM2qOpmd5dwC96Ph3en5oLw6uzg9KQeKYdQOmOU6Y1ZrzWlOtWCYCcIpFsJQTAnjRFOLC1rosbJcFdlYcW15rqhmQrOcboHjhXfW6caMK-PbqGo5i65R8VUG5eTfjXdTOQkvsujlmPJesLcUxPDcmdTKxqXK1LXyJnRJkkxwRAX_QtECrWJIKRr78wxGct6R_O5ILjvqT_YXJ_1GPoYu-v4z_sc_AX9FlOg</recordid><startdate>20220628</startdate><enddate>20220628</enddate><creator>Dong, Lina</creator><creator>Qu, Xiaoyang</creator><creator>Wang, Binju</creator><general>American Chemical Society</general><scope>N~.</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-3353-9411</orcidid></search><sort><creationdate>20220628</creationdate><title>XLPFE: A Simple and Effective Machine Learning 
Scoring Function for Protein–Ligand Scoring and Ranking</title><author>Dong, Lina ; Qu, Xiaoyang ; Wang, Binju</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a410t-4f5b64fbbb5373b84148253188e3132452b3f1939bdaf5a96da5bf57a3b48b473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Dong, Lina</creatorcontrib><creatorcontrib>Qu, Xiaoyang</creatorcontrib><creatorcontrib>Wang, Binju</creatorcontrib><collection>American Chemical Society (ACS) Open Access</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>ACS omega</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Dong, Lina</au><au>Qu, Xiaoyang</au><au>Wang, Binju</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein–Ligand Scoring and Ranking</atitle><jtitle>ACS omega</jtitle><addtitle>ACS Omega</addtitle><date>2022-06-28</date><risdate>2022</risdate><volume>7</volume><issue>25</issue><spage>21727</spage><epage>21735</epage><pages>21727-21735</pages><issn>2470-1343</issn><eissn>2470-1343</eissn><abstract>Prediction of protein–ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to the prediction of the binding affinity in protein–ligand complexes using machine learning (ML). Due to the remarkable ability of ML methods in nonlinear fitting, ML-based scoring functions (SFs) can deliver much improved performance on a selected test set, such as the comparative assessment of scoring functions (CASF), when compared to the classical SFs. 
However, the performance of ML-based SFs heavily relies on the overall similarity of the training set and the test set. To improve the performance and transferability of an SF, we have tried to combine various features including energy terms from X-score and AutoDock Vina, the properties of ligands, and the statistical sequence-related information from either the binding site or the full protein. In conjunction with extreme trees (ET), an ML model, we have developed XLPFE, a new SF. Compared with other tested methods such as X-score, AutoDock Vina, ΔvinaXGB, PSH-ML, or CNN-score, XLPFE achieves consistently better scoring and ranking power for various types of protein–ligand complex structures beyond the CASF, suggesting that XLPFE has superior transferability. In particular, XLPFE performs better with metalloenzymes. With its faster speed, improved accuracy, and better transferability, XLPFE could be usefully applied to a diverse range of protein–ligand complexes.</abstract><pub>American Chemical Society</pub><pmid>35785279</pmid><doi>10.1021/acsomega.2c01723</doi><tpages>9</tpages><orcidid>https://orcid.org/0000-0002-3353-9411</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2470-1343 |
ispartof | ACS omega, 2022-06, Vol.7 (25), p.21727-21735 |
issn | 2470-1343 2470-1343 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9245135 |
source | DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; American Chemical Society (ACS) Open Access; PubMed Central |
title | XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein–Ligand Scoring and Ranking |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T03%3A53%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=XLPFE:%20A%20Simple%20and%20Effective%20Machine%20Learning%20Scoring%20Function%20for%20Protein%E2%80%93Ligand%20Scoring%20and%20Ranking&rft.jtitle=ACS%20omega&rft.au=Dong,%20Lina&rft.date=2022-06-28&rft.volume=7&rft.issue=25&rft.spage=21727&rft.epage=21735&rft.pages=21727-21735&rft.issn=2470-1343&rft.eissn=2470-1343&rft_id=info:doi/10.1021/acsomega.2c01723&rft_dat=%3Cproquest_pubme%3E2685038535%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2685038535&rft_id=info:pmid/35785279&rfr_iscdi=true |
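
The abstract above compares scoring functions by their "scoring and ranking power". This record does not say how those quantities were computed. As a minimal sketch, assuming the convention commonly used with CASF-style benchmarks (scoring power as the Pearson correlation between predicted and experimental affinities, ranking power as the Spearman rank correlation averaged over groups of ligands that share a target), the two metrics could be written as follows. All function names and the sample numbers are hypothetical and are not taken from the article.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def scoring_power(experimental, predicted):
    """Assumed definition: Pearson r over the whole test set."""
    return pearson(experimental, predicted)


def ranking_power(clusters):
    """Assumed definition: mean Spearman rho over per-target clusters.

    `clusters` is a list of (experimental, predicted) pairs, one per target.
    """
    return sum(spearman(e, p) for e, p in clusters) / len(clusters)


if __name__ == "__main__":
    # Hypothetical affinities (e.g. pKd) for two targets; illustration only.
    target_a = ([4.2, 5.9, 7.1, 8.3], [4.8, 5.5, 7.4, 7.9])
    target_b = ([3.5, 6.0, 6.8], [4.1, 6.6, 6.2])
    all_exp = target_a[0] + target_b[0]
    all_pred = target_a[1] + target_b[1]
    print("scoring power:", scoring_power(all_exp, all_pred))
    print("ranking power:", ranking_power([target_a, target_b]))
```

Pearson correlation rewards predictions that track the experimental values linearly, while the rank-based measure only asks whether ligands of the same target are ordered correctly, which is why the two are usually reported separately.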