AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins

Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. However, pharmacists c...

Detailed description

Bibliographic details
Published in: Briefings in bioinformatics 2022-05, Vol.23 (3)
Main authors: Yin, Yueming, Hu, Haifeng, Yang, Zhen, Jiang, Feihu, Huang, Yihe, Wu, Jiansheng
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue 3
container_start_page
container_title Briefings in bioinformatics
container_volume 23
creator Yin, Yueming
Hu, Haifeng
Yang, Zhen
Jiang, Feihu
Huang, Yihe
Wu, Jiansheng
description Abstract Ligand molecules naturally constitute a graph structure. Recently, many deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases of interest. However, pharmacists find that these well-trained DGL models often fail to achieve satisfactory performance in real virtual screening of drug candidates. The main challenges are that the training datasets are small and biased, and that the activity-cliff cases they contain worsen model performance. These challenges cause predictors to overfit the training data and generalize poorly in real virtual screening scenarios. We therefore propose a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in a new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were run on seven state-of-the-art open-source DGL models with the potential to model ligand bioactivities, and were evaluated by multiple criteria. The results indicate that, on almost all 33 GPCR datasets and seven DGL models, AFSE greatly improved the enhancement factor (top-10%, 20% and 30%), which is the most important criterion in virtual screening of hits from compound databases, while maintaining superior performance on RMSE and $r^2$. The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.
doi_str_mv 10.1093/bib/bbac077
format Article
fullrecord <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2644947510</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bbac077</oup_id><sourcerecordid>2644947510</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</originalsourceid><addsrcrecordid>eNp90c1LHDEYBvBQlGq1p94lIJRCGTeTj8msN1lcWxAUP85DPt6ZRmYmY5KxtH99s-zWgwdPCeHHk5f3QehLSc5KsmQL7fRCa2WIlB_QYcmlLDgRfG9zr2QheMUO0KcYnwihRNblR3TABOO1qOkhihfr-8tznPxvFWzEbpiCf3FjhwdvoccdjBBU7_6q5PyIfYstwIS7oKZfuAcVxo3Nz73r1Gixdl6Z5F5cchBxUqGDtBFXt6s7nKMTuDEeo_1W9RE-784j9Li-fFj9KK5vrn6uLq4Lk6dLhWb1spJagJFSUEoYGGs5MURXtaVAtOGctFRIqFmpqNRL0oKlwkraloK17Ah92-bmj59niKkZXDTQ92oEP8eGVpwvuRQlyfT0DX3ycxjzdFlJwbOqaFbft8oEH2OAtpmCG1T405Sk2XTR5C6aXRdZn-wyZz2AfbX_l5_B1y3w8_Ru0j-HqpMK</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2675447562</pqid></control><display><type>article</type><title>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</title><source>Oxford Journals Open Access Collection</source><creator>Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, Feihu ; Huang, Yihe ; Wu, Jiansheng</creator><creatorcontrib>Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, Feihu ; Huang, Yihe ; Wu, Jiansheng</creatorcontrib><description>Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. 
The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbac077</identifier><identifier>PMID: 35348582</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Algorithms ; Biological activity ; Datasets ; Drug development ; G protein-coupled receptors ; Learning ; Ligands ; Molecular structure ; Multiple criterion ; Screening ; Smoothness ; Training</subject><ispartof>Briefings in bioinformatics, 2022-05, Vol.23 (3)</ispartof><rights>The Author(s) 2022. Published by Oxford University Press. All rights reserved. 
For Permissions, please email: journals.permissions@oup.com 2022</rights><rights>The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.</rights><rights>The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</citedby><cites>FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1604,27924,27925</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bib/bbac077$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35348582$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Yin, Yueming</creatorcontrib><creatorcontrib>Hu, Haifeng</creatorcontrib><creatorcontrib>Yang, Zhen</creatorcontrib><creatorcontrib>Jiang, Feihu</creatorcontrib><creatorcontrib>Huang, Yihe</creatorcontrib><creatorcontrib>Wu, Jiansheng</creatorcontrib><title>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. 
However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. 
The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.</description><subject>Algorithms</subject><subject>Biological activity</subject><subject>Datasets</subject><subject>Drug development</subject><subject>G protein-coupled receptors</subject><subject>Learning</subject><subject>Ligands</subject><subject>Molecular structure</subject><subject>Multiple criterion</subject><subject>Screening</subject><subject>Smoothness</subject><subject>Training</subject><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp90c1LHDEYBvBQlGq1p94lIJRCGTeTj8msN1lcWxAUP85DPt6ZRmYmY5KxtH99s-zWgwdPCeHHk5f3QehLSc5KsmQL7fRCa2WIlB_QYcmlLDgRfG9zr2QheMUO0KcYnwihRNblR3TABOO1qOkhihfr-8tznPxvFWzEbpiCf3FjhwdvoccdjBBU7_6q5PyIfYstwIS7oKZfuAcVxo3Nz73r1Gixdl6Z5F5cchBxUqGDtBFXt6s7nKMTuDEeo_1W9RE-784j9Li-fFj9KK5vrn6uLq4Lk6dLhWb1spJagJFSUEoYGGs5MURXtaVAtOGctFRIqFmpqNRL0oKlwkraloK17Ah92-bmj59niKkZXDTQ92oEP8eGVpwvuRQlyfT0DX3ycxjzdFlJwbOqaFbft8oEH2OAtpmCG1T405Sk2XTR5C6aXRdZn-wyZz2AfbX_l5_B1y3w8_Ru0j-HqpMK</recordid><startdate>20220513</startdate><enddate>20220513</enddate><creator>Yin, Yueming</creator><creator>Hu, Haifeng</creator><creator>Yang, Zhen</creator><creator>Jiang, Feihu</creator><creator>Huang, Yihe</creator><creator>Wu, Jiansheng</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>20220513</creationdate><title>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</title><author>Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, 
Feihu ; Huang, Yihe ; Wu, Jiansheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Biological activity</topic><topic>Datasets</topic><topic>Drug development</topic><topic>G protein-coupled receptors</topic><topic>Learning</topic><topic>Ligands</topic><topic>Molecular structure</topic><topic>Multiple criterion</topic><topic>Screening</topic><topic>Smoothness</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yin, Yueming</creatorcontrib><creatorcontrib>Hu, Haifeng</creatorcontrib><creatorcontrib>Yang, Zhen</creatorcontrib><creatorcontrib>Jiang, Feihu</creatorcontrib><creatorcontrib>Huang, Yihe</creatorcontrib><creatorcontrib>Wu, Jiansheng</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yin, Yueming</au><au>Hu, 
Haifeng</au><au>Yang, Zhen</au><au>Jiang, Feihu</au><au>Huang, Yihe</au><au>Wu, Jiansheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2022-05-13</date><risdate>2022</risdate><volume>23</volume><issue>3</issue><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. 
The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>35348582</pmid><doi>10.1093/bib/bbac077</doi></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1467-5463
ispartof Briefings in bioinformatics, 2022-05, Vol.23 (3)
issn 1467-5463
1477-4054
language eng
recordid cdi_proquest_miscellaneous_2644947510
source Oxford Journals Open Access Collection
subjects Algorithms
Biological activity
Datasets
Drug development
G protein-coupled receptors
Learning
Ligands
Molecular structure
Multiple criterion
Screening
Smoothness
Training
title AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T08%3A09%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=AFSE:%20towards%20improving%20model%20generalization%20of%20deep%20graph%20learning%20of%20ligand%20bioactivities%20targeting%20GPCR%20proteins&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Yin,%20Yueming&rft.date=2022-05-13&rft.volume=23&rft.issue=3&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbac077&rft_dat=%3Cproquest_TOX%3E2644947510%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2675447562&rft_id=info:pmid/35348582&rft_oup_id=10.1093/bib/bbac077&rfr_iscdi=true
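
The abstract reports the enhancement factor at the top 10%, 20% and 30% of a ranked compound list as its main virtual screening criterion. The record does not give the formula the authors used, so the following is only a minimal sketch of the commonly used definition (hit rate among the top-ranked fraction divided by the hit rate in the whole library). The function name `enhancement_factor`, its parameters, and the toy data are illustrative assumptions and are not taken from the paper or its web server.

```python
from typing import Sequence


def enhancement_factor(
    is_active: Sequence[bool],
    scores: Sequence[float],
    top_fraction: float,
) -> float:
    """Enhancement factor under the common definition (an assumption here):
    (actives in top fraction / size of top fraction) / (all actives / library size).
    """
    if len(is_active) != len(scores) or len(scores) == 0:
        raise ValueError("is_active and scores must be non-empty and equally long")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")

    n = len(scores)
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("the library contains no actives")

    # Number of compounds selected; at least one so the ratio is defined.
    n_top = max(1, int(round(n * top_fraction)))

    # Rank compounds by predicted score, highest first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(1 for i in order[:n_top] if is_active[i])

    return (hits_top / n_top) / (total_actives / n)


# Toy library of 10 compounds (hypothetical data): 2 actives, both ranked on top.
toy_active = [True, True] + [False] * 8
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]

ef_10 = enhancement_factor(toy_active, toy_scores, 0.10)
ef_20 = enhancement_factor(toy_active, toy_scores, 0.20)
ef_30 = enhancement_factor(toy_active, toy_scores, 0.30)
```

A value of 1 corresponds to random selection, and larger values mean that actives are concentrated near the top of the ranking, which is why a screening-oriented evaluation can favor this measure over RMSE alone.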