AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins

Detailed description

Bibliographic details
Published in:	Briefings in bioinformatics, 2022-05, Vol.23 (3)
Main authors:	Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, Feihu ; Huang, Yihe ; Wu, Jiansheng
Format:	Article
Language:	eng
Subjects:	Algorithms ; Biological activity ; Datasets ; Drug development ; G protein-coupled receptors ; Learning ; Ligands ; Molecular structure ; Multiple criterion ; Screening ; Smoothness ; Training
Online access:	Order full text

container_end_page
container_issue	3
container_start_page
container_title	Briefings in bioinformatics
container_volume	23
creator	Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, Feihu ; Huang, Yihe ; Wu, Jiansheng
description	Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.
doi_str_mv	10.1093/bib/bbac077
format	Article
fullrecord	<record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2644947510</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bbac077</oup_id><sourcerecordid>2644947510</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</originalsourceid><addsrcrecordid>eNp90c1LHDEYBvBQlGq1p94lIJRCGTeTj8msN1lcWxAUP85DPt6ZRmYmY5KxtH99s-zWgwdPCeHHk5f3QehLSc5KsmQL7fRCa2WIlB_QYcmlLDgRfG9zr2QheMUO0KcYnwihRNblR3TABOO1qOkhihfr-8tznPxvFWzEbpiCf3FjhwdvoccdjBBU7_6q5PyIfYstwIS7oKZfuAcVxo3Nz73r1Gixdl6Z5F5cchBxUqGDtBFXt6s7nKMTuDEeo_1W9RE-784j9Li-fFj9KK5vrn6uLq4Lk6dLhWb1spJagJFSUEoYGGs5MURXtaVAtOGctFRIqFmpqNRL0oKlwkraloK17Ah92-bmj59niKkZXDTQ92oEP8eGVpwvuRQlyfT0DX3ycxjzdFlJwbOqaFbft8oEH2OAtpmCG1T405Sk2XTR5C6aXRdZn-wyZz2AfbX_l5_B1y3w8_Ru0j-HqpMK</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2675447562</pqid></control><display><type>article</type><title>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</title><source>Oxford Journals Open Access Collection</source><creator>Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, Feihu ; Huang, Yihe ; Wu, Jiansheng</creator><creatorcontrib>Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, Feihu ; Huang, Yihe ; Wu, Jiansheng</creatorcontrib><description>Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. 
The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbac077</identifier><identifier>PMID: 35348582</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Algorithms ; Biological activity ; Datasets ; Drug development ; G protein-coupled receptors ; Learning ; Ligands ; Molecular structure ; Multiple criterion ; Screening ; Smoothness ; Training</subject><ispartof>Briefings in bioinformatics, 2022-05, Vol.23 (3)</ispartof><rights>The Author(s) 2022. Published by Oxford University Press. All rights reserved. 
For Permissions, please email: journals.permissions@oup.com 2022</rights><rights>The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.</rights><rights>The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</citedby><cites>FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1604,27924,27925</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bib/bbac077$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35348582$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Yin, Yueming</creatorcontrib><creatorcontrib>Hu, Haifeng</creatorcontrib><creatorcontrib>Yang, Zhen</creatorcontrib><creatorcontrib>Jiang, Feihu</creatorcontrib><creatorcontrib>Huang, Yihe</creatorcontrib><creatorcontrib>Wu, Jiansheng</creatorcontrib><title>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. 
However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. 
The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.</description><subject>Algorithms</subject><subject>Biological activity</subject><subject>Datasets</subject><subject>Drug development</subject><subject>G protein-coupled receptors</subject><subject>Learning</subject><subject>Ligands</subject><subject>Molecular structure</subject><subject>Multiple criterion</subject><subject>Screening</subject><subject>Smoothness</subject><subject>Training</subject><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp90c1LHDEYBvBQlGq1p94lIJRCGTeTj8msN1lcWxAUP85DPt6ZRmYmY5KxtH99s-zWgwdPCeHHk5f3QehLSc5KsmQL7fRCa2WIlB_QYcmlLDgRfG9zr2QheMUO0KcYnwihRNblR3TABOO1qOkhihfr-8tznPxvFWzEbpiCf3FjhwdvoccdjBBU7_6q5PyIfYstwIS7oKZfuAcVxo3Nz73r1Gixdl6Z5F5cchBxUqGDtBFXt6s7nKMTuDEeo_1W9RE-784j9Li-fFj9KK5vrn6uLq4Lk6dLhWb1spJagJFSUEoYGGs5MURXtaVAtOGctFRIqFmpqNRL0oKlwkraloK17Ah92-bmj59niKkZXDTQ92oEP8eGVpwvuRQlyfT0DX3ycxjzdFlJwbOqaFbft8oEH2OAtpmCG1T405Sk2XTR5C6aXRdZn-wyZz2AfbX_l5_B1y3w8_Ru0j-HqpMK</recordid><startdate>20220513</startdate><enddate>20220513</enddate><creator>Yin, Yueming</creator><creator>Hu, Haifeng</creator><creator>Yang, Zhen</creator><creator>Jiang, Feihu</creator><creator>Huang, Yihe</creator><creator>Wu, Jiansheng</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>20220513</creationdate><title>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</title><author>Yin, Yueming ; Hu, Haifeng ; Yang, Zhen ; Jiang, 
Feihu ; Huang, Yihe ; Wu, Jiansheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-b38967b5ec7752203ecdd40c0b68d2e0bc440f257e831a27b90fed25d72f153f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Biological activity</topic><topic>Datasets</topic><topic>Drug development</topic><topic>G protein-coupled receptors</topic><topic>Learning</topic><topic>Ligands</topic><topic>Molecular structure</topic><topic>Multiple criterion</topic><topic>Screening</topic><topic>Smoothness</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yin, Yueming</creatorcontrib><creatorcontrib>Hu, Haifeng</creatorcontrib><creatorcontrib>Yang, Zhen</creatorcontrib><creatorcontrib>Jiang, Feihu</creatorcontrib><creatorcontrib>Huang, Yihe</creatorcontrib><creatorcontrib>Wu, Jiansheng</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yin, Yueming</au><au>Hu, 
Haifeng</au><au>Yang, Zhen</au><au>Jiang, Feihu</au><au>Huang, Yihe</au><au>Wu, Jiansheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2022-05-13</date><risdate>2022</risdate><volume>23</volume><issue>3</issue><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>Abstract Ligand molecules naturally constitute a graph structure. Recently, many excellent deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases in interest. However, pharmacists can find that these well-trained DGL models usually are hard to achieve satisfying performance in real scenarios for virtual screening of drug candidates. The main challenges involve that the datasets for training models were small-sized and biased, and the inner active cliff cases would worsen model performance. These challenges would cause predictors to overfit the training data and have poor generalization in real virtual screening scenarios. Thus, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were implemented on seven state-of-the-art open-source DGL models with the potential of modeling ligand bioactivities, and precisely evaluated by multiple criteria. 
The results indicate that, on almost all 33 GPCRs datasets and seven DGL models, AFSE greatly improved their enhancement factor (top-10%, 20% and 30%), which is the most important evaluation in virtual screening of hits from compound databases, while ensuring the superior performance on RMSE and $r^2$. The web server of AFSE is freely available at http://noveldelta.com/AFSE for academic purposes.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>35348582</pmid><doi>10.1093/bib/bbac077</doi></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1467-5463
ispartof	Briefings in bioinformatics, 2022-05, Vol.23 (3)
issn	1467-5463 1477-4054
language	eng
recordid	cdi_proquest_miscellaneous_2644947510
source	Oxford Journals Open Access Collection
subjects	Algorithms ; Biological activity ; Datasets ; Drug development ; G protein-coupled receptors ; Learning ; Ligands ; Molecular structure ; Multiple criterion ; Screening ; Smoothness ; Training
title	AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T08%3A09%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=AFSE:%20towards%20improving%20model%20generalization%20of%20deep%20graph%20learning%20of%20ligand%20bioactivities%20targeting%20GPCR%20proteins&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Yin,%20Yueming&rft.date=2022-05-13&rft.volume=23&rft.issue=3&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbac077&rft_dat=%3Cproquest_TOX%3E2644947510%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2675447562&rft_id=info:pmid/35348582&rft_oup_id=10.1093/bib/bbac077&rfr_iscdi=true