Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations

The identification of compound-protein relations (CPRs), which includes compound-protein interactions (CPIs) and compound-protein affinities (CPAs), is critical to drug development. A common method for compound-protein relation identification is the use of in vitro screening experiments. However, th...

Bibliographic details
Published in:	IEEE/ACM transactions on computational biology and bioinformatics 2022-07, Vol.19 (4), p.2092-2110
Main authors:	Zhao, Qichang; Yang, Mengyun; Cheng, Zhongjian; Li, Yaohang; Wang, Jianxin
Format:	Article
Language:	eng
Subjects:	Biomedical data; compound-protein relation prediction; Compounds; Computer applications; Computer vision; Datasets; Deep learning; Drug development; Drugs; Failure rates; Mathematical models; Natural language processing; Prediction models; Protein interaction; Proteins; Screening; Task analysis; Three-dimensional displays; Virtual screening

container_end_page	2110
container_issue	4
container_start_page	2092
container_title	IEEE/ACM transactions on computational biology and bioinformatics
container_volume	19
creator	Zhao, Qichang; Yang, Mengyun; Cheng, Zhongjian; Li, Yaohang; Wang, Jianxin
description	The identification of compound-protein relations (CPRs), which includes compound-protein interactions (CPIs) and compound-protein affinities (CPAs), is critical to drug development. A common method for compound-protein relation identification is the use of in vitro screening experiments. However, the number of compounds and proteins is massive, and in vitro screening experiments are labor-intensive, expensive, and time-consuming with high failure rates. Researchers have developed a computational field called virtual screening (VS) to aid experimental drug development. These methods utilize experimentally validated biological interaction information to generate datasets and use the physicochemical and structural properties of compounds and target proteins as input information to train computational prediction models. At present, deep learning has been widely used in computer vision and natural language processing and has experienced epoch-making progress. At the same time, deep learning has also been used in the field of biomedicine widely, and the prediction of CPRs based on deep learning has developed rapidly and has achieved good results. The purpose of this study is to investigate and discuss the latest applications of deep learning techniques in CPR prediction. First, we describe the datasets and feature engineering (i.e., compound and protein representations and descriptors) commonly used in CPR prediction methods. Then, we review and classify recent deep learning approaches in CPR prediction. Next, a comprehensive comparison is performed to demonstrate the prediction performance of representative methods on classical datasets. Finally, we discuss the current state of the field, including the existing challenges and our proposed future directions. We believe that this investigation will provide sufficient references and insight for researchers to understand and develop new deep learning methods to enhance CPR predictions.
doi_str_mv	10.1109/TCBB.2021.3069040
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_miscellaneous_2506275144</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9387544</ieee_id><sourcerecordid>2700414430</sourcerecordid><originalsourceid>FETCH-LOGICAL-c349t-55fdeed6e45406d8aa4e197febc070bcc6ad32eadcbd1e44036696bb2bc7edd93</originalsourceid><addsrcrecordid>eNpd0MtOwzAQBVALgXh_AEJCkdiwSRnHj9RL2vKSikAI1sGxJyhVGhc7WfD3OLRlwcqWfO5Ycwk5ozCiFNT123QyGWWQ0REDqYDDDjmkQuSpUpLvDncuUqEkOyBHISwAMh7VPjlgLJdKMXFIPia1W6KtjW6Sme50olubzBBXyRy1b-v2M5m65arvdFe7NqInZ7EJSeV88uKHYLc1rm9t-uJdh3WbvGLzmwgnZK_STcDTzXlM3u9u36YP6fz5_nF6M08N46pLhagsopXIBQdpx1pzpCqvsDSQQ2mM1JZlqK0pLUXOgUmpZFlmpcnRWsWOydV67sq7rx5DVyzrYLBpdIuuD0UmQGa5oJxHevmPLlzv43JR5QB8MBAVXSvjXQgeq2Ll66X23wWFYqi_GOovhvqLTf0xc7GZ3Jex1b_Etu8IztegRsS_Z8XGuYif_gCWq4oC</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700414430</pqid></control><display><type>article</type><title>Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations</title><source>IEEE Electronic Library (IEL)</source><creator>Zhao, Qichang ; Yang, Mengyun ; Cheng, Zhongjian ; Li, Yaohang ; Wang, Jianxin</creator><creatorcontrib>Zhao, Qichang ; Yang, Mengyun ; Cheng, Zhongjian ; Li, Yaohang ; Wang, Jianxin</creatorcontrib><description>The identification of compound-protein relations (CPRs), which includes compound-protein interactions (CPIs) and compound-protein affinities (CPAs), is critical to drug development. A common method for compound-protein relation identification is the use of in vitro screening experiments. However, the number of compounds and proteins is massive, and in vitro screening experiments are labor-intensive, expensive, and time-consuming with high failure rates. Researchers have developed a computational field called virtual screening (VS) to aid experimental drug development. 
These methods utilize experimentally validated biological interaction information to generate datasets and use the physicochemical and structural properties of compounds and target proteins as input information to train computational prediction models. At present, deep learning has been widely used in computer vision and natural language processing and has experienced epoch-making progress. At the same time, deep learning has also been used in the field of biomedicine widely, and the prediction of CPRs based on deep learning has developed rapidly and has achieved good results. The purpose of this study is to investigate and discuss the latest applications of deep learning techniques in CPR prediction. First, we describe the datasets and feature engineering (i.e., compound and protein representations and descriptors) commonly used in CPR prediction methods. Then, we review and classify recent deep learning approaches in CPR prediction. Next, a comprehensive comparison is performed to demonstrate the prediction performance of representative methods on classical datasets. Finally, we discuss the current state of the field, including the existing challenges and our proposed future directions. 
We believe that this investigation will provide sufficient references and insight for researchers to understand and develop new deep learning methods to enhance CPR predictions.</description><identifier>ISSN: 1545-5963</identifier><identifier>EISSN: 1557-9964</identifier><identifier>DOI: 10.1109/TCBB.2021.3069040</identifier><identifier>PMID: 33769935</identifier><identifier>CODEN: ITCBCY</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Biomedical data ; compound-protein relation prediction ; Compounds ; Computer applications ; Computer vision ; Datasets ; Deep learning ; Drug development ; Drugs ; Failure rates ; Mathematical models ; Natural language processing ; Prediction models ; Protein interaction ; Proteins ; Screening ; Task analysis ; Three-dimensional displays ; Virtual screening</subject><ispartof>IEEE/ACM transactions on computational biology and bioinformatics, 2022-07, Vol.19 (4), p.2092-2110</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c349t-55fdeed6e45406d8aa4e197febc070bcc6ad32eadcbd1e44036696bb2bc7edd93</citedby><cites>FETCH-LOGICAL-c349t-55fdeed6e45406d8aa4e197febc070bcc6ad32eadcbd1e44036696bb2bc7edd93</cites><orcidid>0000-0002-2703-533X ; 0000-0003-0178-1876 ; 0000-0003-1516-0480</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9387544$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9387544$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33769935$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhao, Qichang</creatorcontrib><creatorcontrib>Yang, Mengyun</creatorcontrib><creatorcontrib>Cheng, Zhongjian</creatorcontrib><creatorcontrib>Li, Yaohang</creatorcontrib><creatorcontrib>Wang, Jianxin</creatorcontrib><title>Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations</title><title>IEEE/ACM transactions on computational biology and bioinformatics</title><addtitle>TCBB</addtitle><addtitle>IEEE/ACM Trans Comput Biol Bioinform</addtitle><description>The identification of compound-protein relations (CPRs), which includes compound-protein interactions (CPIs) and compound-protein affinities (CPAs), is critical to drug development. A common method for compound-protein relation identification is the use of in vitro screening experiments. However, the number of compounds and proteins is massive, and in vitro screening experiments are labor-intensive, expensive, and time-consuming with high failure rates. 
Researchers have developed a computational field called virtual screening (VS) to aid experimental drug development. These methods utilize experimentally validated biological interaction information to generate datasets and use the physicochemical and structural properties of compounds and target proteins as input information to train computational prediction models. At present, deep learning has been widely used in computer vision and natural language processing and has experienced epoch-making progress. At the same time, deep learning has also been used in the field of biomedicine widely, and the prediction of CPRs based on deep learning has developed rapidly and has achieved good results. The purpose of this study is to investigate and discuss the latest applications of deep learning techniques in CPR prediction. First, we describe the datasets and feature engineering (i.e., compound and protein representations and descriptors) commonly used in CPR prediction methods. Then, we review and classify recent deep learning approaches in CPR prediction. Next, a comprehensive comparison is performed to demonstrate the prediction performance of representative methods on classical datasets. Finally, we discuss the current state of the field, including the existing challenges and our proposed future directions. 
We believe that this investigation will provide sufficient references and insight for researchers to understand and develop new deep learning methods to enhance CPR predictions.</description><subject>Biomedical data</subject><subject>compound-protein relation prediction</subject><subject>Compounds</subject><subject>Computer applications</subject><subject>Computer vision</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Drug development</subject><subject>Drugs</subject><subject>Failure rates</subject><subject>Mathematical models</subject><subject>Natural language processing</subject><subject>Prediction models</subject><subject>Protein interaction</subject><subject>Proteins</subject><subject>Screening</subject><subject>Task analysis</subject><subject>Three-dimensional displays</subject><subject>Virtual screening</subject><issn>1545-5963</issn><issn>1557-9964</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpd0MtOwzAQBVALgXh_AEJCkdiwSRnHj9RL2vKSikAI1sGxJyhVGhc7WfD3OLRlwcqWfO5Ycwk5ozCiFNT123QyGWWQ0REDqYDDDjmkQuSpUpLvDncuUqEkOyBHISwAMh7VPjlgLJdKMXFIPia1W6KtjW6Sme50olubzBBXyRy1b-v2M5m65arvdFe7NqInZ7EJSeV88uKHYLc1rm9t-uJdh3WbvGLzmwgnZK_STcDTzXlM3u9u36YP6fz5_nF6M08N46pLhagsopXIBQdpx1pzpCqvsDSQQ2mM1JZlqK0pLUXOgUmpZFlmpcnRWsWOydV67sq7rx5DVyzrYLBpdIuuD0UmQGa5oJxHevmPLlzv43JR5QB8MBAVXSvjXQgeq2Ll66X23wWFYqi_GOovhvqLTf0xc7GZ3Jex1b_Etu8IztegRsS_Z8XGuYif_gCWq4oC</recordid><startdate>20220701</startdate><enddate>20220701</enddate><creator>Zhao, Qichang</creator><creator>Yang, Mengyun</creator><creator>Cheng, Zhongjian</creator><creator>Li, Yaohang</creator><creator>Wang, Jianxin</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>JG9</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-2703-533X</orcidid><orcidid>https://orcid.org/0000-0003-0178-1876</orcidid><orcidid>https://orcid.org/0000-0003-1516-0480</orcidid></search><sort><creationdate>20220701</creationdate><title>Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations</title><author>Zhao, Qichang ; Yang, Mengyun ; Cheng, Zhongjian ; Li, Yaohang ; Wang, Jianxin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c349t-55fdeed6e45406d8aa4e197febc070bcc6ad32eadcbd1e44036696bb2bc7edd93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Biomedical data</topic><topic>compound-protein relation prediction</topic><topic>Compounds</topic><topic>Computer applications</topic><topic>Computer vision</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Drug development</topic><topic>Drugs</topic><topic>Failure rates</topic><topic>Mathematical models</topic><topic>Natural language processing</topic><topic>Prediction models</topic><topic>Protein interaction</topic><topic>Proteins</topic><topic>Screening</topic><topic>Task analysis</topic><topic>Three-dimensional displays</topic><topic>Virtual screening</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhao, Qichang</creatorcontrib><creatorcontrib>Yang, 
Mengyun</creatorcontrib><creatorcontrib>Cheng, Zhongjian</creatorcontrib><creatorcontrib>Li, Yaohang</creatorcontrib><creatorcontrib>Wang, Jianxin</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE/ACM transactions on computational biology and bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhao, 
Qichang</au><au>Yang, Mengyun</au><au>Cheng, Zhongjian</au><au>Li, Yaohang</au><au>Wang, Jianxin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations</atitle><jtitle>IEEE/ACM transactions on computational biology and bioinformatics</jtitle><stitle>TCBB</stitle><addtitle>IEEE/ACM Trans Comput Biol Bioinform</addtitle><date>2022-07-01</date><risdate>2022</risdate><volume>19</volume><issue>4</issue><spage>2092</spage><epage>2110</epage><pages>2092-2110</pages><issn>1545-5963</issn><eissn>1557-9964</eissn><coden>ITCBCY</coden><abstract>The identification of compound-protein relations (CPRs), which includes compound-protein interactions (CPIs) and compound-protein affinities (CPAs), is critical to drug development. A common method for compound-protein relation identification is the use of in vitro screening experiments. However, the number of compounds and proteins is massive, and in vitro screening experiments are labor-intensive, expensive, and time-consuming with high failure rates. Researchers have developed a computational field called virtual screening (VS) to aid experimental drug development. These methods utilize experimentally validated biological interaction information to generate datasets and use the physicochemical and structural properties of compounds and target proteins as input information to train computational prediction models. At present, deep learning has been widely used in computer vision and natural language processing and has experienced epoch-making progress. At the same time, deep learning has also been used in the field of biomedicine widely, and the prediction of CPRs based on deep learning has developed rapidly and has achieved good results. The purpose of this study is to investigate and discuss the latest applications of deep learning techniques in CPR prediction. 
First, we describe the datasets and feature engineering (i.e., compound and protein representations and descriptors) commonly used in CPR prediction methods. Then, we review and classify recent deep learning approaches in CPR prediction. Next, a comprehensive comparison is performed to demonstrate the prediction performance of representative methods on classical datasets. Finally, we discuss the current state of the field, including the existing challenges and our proposed future directions. We believe that this investigation will provide sufficient references and insight for researchers to understand and develop new deep learning methods to enhance CPR predictions.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>33769935</pmid><doi>10.1109/TCBB.2021.3069040</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0002-2703-533X</orcidid><orcidid>https://orcid.org/0000-0003-0178-1876</orcidid><orcidid>https://orcid.org/0000-0003-1516-0480</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1545-5963
ispartof	IEEE/ACM transactions on computational biology and bioinformatics, 2022-07, Vol.19 (4), p.2092-2110
issn	1545-5963 1557-9964
language	eng
recordid	cdi_proquest_miscellaneous_2506275144
source	IEEE Electronic Library (IEL)
subjects	Biomedical data; compound-protein relation prediction; Compounds; Computer applications; Computer vision; Datasets; Deep learning; Drug development; Drugs; Failure rates; Mathematical models; Natural language processing; Prediction models; Protein interaction; Proteins; Screening; Task analysis; Three-dimensional displays; Virtual screening
title	Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T02%3A52%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Biomedical%20Data%20and%20Deep%20Learning%20Computational%20Models%20for%20Predicting%20Compound-Protein%20Relations&rft.jtitle=IEEE/ACM%20transactions%20on%20computational%20biology%20and%20bioinformatics&rft.au=Zhao,%20Qichang&rft.date=2022-07-01&rft.volume=19&rft.issue=4&rft.spage=2092&rft.epage=2110&rft.pages=2092-2110&rft.issn=1545-5963&rft.eissn=1557-9964&rft.coden=ITCBCY&rft_id=info:doi/10.1109/TCBB.2021.3069040&rft_dat=%3Cproquest_RIE%3E2700414430%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700414430&rft_id=info:pmid/33769935&rft_ieee_id=9387544&rfr_iscdi=true