DeepDRP: Prediction of intrinsically disordered regions based on integrated view deep learning architecture from transformer-enhanced and protein information

Bibliographic details
Published in: International journal of biological macromolecules 2023-12, Vol.253, p.127390-127390, Article 127390
Main authors: Yang, Zexi; Wang, Yan; Ni, Xinye; Yang, Sen
Format: Article
Language: English
Subjects: Intrinsically disordered proteins prediction; TimeDistributed strategy; Transformer-enhanced features
Online access: Full text
container_end_page 127390
container_issue
container_start_page 127390
container_title International journal of biological macromolecules
container_volume 253
creator Yang, Zexi
Wang, Yan
Ni, Xinye
Yang, Sen
description Intrinsic disorder in proteins, a phenomenon widely distributed in nature, is related to many crucial biological processes and various diseases. Traditional determination methods tend to be costly and labor-intensive, so an accurate method for identifying intrinsically disordered proteins (IDPs) is desirable. In this paper, we propose a novel Deep learning model for Intrinsically Disordered Regions in Proteins, named DeepDRP. DeepDRP employs a TimeDistributed strategy and a Bi-LSTM architecture to predict IDPs and is driven by integrated-view features: PSSM, energy-based encoding, AAindex, and transformer-enhanced embeddings from DR-BERT, OntoProtein, Prot-T5, and ESM-2. A comparison of feature combinations indicates that the transformer-enhanced features contribute far more to IDP prediction than the traditional features, and that ESM-2 accounts for a larger contribution in the pre-trained fusion vectors. The ablation test verified that the TimeDistributed strategy increases model performance and is an efficient approach to IDP prediction. Compared with eight state-of-the-art methods on the DISORDER723, S1, and DisProt832 datasets, DeepDRP significantly outperformed the competing methods in Matthews correlation coefficient, by 4.90 % to 36.20 %, 11.80 % to 26.33 %, and 4.82 % to 13.55 %, respectively. In brief, DeepDRP is a reliable model for IDP prediction and is freely available at https://github.com/ZX-COLA/DeepDRP.
•The combined use of integrated-view features can enhance the prediction of intrinsically disordered regions.
•The transformer-enhanced embeddings are of paramount importance to IDP prediction.
•The TimeDistributed strategy enables information propagation across time steps, which is useful for IDP prediction.
•DeepDRP uses a Bi-LSTM to capture the bidirectional contextual information of proteins.
doi_str_mv 10.1016/j.ijbiomac.2023.127390
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2877393018</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0141813023042873</els_id><sourcerecordid>2877393018</sourcerecordid><originalsourceid>FETCH-LOGICAL-c345t-3b187b4df059629dd2f2bb154a9153863c2a77665440cfd99ce97a302c752fc13</originalsourceid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2877393018</pqid></control><display><type>article</type><title>DeepDRP: Prediction of intrinsically disordered regions based on integrated view deep learning architecture from transformer-enhanced and protein information</title><source>Access via ScienceDirect (Elsevier)</source><creator>Yang, Zexi ; Wang, Yan ; Ni, Xinye ; Yang, Sen</creator><identifier>ISSN: 0141-8130</identifier><identifier>EISSN: 1879-0003</identifier><identifier>DOI: 10.1016/j.ijbiomac.2023.127390</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Intrinsically disordered proteins prediction ; TimeDistributed strategy ; Transformer-enhanced features</subject><ispartof>International journal of biological macromolecules, 2023-12, Vol.253, p.127390-127390, Article 127390</ispartof><rights>2023 Elsevier B.V.</rights><lds50>peer_reviewed</lds50></display><addata><date>2023-12-31</date><pub>Elsevier B.V</pub><doi>10.1016/j.ijbiomac.2023.127390</doi><tpages>1</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0141-8130
ispartof International journal of biological macromolecules, 2023-12, Vol.253, p.127390-127390, Article 127390
issn 0141-8130
1879-0003
language eng
recordid cdi_proquest_miscellaneous_2877393018
source Access via ScienceDirect (Elsevier)
subjects Intrinsically disordered proteins prediction
TimeDistributed strategy
Transformer-enhanced features
title DeepDRP: Prediction of intrinsically disordered regions based on integrated view deep learning architecture from transformer-enhanced and protein information
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T09%3A50%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DeepDRP:%20Prediction%20of%20intrinsically%20disordered%20regions%20based%20on%20integrated%20view%20deep%20learning%20architecture%20from%20transformer-enhanced%20and%20protein%20information&rft.jtitle=International%20journal%20of%20biological%20macromolecules&rft.au=Yang,%20Zexi&rft.date=2023-12-31&rft.volume=253&rft.spage=127390&rft.epage=127390&rft.pages=127390-127390&rft.artnum=127390&rft.issn=0141-8130&rft.eissn=1879-0003&rft_id=info:doi/10.1016/j.ijbiomac.2023.127390&rft_dat=%3Cproquest_cross%3E2877393018%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2877393018&rft_id=info:pmid/&rft_els_id=S0141813023042873&rfr_iscdi=true