PFmulDL: a novel strategy enabling multi-class and multi-label protein function annotation by integrating diverse deep learning methods


Bibliographic details
Published in: Computers in biology and medicine 2022-06, Vol.145, p.105465-105465, Article 105465
Main authors: Xia, Weiqi, Zheng, Lingyan, Fang, Jiebin, Li, Fengcheng, Zhou, Ying, Zeng, Zhenyu, Zhang, Bing, Li, Zhaorong, Li, Honglin, Zhu, Feng
Format: Article
Language: English
Description
Summary: Bioinformatic annotation of protein function is essential but extremely complex, and developing effective prediction methods demands extensive effort. However, existing methods tend to over-represent families containing many proteins by misclassifying proteins from families containing few. In other words, the ability of existing methods to annotate proteins in the ‘rare classes’ remains limited. Herein, a new protein function annotation strategy, PFmulDL, which integrates multiple deep learning methods, was constructed. First, a recurrent neural network was integrated, for the first time, with a convolutional neural network to facilitate function annotation. Second, a transfer learning method was introduced into model construction to further improve prediction performance. Third, based on the latest Gene Ontology data, the newly constructed model could annotate the largest number of protein families compared with existing methods. Finally, the model was found capable of significantly elevating prediction performance for the ‘rare classes’ without sacrificing that for the ‘major classes’. Given the emerging need to improve prediction performance for proteins in ‘rare classes’, this new strategy is expected to become an essential complement to existing methods for protein function prediction. All models and source code are freely available to all users at: https://github.com/idrblab/PFmulDL.
• A novel protein function annotation strategy was constructed by integrating a recurrent neural network method with a convolutional neural network.
• A transfer learning method was introduced into model construction to further improve the performance of protein function annotation.
• The newly constructed model could annotate the largest number of protein families compared with existing methods.
• The newly constructed model was found capable of significantly elevating prediction performance for the ‘rare classes’ without sacrificing that for the ‘major classes’.
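
The concern raised in the summary, that strong results on large families can hide failures on ‘rare classes’, can be illustrated with a small multi-label scoring sketch. This is not the authors' evaluation code, and the record does not state which metrics the paper uses; the choice of per-class, macro-averaged, and micro-averaged F1 is an assumption, and the label names `major_term` and `rare_term` and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def per_class_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> Dict[str, float]:
    """F1 score for each label in a multi-label setting."""
    labels = set().union(*y_true, *y_pred)
    scores = {}
    for label in sorted(labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """F1 over pooled label decisions: frequent classes dominate."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: ten proteins, one of which also carries a rare label.
y_true = [{"major_term"} for _ in range(9)] + [{"major_term", "rare_term"}]
# A predictor that always outputs the frequent label and never the rare one.
y_pred = [{"major_term"} for _ in range(10)]

print(per_class_f1(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
print("micro F1:", micro_f1(y_true, y_pred))
```

On this toy data the pooled (micro) score stays high even though the rare label is never predicted, while the per-class and macro scores expose the failure. Reporting both views is one way to check that gains on rare classes do not come at the expense of major ones.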
ISSN: 0010-4825, 1879-0534
DOI: 10.1016/j.compbiomed.2022.105465