GNNGO3D: Protein Function Prediction Based on 3D Structure and Functional Hierarchy Learning

Protein sequences accumulate in large quantities, and the traditional method of annotating protein function by experiment has been unable to bridge the gap between annotated proteins and unannotated proteins. Machine learning-based protein function prediction is an effective approach to solve this p...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on knowledge and data engineering 2024-08, Vol.36 (8), p.3867-3878
Hauptverfasser: Zhang, Liyuan, Jiang, Yongquan, Yang, Yan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Protein sequences accumulate in large quantities, and the traditional method of annotating protein function by experiment has been unable to bridge the gap between annotated proteins and unannotated proteins. Machine learning-based protein function prediction is an effective approach to solve this problem. Most of the existing methods only use the protein sequence but ignore the three-dimensional structure which is closely related to the protein function. And the hierarchy of protein functions is not adequately considered. To solve this problem, we propose a graph neural network (GNNGO3D) that combines the three-dimensional structure and functional hierarchy learning. GNNGO3D simultaneously uses three kinds of information: protein sequence, tertiary structure, and hierarchical relationship of protein function to predict protein function. The novelty of GNNGO3D lies in that it integrates the learning of functional level information into the method of predicting protein function by using tertiary structure information, fully learning the relationship between protein functions, and helping to better predict protein function. Experimental results show that our method is superior to existing methods for predicting protein function based on sequence and structure.
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2023.3331005