Recurrent neural network-based prediction of O-GlcNAcylation sites in mammalian proteins
Published in: | Computers & chemical engineering 2024-10, Vol.189, p.108818, Article 108818 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | O-GlcNAcylation has the potential to be an important target for therapeutics, but a motif or an algorithm to reliably predict O-GlcNAcylation sites is not available. Current predictive models are insufficient as they fail to generalize, and many are no longer available. This article constructs recurrent neural network models to predict O-GlcNAcylation sites based on protein sequences. Different datasets are evaluated separately and assessed in terms of strengths and issues. Within a given dataset, results are robust to changes in cross-validation and test data as determined by nested validation. The best model achieves an F1 score of 36% (more than 3.5-fold greater than the previous best model) and a Matthews Correlation Coefficient of 35% (more than 4.5-fold greater than the previous best model), and, for the F1 score, 7.6-fold higher than when not using any model. Shapley values are used to interpret the model’s predictions and provide biological insight into O-GlcNAcylation.
• O-GlcNAcylation has the potential to be an important target for therapeutics.
• Recurrent neural networks predict O-GlcNAcylation sites based on protein sequences.
• The model achieves 3.5-fold improvement in the F1 score over published models.
• The model achieves 4.5-fold improvement in the Matthews Correlation Coefficient.
• Shapley coefficients provide interpretability and insight on the model predictions. |
---|---|
ISSN: | 0098-1354 |
DOI: | 10.1016/j.compchemeng.2024.108818 |
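
The abstract reports its results as an F1 score and a Matthews Correlation Coefficient (MCC). As a reading aid, the following minimal sketch shows how both metrics are computed from a binary confusion matrix. The function names and all counts are hypothetical and are not taken from the article. The all-positive predictor at the end is only one possible reading of "not using any model"; whether the article uses that baseline is an assumption.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews Correlation Coefficient; returns 0.0 when the denominator is zero."""
    num = tp * tn - fp * fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / denom if denom else 0.0


# Hypothetical counts for a rare-positive task (illustrative only,
# not data from the article): 90 true sites among 10,000 candidates.
tp, fp, fn, tn = 30, 50, 60, 9860
print(f"model     F1={f1_score(tp, fp, fn):.3f}  MCC={mcc(tp, fp, fn, tn):.3f}")

# Assumed "no model" baseline: label every candidate site as positive.
# Recall is then 1 and precision equals the positive rate.
positives, negatives = tp + fn, fp + tn
print(f"baseline  F1={f1_score(positives, negatives, 0):.3f}  "
      f"MCC={mcc(positives, negatives, 0, 0):.3f}")
```

With positives this rare, plain accuracy is dominated by the negative class, which is why imbalance-aware metrics such as F1 and MCC are the natural choice for this kind of site prediction.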