EvoRator2: Predicting Site-specific Amino Acid Substitutions Based on Protein Structural Information Using Deep Learning

Detailed description

Bibliographic details
Published in: Journal of Molecular Biology, 2023-07, Vol. 435 (14), p. 168155-168155, Article 168155
Main authors: Nagar, Natan; Tubiana, Jérôme; Loewenthal, Gil; Wolfson, Haim J.; Ben Tal, Nir; Pupko, Tal
Format: Article
Language: English
Description
Summary:
• Predicting amino acid substitutions aids multiple and diverse applications in biomedicine, including rational drug design, protein engineering, and identification of pathogenic missense mutations.
• EvoRator2 exploits deep learning and protein structural information to predict per-site sets of tolerated amino acids.
• EvoRator2 extracts diverse features from protein structures at different scales, from atoms to amino acids.
• EvoRator2 can analyze proteins for which only a few or no homologous proteins can be found, e.g., orphan and de novo designed proteins.
• EvoRator2 achieves near-state-of-the-art performance for the prediction of the effect of mutations in deep mutation scanning experiments.

Multiple sequence alignments (MSAs) are the workhorse of molecular evolution and structural biology research. From MSAs, the amino acids that are tolerated at each site during protein evolution can be inferred. However, little is known regarding the repertoire of tolerated amino acids in proteins when only a few or no sequence homologs are available, such as orphan and de novo designed proteins. Here we present EvoRator2, a deep-learning algorithm trained on over 15,000 protein structures that can predict which amino acids are tolerated at any given site, based exclusively on protein structural information mined from atomic coordinate files. We show that EvoRator2 obtained satisfying results for the prediction of position-weighted scoring matrices (PSSM). We further show that EvoRator2 obtained near-state-of-the-art performance on proteins with high-quality structures in predicting the effect of mutations in deep mutation scanning (DMS) experiments and that, for certain DMS targets, EvoRator2 outperformed state-of-the-art methods. We also show that combining EvoRator2's predictions with those obtained by a state-of-the-art deep-learning method that accounts for the information in the MSA improved the prediction of the effect of mutations in DMS experiments in terms of both accuracy and stability. EvoRator2 is designed to predict which amino-acid substitutions are tolerated in proteins that lack many homologous sequences, including orphan or de novo designed proteins. We implemented our approach in the EvoRator web server (https://evorator.tau.ac.il).
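
The summary notes that the amino acids tolerated at each site can be inferred from an MSA. As a rough illustration of that baseline idea only (not of EvoRator2 itself, which works from structure), the sketch below counts per-column amino-acid frequencies in a toy alignment and lists the residues whose frequency passes a cutoff. The function names, the toy sequences, and the 0.1 threshold are all hypothetical choices made for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def site_frequencies(msa, pseudocount=0.0):
    """Return one dict per alignment column mapping amino acid -> frequency.

    Gaps and non-standard characters are ignored. `pseudocount` is an
    optional smoothing term added to every residue count (assumption).
    """
    if not msa:
        raise ValueError("MSA is empty")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        if total == 0:
            # Column contains only gaps: no evidence for any residue.
            profile.append({aa: 0.0 for aa in AMINO_ACIDS})
            continue
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


def tolerated(profile, threshold=0.1):
    """List, per site, the residues whose frequency is at least `threshold`."""
    return [
        sorted(aa for aa, freq in site.items() if freq >= threshold)
        for site in profile
    ]


# Toy alignment, invented purely for illustration.
toy_msa = ["MKVL", "MRVL", "MKIL", "MK-L"]
profile = site_frequencies(toy_msa)
print(tolerated(profile))
```

With few or no homologs the counts in such a profile become unreliable or empty, which is the situation the summary says EvoRator2 is meant to address.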
ISSN: 0022-2836, 1089-8638
DOI: 10.1016/j.jmb.2023.168155