LectinOracle: A Generalizable Deep Learning Model for Lectin–Glycan Binding Prediction

Bibliographic details
Published in: Advanced Science 2022-01, Vol. 9 (1), p. e2103807-n/a
Main authors: Lundstrøm, Jon, Korhonen, Emma, Lisacek, Frédérique, Bojar, Daniel
Format: Article
Language: English
Description
Summary: Glycan‐binding proteins, or lectins, abound in nature, with roles ranging from bacterial cell adhesion and viral cell entry to human innate immunity. Widely used as staining and characterization reagents in cell biology and crucial for understanding the interactions in biological systems, lectins are a focal point of study in glycobiology. Yet the sheer breadth and depth of their specificity for diverse oligosaccharide motifs have made the study of lectins largely piecemeal, with few options to generalize. Here, LectinOracle, a model that combines transformer‐based representations for proteins with graph convolutional neural networks for glycans to predict their interaction, is presented. Using a curated data set of 564,647 unique protein–glycan interactions, it is shown that LectinOracle predictions agree with literature‐annotated specificities for a wide range of lectins. Using a range of specialized glycan arrays, it is shown that LectinOracle predictions generalize to new glycans and lectins, with qualitative and quantitative agreement with experimental data. It is further demonstrated that LectinOracle can be used to improve lectin classification, accelerate the directed evolution of lectins, predict epidemiological outcomes in the context of influenza virus, and analyze whole lectomes in host–microbe interactions. It is envisioned that the platform presented here will advance the study of both lectins and their role in (glyco)biology. Using a large, curated data set of protein–carbohydrate interactions, a new deep learning model, LectinOracle, is introduced. LectinOracle predicts protein–carbohydrate interactions from their sequences and generalizes to new proteins, carbohydrates, and contexts. Applying LectinOracle to the microbiome and virus epidemics demonstrates its utility for analyzing carbohydrate‐binding proteins in health and disease.
ISSN: 2198-3844
DOI: 10.1002/advs.202103807