Multi-modal deep learning enables efficient and accurate annotation of enzymatic active sites

Annotating active sites in enzymes is crucial for advancing multiple fields including drug discovery, disease research, enzyme engineering, and synthetic biology. Despite the development of numerous automated annotation algorithms, a significant trade-off between speed and accuracy limits their larg...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Nature communications 2024-08, Vol.15 (1), p.7348-20
Hauptverfasser: Wang, Xiaorui, Yin, Xiaodan, Jiang, Dejun, Zhao, Huifeng, Wu, Zhenxing, Zhang, Odin, Wang, Jike, Li, Yuquan, Deng, Yafeng, Liu, Huanxiang, Luo, Pei, Han, Yuqiang, Hou, Tingjun, Yao, Xiaojun, Hsieh, Chang-Yu
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Annotating active sites in enzymes is crucial for advancing multiple fields including drug discovery, disease research, enzyme engineering, and synthetic biology. Despite the development of numerous automated annotation algorithms, a significant trade-off between speed and accuracy limits their large-scale practical applications. We introduce EasIFA, an enzyme active site annotation algorithm that fuses latent enzyme representations from the Protein Language Model and 3D structural encoder, and then aligns protein-level information with the knowledge of enzymatic reactions using a multi-modal cross-attention framework. EasIFA outperforms BLASTp with a 10-fold speed increase and improved recall, precision, f1 score, and MCC by 7.57%, 13.08%, 9.68%, and 0.1012, respectively. It also surpasses empirical-rule-based algorithm and other state-of-the-art deep learning annotation method based on PSSM features, achieving a speed increase ranging from 650 to 1400 times while enhancing annotation quality. This makes EasIFA a suitable replacement for conventional tools in both industrial and academic settings. EasIFA can also effectively transfer knowledge gained from coarsely annotated enzyme databases to smaller, high-precision datasets, highlighting its ability to model sparse and high-quality databases. Additionally, EasIFA shows potential as a catalytic site monitoring tool for designing enzymes with desired functions beyond their natural distribution. Wang et al. propose EasIFA, an efficient enzyme active site annotation algorithm, to advance various fields including drug discovery, disease research, enzyme engineering, and synthetic biology.
ISSN:2041-1723
2041-1723
DOI:10.1038/s41467-024-51511-6