Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Predicting the change in binding free energy ($\Delta \Delta G$) is crucial
for understanding and modulating protein-protein interactions, which are
critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$
data, existing methods focus on pre-training, while neglecting the importance
of alignment. In this work, we propose the Boltzmann Alignment technique to
transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$
prediction. We begin by analyzing the thermodynamic definition of $\Delta
\Delta G$ and introducing the Boltzmann distribution to connect energy with
protein conformational distribution. However, the protein conformational
distribution is intractable; therefore, we employ Bayes' theorem to circumvent
direct estimation and instead utilize the log-likelihood provided by protein
inverse folding models for $\Delta \Delta G$ estimation. Compared to previous
inverse folding-based methods, our method explicitly accounts for the unbound
state of the protein complex in the $\Delta \Delta G$ thermodynamic cycle,
introducing a physical inductive bias and achieving both supervised and
unsupervised state-of-the-art (SoTA) performance. Experimental results on
SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201
(unsupervised) and 0.5134 (supervised), significantly surpassing the previously
reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we
demonstrate the capability of our method on binding energy prediction,
protein-protein docking and antibody optimization tasks. |
DOI: | 10.48550/arxiv.2410.09543 |
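
The thermodynamic argument in the summary can be illustrated with a minimal sketch. It is a reading of the abstract under stated assumptions, not the authors' implementation. Assume the binding free energy is $\Delta G = -k_B T \log \frac{p(X_{bound} \mid S)}{p(X_{unbound} \mid S)}$ and that Bayes' theorem rewrites each $p(X \mid S)$ as $p(S \mid X)\,p(X)/p(S)$. If the mutation is further assumed to leave the backbone structures unchanged, the $p(X)$ terms cancel in $\Delta \Delta G = \Delta G_{mut} - \Delta G_{wt}$, leaving only sequence log-likelihoods $\log p(S \mid X)$ of the kind an inverse folding model provides. The function name `boltzmann_ddg`, its parameters, the fixed temperature, and all numbers below are hypothetical; how the unbound-state likelihood is obtained (for example, by scoring each binding partner separately) is also an assumption.

```python
import math

# Gas constant in kcal/(mol*K), used here as k_B on a per-mole basis.
K_B = 0.0019872041


def boltzmann_ddg(ll_mut_bound, ll_mut_unbound,
                  ll_wt_bound, ll_wt_unbound,
                  temperature=298.15):
    """Estimate ddG (kcal/mol) from four sequence log-likelihoods.

    Each argument is a hypothetical log p(S | X) value, e.g. produced by an
    inverse folding model for the mutant or wild-type sequence given the
    bound or unbound structure. Assumes the structure prior p(X) is the
    same for mutant and wild type, so it cancels in the difference.
    """
    kt = K_B * temperature
    # dG = -kT * [log p(S | X_bound) - log p(S | X_unbound)]
    dg_mut = -kt * (ll_mut_bound - ll_mut_unbound)
    dg_wt = -kt * (ll_wt_bound - ll_wt_unbound)
    # Positive values mean the mutation weakens binding under this convention.
    return dg_mut - dg_wt


if __name__ == "__main__":
    # Made-up log-likelihoods, for illustration only.
    ddg = boltzmann_ddg(ll_mut_bound=-123.0, ll_mut_unbound=-126.0,
                        ll_wt_bound=-120.0, ll_wt_unbound=-125.0)
    print(f"estimated ddG: {ddg:.3f} kcal/mol")
```

Because the unbound-state terms are subtracted explicitly, a mutation that lowers the sequence likelihood equally in the bound and unbound structures contributes nothing to the estimate, which is the physical inductive bias the summary refers to. Treating $k_B T$ as a fixed physical constant is a simplification made here; in practice it could equally be regarded as a scale factor to be fitted.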