Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the impor...

Bibliographic details
Main authors: Jiao, Xiaoran; Mao, Weian; Jin, Wengong; Yang, Peiyuan; Chen, Hao; Shen, Chunhua
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Jiao, Xiaoran
Mao, Weian
Jin, Wengong
Yang, Peiyuan
Chen, Hao
Shen, Chunhua
description Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with the protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of the protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking, and antibody optimization tasks.
doi_str_mv 10.48550/arxiv.2410.09543
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2410_09543</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2410_09543</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2410_095433</originalsourceid><addsrcrecordid>eNqFjsEKgkAURWfTIqoPaNX7Ac1SoZYVSi2EFu3l4TxlYHwTM5NUX5-K-1YHLofLEWK9i8LkkKbRFu1bdeE-6YfomCbxXOiz0f7bInNw0qphknDjjqwjyI2WihsojCQN6ADhbkmqyhsLpobi5dErw6ghq2uqvAPDvWI8KQ4m9m-eLFaD6JZiVqN2tJq4EJs8e1yuwdhVPq1q0X7Koa8c--L_xg_coEbt</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions</title><source>arXiv.org</source><creator>Jiao, Xiaoran ; Mao, Weian ; Jin, Wengong ; Yang, Peiyuan ; Chen, Hao ; Shen, Chunhua</creator><creatorcontrib>Jiao, Xiaoran ; Mao, Weian ; Jin, Wengong ; Yang, Peiyuan ; Chen, Hao ; Shen, Chunhua</creatorcontrib><description>Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. 
Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Futhermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.</description><identifier>DOI: 10.48550/arxiv.2410.09543</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computational Engineering, Finance, and Science ; Quantitative Biology - Biomolecules</subject><creationdate>2024-10</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2410.09543$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2410.09543$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Jiao, Xiaoran</creatorcontrib><creatorcontrib>Mao, Weian</creatorcontrib><creatorcontrib>Jin, Wengong</creatorcontrib><creatorcontrib>Yang, Peiyuan</creatorcontrib><creatorcontrib>Chen, Hao</creatorcontrib><creatorcontrib>Shen, Chunhua</creatorcontrib><title>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein 
Interactions</title><description>Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. 
Futhermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computational Engineering, Finance, and Science</subject><subject>Quantitative Biology - Biomolecules</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNqFjsEKgkAURWfTIqoPaNX7Ac1SoZYVSi2EFu3l4TxlYHwTM5NUX5-K-1YHLofLEWK9i8LkkKbRFu1bdeE-6YfomCbxXOiz0f7bInNw0qphknDjjqwjyI2WihsojCQN6ADhbkmqyhsLpobi5dErw6ghq2uqvAPDvWI8KQ4m9m-eLFaD6JZiVqN2tJq4EJs8e1yuwdhVPq1q0X7Koa8c--L_xg_coEbt</recordid><startdate>20241012</startdate><enddate>20241012</enddate><creator>Jiao, Xiaoran</creator><creator>Mao, Weian</creator><creator>Jin, Wengong</creator><creator>Yang, Peiyuan</creator><creator>Chen, Hao</creator><creator>Shen, Chunhua</creator><scope>AKY</scope><scope>ALC</scope><scope>GOX</scope></search><sort><creationdate>20241012</creationdate><title>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions</title><author>Jiao, Xiaoran ; Mao, Weian ; Jin, Wengong ; Yang, Peiyuan ; Chen, Hao ; Shen, Chunhua</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2410_095433</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computational Engineering, Finance, and Science</topic><topic>Quantitative Biology - Biomolecules</topic><toplevel>online_resources</toplevel><creatorcontrib>Jiao, Xiaoran</creatorcontrib><creatorcontrib>Mao, Weian</creatorcontrib><creatorcontrib>Jin, Wengong</creatorcontrib><creatorcontrib>Yang, Peiyuan</creatorcontrib><creatorcontrib>Chen, 
Hao</creatorcontrib><creatorcontrib>Shen, Chunhua</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Quantitative Biology</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Jiao, Xiaoran</au><au>Mao, Weian</au><au>Jin, Wengong</au><au>Yang, Peiyuan</au><au>Chen, Hao</au><au>Shen, Chunhua</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions</atitle><date>2024-10-12</date><risdate>2024</risdate><abstract>Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. 
Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Futhermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.</abstract><doi>10.48550/arxiv.2410.09543</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2410.09543
ispartof
issn
language eng
recordid cdi_arxiv_primary_2410_09543
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computational Engineering, Finance, and Science
Quantitative Biology - Biomolecules
title Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T13%3A06%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Boltzmann-Aligned%20Inverse%20Folding%20Model%20as%20a%20Predictor%20of%20Mutational%20Effects%20on%20Protein-Protein%20Interactions&rft.au=Jiao,%20Xiaoran&rft.date=2024-10-12&rft_id=info:doi/10.48550/arxiv.2410.09543&rft_dat=%3Carxiv_GOX%3E2410_09543%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
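
The estimation idea summarized in the description field can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formula, so the code assumes that the binding free energy is proportional to the negative log-ratio of bound to unbound probabilities (Boltzmann distribution), that Bayes' theorem lets sequence log-likelihoods from an inverse folding model stand in for the intractable conformational probabilities, and that the structure priors cancel between wild type and mutant. The function name `ddg_estimate`, its parameters, and the constant `K_B_T` are hypothetical; the four log-likelihoods would come from some inverse folding model, which is not shown.

```python
# Hypothetical sketch of a Boltzmann-style ddG estimate from log-likelihoods.
# Assumption: dG ~ -kT * [log p(seq | bound) - log p(seq | unbound)],
# with structure priors cancelling between wild type and mutant.

GAS_CONSTANT = 1.987204e-3   # kcal / (mol * K)
TEMPERATURE = 298.15         # K, assumed room temperature
K_B_T = GAS_CONSTANT * TEMPERATURE  # about 0.59 kcal/mol


def ddg_estimate(wt_bound, wt_unbound, mut_bound, mut_unbound, kt=K_B_T):
    """Estimate ddG = dG(mutant) - dG(wild type) in kcal/mol.

    Each argument is a sequence log-likelihood (natural log) that an
    inverse folding model assigns to the wild-type or mutant sequence,
    conditioned on the bound or unbound backbone structure.
    """
    dg_wt = -kt * (wt_bound - wt_unbound)     # wild-type binding term
    dg_mut = -kt * (mut_bound - mut_unbound)  # mutant binding term
    return dg_mut - dg_wt


if __name__ == "__main__":
    # Made-up log-likelihoods, for illustration only.
    value = ddg_estimate(wt_bound=-10.0, wt_unbound=-12.0,
                         mut_bound=-11.0, mut_unbound=-12.0)
    print(f"estimated ddG: {value:.3f} kcal/mol")
```

Under this sign convention a positive value means the mutant is less favored in the bound state than the wild type. Subtracting the unbound-state term is what distinguishes this sketch from scoring the bound complex alone.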