Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the impor...
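
A minimal sketch of how the idea summarized in the abstract could be turned into a score, under stated assumptions. The abstract (given in full in the description field) says that the Boltzmann distribution connects energy with the conformational distribution, and that Bayes' theorem replaces the intractable conformational term with the sequence log-likelihood of an inverse folding model, evaluated for both the bound and the unbound state. One plausible reading is $\Delta \Delta G \approx -kT \, [(\log p(s_{mut} \mid x_{bound}) - \log p(s_{mut} \mid x_{unbound})) - (\log p(s_{wt} \mid x_{bound}) - \log p(s_{wt} \mid x_{unbound}))]$, assuming the backbone structure is shared by wild type and mutant so that structure-only terms cancel. The function name, argument names, sign convention and the `kT` default below are hypothetical and are not taken from the paper.

```python
def ddg_from_log_likelihoods(ll_mut_bound, ll_mut_unbound,
                             ll_wt_bound, ll_wt_unbound, kT=1.0):
    """Estimate ddG from four sequence log-likelihoods (illustrative only).

    Each ll_* argument is assumed to be log p(sequence | structure) from
    some inverse folding model: the mutant or wild-type sequence scored on
    the bound (complex) or unbound (separated chains) structure.
    kT is an assumed energy scale; 1.0 leaves the result in units of kT.
    """
    # How much more likely the mutant sequence is in the bound context.
    mut_binding_term = ll_mut_bound - ll_mut_unbound
    # The same quantity for the wild-type sequence.
    wt_binding_term = ll_wt_bound - ll_wt_unbound
    # Assumed convention: positive ddG means the mutation weakens binding.
    return -kT * (mut_binding_term - wt_binding_term)


if __name__ == "__main__":
    # Made-up log-likelihoods, chosen only to exercise the arithmetic.
    print(ddg_from_log_likelihoods(-10.0, -12.0, -9.0, -12.5))
```

Scoring the unbound state separately is what distinguishes this from using only the bound-complex likelihood: a mutation that is unlikely in both contexts contributes little, while one that is specifically disfavored in the complex contributes a positive value.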

Bibliographic Details
Main authors:	Jiao, Xiaoran; Mao, Weian; Jin, Wengong; Yang, Peiyuan; Chen, Hao; Shen, Chunhua
Format:	Article
Language:	English
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Computational Engineering, Finance, and Science; Quantitative Biology - Biomolecules

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Jiao, Xiaoran; Mao, Weian; Jin, Wengong; Yang, Peiyuan; Chen, Hao; Shen, Chunhua
description	Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.
doi_str_mv	10.48550/arxiv.2410.09543
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2410_09543</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2410_09543</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2410_095433</originalsourceid><addsrcrecordid>eNqFjsEKgkAURWfTIqoPaNX7Ac1SoZYVSi2EFu3l4TxlYHwTM5NUX5-K-1YHLofLEWK9i8LkkKbRFu1bdeE-6YfomCbxXOiz0f7bInNw0qphknDjjqwjyI2WihsojCQN6ADhbkmqyhsLpobi5dErw6ghq2uqvAPDvWI8KQ4m9m-eLFaD6JZiVqN2tJq4EJs8e1yuwdhVPq1q0X7Koa8c--L_xg_coEbt</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions</title><source>arXiv.org</source><creator>Jiao, Xiaoran ; Mao, Weian ; Jin, Wengong ; Yang, Peiyuan ; Chen, Hao ; Shen, Chunhua</creator><creatorcontrib>Jiao, Xiaoran ; Mao, Weian ; Jin, Wengong ; Yang, Peiyuan ; Chen, Hao ; Shen, Chunhua</creatorcontrib><description>Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. 
Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.</description><identifier>DOI: 10.48550/arxiv.2410.09543</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computational Engineering, Finance, and Science ; Quantitative Biology - Biomolecules</subject><creationdate>2024-10</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2410.09543$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2410.09543$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Jiao, Xiaoran</creatorcontrib><creatorcontrib>Mao, Weian</creatorcontrib><creatorcontrib>Jin, Wengong</creatorcontrib><creatorcontrib>Yang, Peiyuan</creatorcontrib><creatorcontrib>Chen, Hao</creatorcontrib><creatorcontrib>Shen, Chunhua</creatorcontrib><title>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein 
Interactions</title><description>Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. 
Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computational Engineering, Finance, and Science</subject><subject>Quantitative Biology - Biomolecules</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNqFjsEKgkAURWfTIqoPaNX7Ac1SoZYVSi2EFu3l4TxlYHwTM5NUX5-K-1YHLofLEWK9i8LkkKbRFu1bdeE-6YfomCbxXOiz0f7bInNw0qphknDjjqwjyI2WihsojCQN6ADhbkmqyhsLpobi5dErw6ghq2uqvAPDvWI8KQ4m9m-eLFaD6JZiVqN2tJq4EJs8e1yuwdhVPq1q0X7Koa8c--L_xg_coEbt</recordid><startdate>20241012</startdate><enddate>20241012</enddate><creator>Jiao, Xiaoran</creator><creator>Mao, Weian</creator><creator>Jin, Wengong</creator><creator>Yang, Peiyuan</creator><creator>Chen, Hao</creator><creator>Shen, Chunhua</creator><scope>AKY</scope><scope>ALC</scope><scope>GOX</scope></search><sort><creationdate>20241012</creationdate><title>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions</title><author>Jiao, Xiaoran ; Mao, Weian ; Jin, Wengong ; Yang, Peiyuan ; Chen, Hao ; Shen, Chunhua</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2410_095433</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computational Engineering, Finance, and Science</topic><topic>Quantitative Biology - Biomolecules</topic><toplevel>online_resources</toplevel><creatorcontrib>Jiao, Xiaoran</creatorcontrib><creatorcontrib>Mao, Weian</creatorcontrib><creatorcontrib>Jin, Wengong</creatorcontrib><creatorcontrib>Yang, Peiyuan</creatorcontrib><creatorcontrib>Chen, 
Hao</creatorcontrib><creatorcontrib>Shen, Chunhua</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Quantitative Biology</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Jiao, Xiaoran</au><au>Mao, Weian</au><au>Jin, Wengong</au><au>Yang, Peiyuan</au><au>Chen, Hao</au><au>Shen, Chunhua</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions</atitle><date>2024-10-12</date><risdate>2024</risdate><abstract>Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. 
Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.</abstract><doi>10.48550/arxiv.2410.09543</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2410.09543
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2410_09543
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Computational Engineering, Finance, and Science; Quantitative Biology - Biomolecules
title	Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T13%3A06%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Boltzmann-Aligned%20Inverse%20Folding%20Model%20as%20a%20Predictor%20of%20Mutational%20Effects%20on%20Protein-Protein%20Interactions&rft.au=Jiao,%20Xiaoran&rft.date=2024-10-12&rft_id=info:doi/10.48550/arxiv.2410.09543&rft_dat=%3Carxiv_GOX%3E2410_09543%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true