Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining

We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on te...
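
As a rough sketch of the kind of exact differential relationships the abstract refers to, the standard textbook identities for a coarse-grained free energy can be written as follows. The notation is an assumption for illustration and is not taken from this record: $r$ denotes all-atom coordinates, $R = M(r)$ a linear coarse-graining map, $U(r)$ the all-atom potential energy, $\mathcal{F}(r)$ the all-atom force mapped onto the coarse-grained sites, $\beta = 1/k_B T$, and $\langle \cdot \rangle_R$ an all-atom ensemble average conditioned on $M(r) = R$.

```latex
% Coarse-grained free energy (potential of mean force), up to an R-independent constant
A(R,\beta) = -\frac{1}{\beta}\,\ln \int \mathrm{d}r\; \delta\big(M(r) - R\big)\, e^{-\beta U(r)}

% Gradient in the coarse-grained coordinates: the mean force (the force-matching target)
-\nabla_R A(R,\beta) = \big\langle \mathcal{F}(r) \big\rangle_R

% First derivative in inverse temperature: the conditional mean all-atom potential energy
\frac{\partial\,(\beta A)}{\partial \beta} = \big\langle U \big\rangle_R

% Second derivative: minus the conditional variance of the potential energy
\frac{\partial^2 (\beta A)}{\partial \beta^2} = -\Big( \big\langle U^2 \big\rangle_R - \big\langle U \big\rangle_R^2 \Big)
```

Read this way, $-\beta A$ acts as a cumulant generating function: its derivatives with respect to $-\beta$ are the cumulants of $U$, which can be estimated from all-atom samples and used as additional regression targets alongside forces.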

Bibliographic details
Main authors:	Duschatko, Blake R; Fu, Xiang; Owen, Cameron; Xie, Yu; Musaelian, Albert; Jaakkola, Tommi; Kozinsky, Boris
Format:	Article
Language:	eng
Subjects:	Physics - Chemical Physics; Physics - Computational Physics

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Duschatko, Blake R; Fu, Xiang; Owen, Cameron; Xie, Yu; Musaelian, Albert; Jaakkola, Tommi; Kozinsky, Boris
description	We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on temperature and parameters and by exploiting exact differential thermodynamic relationships between the free energy, ensemble averages, and response properties. Formally, we derive an approach for learning high-dimensional cumulant generating functions using statistical estimates of their derivatives, which are observable cumulants of the underlying random variable. The proposed formalism opens ways to resolve several outstanding challenges in bottom-up molecular coarse graining dealing with multiple minima and state dependence. This is realized by using additional differential relationships in the loss function to significantly improve the learning of free energies, while exactly preserving the Boltzmann distribution governing the corresponding fine-grain all-atom system. As an example, we go beyond the standard force-matching procedure to demonstrate how leveraging the thermodynamic relationship between free energy and values of ensemble averaged all-atom potential energy improves the learning efficiency and accuracy of the free energy model. The result is significantly better sampling statistics of structural distribution functions. The theoretical framework presented here is demonstrated via implementations in both kernel-based and neural network machine learning regression methods and opens new ways to train accurate machine learning models for studying thermodynamic and response properties of complex molecular systems.
doi_str_mv	10.48550/arxiv.2405.19386
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2405_19386</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2405_19386</sourcerecordid><originalsourceid>FETCH-LOGICAL-a676-9af4c32f16b669d85fe9c40db6f6b52a8c86fa386930b337f2db02a01eb7e9cf3</originalsourceid><addsrcrecordid>eNotj81OhDAUhdm4MKMP4Mq-AFgoFFganL-EyWzYk1t6yzQprbk4Rt5eZnR1TnJOvuSLopeUJ3lVFPwN6Md-J1nOiyStRSUfI-ouSFPQi4fJDuDcwo7eBJpQs9PVfdl1A8daBPLWjywYdrDjJf6wE_rZBr-OO0JkW480LuwUNLqZWb82h8PVAbEmAM3I9gT2xniKHgy4GZ__cxN1u23XHOL2vD82720MspRxDSYfRGZSqaSsdVUYrIecayWNVEUG1VBJA6tDLbgSojSZVjwDnqIq16cRm-j1D3uX7j_JTkBLf5Pv7_LiFyB_VvQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining</title><source>arXiv.org</source><creator>Duschatko, Blake R ; Fu, Xiang ; Owen, Cameron ; Xie, Yu ; Musaelian, Albert ; Jaakkola, Tommi ; Kozinsky, Boris</creator><creatorcontrib>Duschatko, Blake R ; Fu, Xiang ; Owen, Cameron ; Xie, Yu ; Musaelian, Albert ; Jaakkola, Tommi ; Kozinsky, Boris</creatorcontrib><description>We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on temperature and parameters and by exploiting exact differential thermodynamic relationships between the free energy, ensemble averages, and response properties. Formally, we derive an approach for learning high-dimensional cumulant generating functions using statistical estimates of their derivatives, which are observable cumulants of the underlying random variable. 
The proposed formalism opens ways to resolve several outstanding challenges in bottom-up molecular coarse graining dealing with multiple minima and state dependence. This is realized by using additional differential relationships in the loss function to significantly improve the learning of free energies, while exactly preserving the Boltzmann distribution governing the corresponding fine-grain all-atom system. As an example, we go beyond the standard force-matching procedure to demonstrate how leveraging the thermodynamic relationship between free energy and values of ensemble averaged all-atom potential energy improves the learning efficiency and accuracy of the free energy model. The result is significantly better sampling statistics of structural distribution functions. The theoretical framework presented here is demonstrated via implementations in both kernel-based and neural network machine learning regression methods and opens new ways to train accurate machine learning models for studying thermodynamic and response properties of complex molecular systems.</description><identifier>DOI: 10.48550/arxiv.2405.19386</identifier><language>eng</language><subject>Physics - Chemical Physics ; Physics - Computational Physics</subject><creationdate>2024-05</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2405.19386$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2405.19386$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Duschatko, Blake R</creatorcontrib><creatorcontrib>Fu, 
Xiang</creatorcontrib><creatorcontrib>Owen, Cameron</creatorcontrib><creatorcontrib>Xie, Yu</creatorcontrib><creatorcontrib>Musaelian, Albert</creatorcontrib><creatorcontrib>Jaakkola, Tommi</creatorcontrib><creatorcontrib>Kozinsky, Boris</creatorcontrib><title>Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining</title><description>We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on temperature and parameters and by exploiting exact differential thermodynamic relationships between the free energy, ensemble averages, and response properties. Formally, we derive an approach for learning high-dimensional cumulant generating functions using statistical estimates of their derivatives, which are observable cumulants of the underlying random variable. The proposed formalism opens ways to resolve several outstanding challenges in bottom-up molecular coarse graining dealing with multiple minima and state dependence. This is realized by using additional differential relationships in the loss function to significantly improve the learning of free energies, while exactly preserving the Boltzmann distribution governing the corresponding fine-grain all-atom system. As an example, we go beyond the standard force-matching procedure to demonstrate how leveraging the thermodynamic relationship between free energy and values of ensemble averaged all-atom potential energy improves the learning efficiency and accuracy of the free energy model. The result is significantly better sampling statistics of structural distribution functions. 
The theoretical framework presented here is demonstrated via implementations in both kernel-based and neural network machine learning regression methods and opens new ways to train accurate machine learning models for studying thermodynamic and response properties of complex molecular systems.</description><subject>Physics - Chemical Physics</subject><subject>Physics - Computational Physics</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj81OhDAUhdm4MKMP4Mq-AFgoFFganL-EyWzYk1t6yzQprbk4Rt5eZnR1TnJOvuSLopeUJ3lVFPwN6Md-J1nOiyStRSUfI-ouSFPQi4fJDuDcwo7eBJpQs9PVfdl1A8daBPLWjywYdrDjJf6wE_rZBr-OO0JkW480LuwUNLqZWb82h8PVAbEmAM3I9gT2xniKHgy4GZ__cxN1u23XHOL2vD82720MspRxDSYfRGZSqaSsdVUYrIecayWNVEUG1VBJA6tDLbgSojSZVjwDnqIq16cRm-j1D3uX7j_JTkBLf5Pv7_LiFyB_VvQ</recordid><startdate>20240529</startdate><enddate>20240529</enddate><creator>Duschatko, Blake R</creator><creator>Fu, Xiang</creator><creator>Owen, Cameron</creator><creator>Xie, Yu</creator><creator>Musaelian, Albert</creator><creator>Jaakkola, Tommi</creator><creator>Kozinsky, Boris</creator><scope>GOX</scope></search><sort><creationdate>20240529</creationdate><title>Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining</title><author>Duschatko, Blake R ; Fu, Xiang ; Owen, Cameron ; Xie, Yu ; Musaelian, Albert ; Jaakkola, Tommi ; Kozinsky, Boris</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a676-9af4c32f16b669d85fe9c40db6f6b52a8c86fa386930b337f2db02a01eb7e9cf3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Physics - Chemical Physics</topic><topic>Physics - Computational Physics</topic><toplevel>online_resources</toplevel><creatorcontrib>Duschatko, Blake R</creatorcontrib><creatorcontrib>Fu, 
Xiang</creatorcontrib><creatorcontrib>Owen, Cameron</creatorcontrib><creatorcontrib>Xie, Yu</creatorcontrib><creatorcontrib>Musaelian, Albert</creatorcontrib><creatorcontrib>Jaakkola, Tommi</creatorcontrib><creatorcontrib>Kozinsky, Boris</creatorcontrib><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Duschatko, Blake R</au><au>Fu, Xiang</au><au>Owen, Cameron</au><au>Xie, Yu</au><au>Musaelian, Albert</au><au>Jaakkola, Tommi</au><au>Kozinsky, Boris</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining</atitle><date>2024-05-29</date><risdate>2024</risdate><abstract>We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on temperature and parameters and by exploiting exact differential thermodynamic relationships between the free energy, ensemble averages, and response properties. Formally, we derive an approach for learning high-dimensional cumulant generating functions using statistical estimates of their derivatives, which are observable cumulants of the underlying random variable. The proposed formalism opens ways to resolve several outstanding challenges in bottom-up molecular coarse graining dealing with multiple minima and state dependence. This is realized by using additional differential relationships in the loss function to significantly improve the learning of free energies, while exactly preserving the Boltzmann distribution governing the corresponding fine-grain all-atom system. 
As an example, we go beyond the standard force-matching procedure to demonstrate how leveraging the thermodynamic relationship between free energy and values of ensemble averaged all-atom potential energy improves the learning efficiency and accuracy of the free energy model. The result is significantly better sampling statistics of structural distribution functions. The theoretical framework presented here is demonstrated via implementations in both kernel-based and neural network machine learning regression methods and opens new ways to train accurate machine learning models for studying thermodynamic and response properties of complex molecular systems.</abstract><doi>10.48550/arxiv.2405.19386</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2405.19386
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2405_19386
source	arXiv.org
subjects	Physics - Chemical Physics; Physics - Computational Physics
title	Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T15%3A58%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Thermodynamically%20Informed%20Multimodal%20Learning%20of%20High-Dimensional%20Free%20Energy%20Models%20in%20Molecular%20Coarse%20Graining&rft.au=Duschatko,%20Blake%20R&rft.date=2024-05-29&rft_id=info:doi/10.48550/arxiv.2405.19386&rft_dat=%3Carxiv_GOX%3E2405_19386%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true