Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining

We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on te...

Bibliographic details
Main authors: Duschatko, Blake R; Fu, Xiang; Owen, Cameron; Xie, Yu; Musaelian, Albert; Jaakkola, Tommi; Kozinsky, Boris
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Duschatko, Blake R
Fu, Xiang
Owen, Cameron
Xie, Yu
Musaelian, Albert
Jaakkola, Tommi
Kozinsky, Boris
description We present a differentiable formalism for learning free energies that is capable of capturing arbitrarily complex model dependencies on coarse-grained coordinates and finite-temperature response to variation of general system parameters. This is done by endowing models with explicit dependence on temperature and parameters and by exploiting exact differential thermodynamic relationships between the free energy, ensemble averages, and response properties. Formally, we derive an approach for learning high-dimensional cumulant generating functions using statistical estimates of their derivatives, which are observable cumulants of the underlying random variable. The proposed formalism opens ways to resolve several outstanding challenges in bottom-up molecular coarse graining dealing with multiple minima and state dependence. This is realized by using additional differential relationships in the loss function to significantly improve the learning of free energies, while exactly preserving the Boltzmann distribution governing the corresponding fine-grain all-atom system. As an example, we go beyond the standard force-matching procedure to demonstrate how leveraging the thermodynamic relationship between free energy and values of ensemble averaged all-atom potential energy improves the learning efficiency and accuracy of the free energy model. The result is significantly better sampling statistics of structural distribution functions. The theoretical framework presented here is demonstrated via implementations in both kernel-based and neural network machine learning regression methods and opens new ways to train accurate machine learning models for studying thermodynamic and response properties of complex molecular systems.
doi_str_mv 10.48550/arxiv.2405.19386
format Article
sourceid arxiv_GOX
genre article
ristype JOUR
date 2024-05-29
rights http://creativecommons.org/licenses/by/4.0
oa free_for_read
delcategory Remote Search Resource
linktorsrc https://arxiv.org/abs/2405.19386
backlink https://doi.org/10.48550/arXiv.2405.19386
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2405.19386
ispartof
issn
language eng
recordid cdi_arxiv_primary_2405_19386
source arXiv.org
subjects Physics - Chemical Physics
Physics - Computational Physics
title Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T15%3A58%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Thermodynamically%20Informed%20Multimodal%20Learning%20of%20High-Dimensional%20Free%20Energy%20Models%20in%20Molecular%20Coarse%20Graining&rft.au=Duschatko,%20Blake%20R&rft.date=2024-05-29&rft_id=info:doi/10.48550/arxiv.2405.19386&rft_dat=%3Carxiv_GOX%3E2405_19386%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
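
The abstract in the description field refers to the thermodynamic relationship between the free energy and the ensemble-averaged potential energy, and to cumulants as derivatives of a cumulant generating function. As a minimal sketch of that textbook identity only (not the authors' implementation), the Python below uses a hypothetical four-level system, with made-up `ENERGIES` and `BETA` values, and checks numerically that d(βF)/dβ equals the mean energy and that d²(βF)/dβ² equals minus the energy variance, where βF = -ln Z. All function names are assumptions introduced for illustration.

```python
import math


def beta_free_energy(beta, energies):
    """Return beta * F = -ln Z for a discrete set of energy levels."""
    # Shift by the largest exponent (log-sum-exp) for numerical stability.
    shift = max(-beta * e for e in energies)
    z_shifted = sum(math.exp(-beta * e - shift) for e in energies)
    return -(shift + math.log(z_shifted))


def mean_energy(beta, energies):
    """Boltzmann-weighted average energy <U> (first cumulant)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / z


def energy_variance(beta, energies):
    """Boltzmann-weighted variance of the energy (second cumulant)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    mean = sum(w * e for w, e in zip(weights, energies)) / z
    return sum(w * (e - mean) ** 2 for w, e in zip(weights, energies)) / z


def central_difference(f, x, h=1e-4):
    """Second-order finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_difference(f, x, h=1e-3):
    """Second-order finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


# Hypothetical toy system: four energy levels in arbitrary units.
ENERGIES = [0.0, 0.5, 1.3, 2.0]
BETA = 1.7


def bf(beta):
    """beta * F as a function of beta alone, for the toy system."""
    return beta_free_energy(beta, ENERGIES)


# d(beta F)/d beta should match <U>.
first_derivative = central_difference(bf, BETA)
# d^2(beta F)/d beta^2 should match -Var(U).
second_derivative = second_difference(bf, BETA)

if __name__ == "__main__":
    print("d(bF)/db  :", first_derivative, " <U>    :", mean_energy(BETA, ENERGIES))
    print("d2(bF)/db2:", second_derivative, " -Var(U):", -energy_variance(BETA, ENERGIES))
```

In a learning setting, identities of this kind would let statistical estimates of the mean energy act as extra training targets for the derivative of a free energy model with respect to β, alongside force targets. How the paper weights or combines such terms is not stated in this record.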