GROMACS in the cloud: A global supercomputer to speed up alchemical drug design

We assess costs and efficiency of state-of-the-art high-performance cloud computing compared to a traditional on-premises compute cluster. Our use case is atomistic simulations carried out with the GROMACS molecular dynamics (MD) toolkit, with a focus on alchemical protein-ligand binding free energy...

Bibliographic details
Published in: arXiv.org 2022-05
Main authors: Kutzner, Carsten, Kniep, Christian, Cherian, Austin, Nordstrom, Ludvig, Grubmüller, Helmut, de Groot, Bert L, Gapsys, Vytautas
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Kutzner, Carsten
Kniep, Christian
Cherian, Austin
Nordstrom, Ludvig
Grubmüller, Helmut
de Groot, Bert L
Gapsys, Vytautas
description We assess costs and efficiency of state-of-the-art high-performance cloud computing compared to a traditional on-premises compute cluster. Our use case is atomistic simulations carried out with the GROMACS molecular dynamics (MD) toolkit, with a focus on alchemical protein-ligand binding free energy calculations. We set up a compute cluster in the Amazon Web Services (AWS) cloud that incorporates a variety of instances with Intel, AMD, and ARM CPUs, some with GPU acceleration. Using representative biomolecular simulation systems, we benchmark how GROMACS performs on individual instances and across multiple instances. From these benchmarks we assess which instances deliver the highest performance and which are the most cost-efficient for our use case. We find that, in terms of total costs including hardware, personnel, room, energy, and cooling, producing MD trajectories in the cloud can be as cost-efficient as on an on-premises cluster, provided that optimal cloud instances are chosen. Further, we find that high-throughput ligand screening for protein-ligand binding affinity estimation can be accelerated dramatically by using global cloud resources. For a ligand screening study consisting of 19,872 independent simulations, we used all hardware that was available in the cloud at the time of the study. The computations scaled up to reach peak performance using more than 4,000 instances, 140,000 cores, and 3,000 GPUs simultaneously around the globe. Our simulation ensemble finished in about two days in the cloud, while weeks would be required to complete the task on a typical on-premises cluster consisting of several hundred nodes. We demonstrate that the costs of this and similar studies can be drastically reduced with a checkpoint-restart protocol that allows the use of cheap Spot pricing, and by using instance types with optimal cost-efficiency.
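The cost-efficiency comparison described in the abstract can be illustrated with a small piece of arithmetic: given a benchmarked performance in ns/day and an hourly price, the cost of producing one microsecond of trajectory follows directly. The sketch below is only an illustration; the instance labels, performance figures, and prices are hypothetical placeholders and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceOffer:
    """One way of renting compute; all values used below are hypothetical."""
    name: str            # illustrative label, not a real instance type
    ns_per_day: float    # benchmarked MD performance
    usd_per_hour: float  # hourly price (on-demand or Spot)


def usd_per_microsecond(offer: InstanceOffer) -> float:
    """Cost in USD of producing 1 microsecond (1000 ns) of trajectory."""
    days_needed = 1000.0 / offer.ns_per_day
    return offer.usd_per_hour * 24.0 * days_needed


def rank_by_cost(offers):
    """Return the offers sorted from most to least cost-efficient."""
    return sorted(offers, key=usd_per_microsecond)


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
offers = [
    InstanceOffer("cpu-only, on-demand", ns_per_day=50.0, usd_per_hour=2.0),
    InstanceOffer("gpu, on-demand", ns_per_day=200.0, usd_per_hour=4.0),
    InstanceOffer("gpu, spot", ns_per_day=200.0, usd_per_hour=1.0),
]

if __name__ == "__main__":
    for offer in rank_by_cost(offers):
        print(f"{offer.name}: {usd_per_microsecond(offer):.0f} USD per microsecond")
```

With these placeholder values, the same hardware at a lower Spot price ranks first, which matches the abstract's point that the choice of pricing model and instance type both affect the total cost; the actual figures depend on the benchmarks and prices reported in the paper.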
doi_str_mv 10.48550/arxiv.2201.06372
format Article
publisher Ithaca: Cornell University Library, arXiv.org
creationdate 2022-05-13
rights 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa free_for_read
backlink https://doi.org/10.48550/arXiv.2201.06372 (View paper in arXiv)
backlink https://doi.org/10.1021/acs.jcim.2c00044 (View published paper; access to full text may be restricted)
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2022-05
issn 2331-8422
language eng
recordid cdi_arxiv_primary_2201_06372
source arXiv.org; Free E- Journals
subjects Acceleration
Binding
Cloud computing
Clusters
Computer Science - Distributed, Parallel, and Cluster Computing
Costs
Free energy
Graphics processing units
Hardware
Ligands
Molecular dynamics
Physics - Biological Physics
Physics - Computational Physics
Proteins
Quantitative Biology - Biomolecules
Screening
Simulation
Supercomputers
Web services
title GROMACS in the cloud: A global supercomputer to speed up alchemical drug design
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T01%3A59%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=GROMACS%20in%20the%20cloud:%20A%20global%20supercomputer%20to%20speed%20up%20alchemical%20drug%20design&rft.jtitle=arXiv.org&rft.au=Kutzner,%20Carsten&rft.date=2022-05-13&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2201.06372&rft_dat=%3Cproquest_arxiv%3E2621116233%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2621116233&rft_id=info:pmid/&rfr_iscdi=true