LLload: An Easy-to-Use HPC Utilization Tool

The increasing use and cost of high performance computing (HPC) requires new easy-to-use tools to enable HPC users and HPC systems engineers to transparently understand the utilization of resources. The MIT Lincoln Laboratory Supercomputing Center (LLSC) has developed a simple command, LLload, to mo...

Bibliographic details
Main authors: Byun, Chansup, Reuther, Albert, Mullen, Julie, Anderson, LaToya, Arcand, William, Bergeron, Bill, Bestor, David, Bonn, Alexander, Burrill, Daniel, Gadepally, Vijay, Houle, Michael, Hubbell, Matthew, Jananthan, Hayden, Jones, Michael, Luszczek, Piotr, Michaleas, Peter, Milechin, Lauren, Morales, Guillermo, Prout, Andrew, Rosa, Antonio, Yee, Charles, Kepner, Jeremy
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Byun, Chansup
Reuther, Albert
Mullen, Julie
Anderson, LaToya
Arcand, William
Bergeron, Bill
Bestor, David
Bonn, Alexander
Burrill, Daniel
Gadepally, Vijay
Houle, Michael
Hubbell, Matthew
Jananthan, Hayden
Jones, Michael
Luszczek, Piotr
Michaleas, Peter
Milechin, Lauren
Morales, Guillermo
Prout, Andrew
Rosa, Antonio
Yee, Charles
Kepner, Jeremy
description The increasing use and cost of high performance computing (HPC) requires new easy-to-use tools to enable HPC users and HPC systems engineers to transparently understand the utilization of resources. The MIT Lincoln Laboratory Supercomputing Center (LLSC) has developed a simple command, LLload, to monitor and characterize HPC workloads. LLload plays an important role in identifying opportunities for better utilization of compute resources. LLload can be used to monitor jobs both programmatically and interactively. LLload can characterize users' jobs using various LLload options to achieve better efficiency. This information can be used to inform the user to optimize HPC workloads and improve both CPU and GPU utilization. This includes improvements using judicious oversubscription of the computing resources. Preliminary results suggest significant improvement in GPU utilization and overall throughput performance with GPU overloading in some cases. By enabling users to observe and fix incorrect job submission and/or inappropriate execution setups, LLload can increase the resource usage and improve the overall throughput performance. LLload is a light-weight, easy-to-use tool for both HPC users and HPC systems engineers to monitor HPC workloads to improve system utilization and efficiency.
doi_str_mv 10.48550/arxiv.2410.21036
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2410_21036</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2410_21036</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2410_210363</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMgEKGBkaGJtxMmj7-OTkJ6ZYKTjmKbgmFlfqluTrhhanKngEOCuElmTmZFYllmTm5ymE5Ofn8DCwpiXmFKfyQmluBnk31xBnD12wsfEFRZm5iUWV8SDj48HGGxNWAQCSAS1q</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>LLload: An Easy-to-Use HPC Utilization Tool</title><source>arXiv.org</source><creator>Byun, Chansup ; Reuther, Albert ; Mullen, Julie ; Anderson, LaToya ; Arcand, William ; Bergeron, Bill ; Bestor, David ; Bonn, Alexander ; Burrill, Daniel ; Gadepally, Vijay ; Houle, Michael ; Hubbell, Matthew ; Jananthan, Hayden ; Jones, Michael ; Luszczek, Piotr ; Michaleas, Peter ; Milechin, Lauren ; Morales, Guillermo ; Prout, Andrew ; Rosa, Antonio ; Yee, Charles ; Kepner, Jeremy</creator><creatorcontrib>Byun, Chansup ; Reuther, Albert ; Mullen, Julie ; Anderson, LaToya ; Arcand, William ; Bergeron, Bill ; Bestor, David ; Bonn, Alexander ; Burrill, Daniel ; Gadepally, Vijay ; Houle, Michael ; Hubbell, Matthew ; Jananthan, Hayden ; Jones, Michael ; Luszczek, Piotr ; Michaleas, Peter ; Milechin, Lauren ; Morales, Guillermo ; Prout, Andrew ; Rosa, Antonio ; Yee, Charles ; Kepner, Jeremy</creatorcontrib><description>The increasing use and cost of high performance computing (HPC) requires new easy-to-use tools to enable HPC users and HPC systems engineers to transparently understand the utilization of resources. The MIT Lincoln Laboratory Supercomputing Center (LLSC) has developed a simple command, LLload, to monitor and characterize HPC workloads. LLload plays an important role in identifying opportunities for better utilization of compute resources. 
LLload can be used to monitor jobs both programmatically and interactively. LLload can characterize users' jobs using various LLload options to achieve better efficiency. This information can be used to inform the user to optimize HPC workloads and improve both CPU and GPU utilization. This includes improvements using judicious oversubscription of the computing resources. Preliminary results suggest significant improvement in GPU utilization and overall throughput performance with GPU overloading in some cases. By enabling users to observe and fix incorrect job submission and/or inappropriate execution setups, LLload can increase the resource usage and improve the overall throughput performance. LLload is a light-weight, easy-to-use tool for both HPC users and HPC systems engineers to monitor HPC workloads to improve system utilization and efficiency.</description><identifier>DOI: 10.48550/arxiv.2410.21036</identifier><language>eng</language><subject>Computer Science - Performance</subject><creationdate>2024-10</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2410.21036$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2410.21036$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Byun, Chansup</creatorcontrib><creatorcontrib>Reuther, Albert</creatorcontrib><creatorcontrib>Mullen, Julie</creatorcontrib><creatorcontrib>Anderson, LaToya</creatorcontrib><creatorcontrib>Arcand, William</creatorcontrib><creatorcontrib>Bergeron, Bill</creatorcontrib><creatorcontrib>Bestor, 
David</creatorcontrib><creatorcontrib>Bonn, Alexander</creatorcontrib><creatorcontrib>Burrill, Daniel</creatorcontrib><creatorcontrib>Gadepally, Vijay</creatorcontrib><creatorcontrib>Houle, Michael</creatorcontrib><creatorcontrib>Hubbell, Matthew</creatorcontrib><creatorcontrib>Jananthan, Hayden</creatorcontrib><creatorcontrib>Jones, Michael</creatorcontrib><creatorcontrib>Luszczek, Piotr</creatorcontrib><creatorcontrib>Michaleas, Peter</creatorcontrib><creatorcontrib>Milechin, Lauren</creatorcontrib><creatorcontrib>Morales, Guillermo</creatorcontrib><creatorcontrib>Prout, Andrew</creatorcontrib><creatorcontrib>Rosa, Antonio</creatorcontrib><creatorcontrib>Yee, Charles</creatorcontrib><creatorcontrib>Kepner, Jeremy</creatorcontrib><title>LLload: An Easy-to-Use HPC Utilization Tool</title><description>The increasing use and cost of high performance computing (HPC) requires new easy-to-use tools to enable HPC users and HPC systems engineers to transparently understand the utilization of resources. The MIT Lincoln Laboratory Supercomputing Center (LLSC) has developed a simple command, LLload, to monitor and characterize HPC workloads. LLload plays an important role in identifying opportunities for better utilization of compute resources. LLload can be used to monitor jobs both programmatically and interactively. LLload can characterize users' jobs using various LLload options to achieve better efficiency. This information can be used to inform the user to optimize HPC workloads and improve both CPU and GPU utilization. This includes improvements using judicious oversubscription of the computing resources. Preliminary results suggest significant improvement in GPU utilization and overall throughput performance with GPU overloading in some cases. By enabling users to observe and fix incorrect job submission and/or inappropriate execution setups, LLload can increase the resource usage and improve the overall throughput performance. 
LLload is a light-weight, easy-to-use tool for both HPC users and HPC systems engineers to monitor HPC workloads to improve system utilization and efficiency.</description><subject>Computer Science - Performance</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMgEKGBkaGJtxMmj7-OTkJ6ZYKTjmKbgmFlfqluTrhhanKngEOCuElmTmZFYllmTm5ymE5Ofn8DCwpiXmFKfyQmluBnk31xBnD12wsfEFRZm5iUWV8SDj48HGGxNWAQCSAS1q</recordid><startdate>20241028</startdate><enddate>20241028</enddate><creator>Byun, Chansup</creator><creator>Reuther, Albert</creator><creator>Mullen, Julie</creator><creator>Anderson, LaToya</creator><creator>Arcand, William</creator><creator>Bergeron, Bill</creator><creator>Bestor, David</creator><creator>Bonn, Alexander</creator><creator>Burrill, Daniel</creator><creator>Gadepally, Vijay</creator><creator>Houle, Michael</creator><creator>Hubbell, Matthew</creator><creator>Jananthan, Hayden</creator><creator>Jones, Michael</creator><creator>Luszczek, Piotr</creator><creator>Michaleas, Peter</creator><creator>Milechin, Lauren</creator><creator>Morales, Guillermo</creator><creator>Prout, Andrew</creator><creator>Rosa, Antonio</creator><creator>Yee, Charles</creator><creator>Kepner, Jeremy</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241028</creationdate><title>LLload: An Easy-to-Use HPC Utilization Tool</title><author>Byun, Chansup ; Reuther, Albert ; Mullen, Julie ; Anderson, LaToya ; Arcand, William ; Bergeron, Bill ; Bestor, David ; Bonn, Alexander ; Burrill, Daniel ; Gadepally, Vijay ; Houle, Michael ; Hubbell, Matthew ; Jananthan, Hayden ; Jones, Michael ; Luszczek, Piotr ; Michaleas, Peter ; Milechin, Lauren ; Morales, Guillermo ; Prout, Andrew ; Rosa, Antonio ; Yee, Charles ; Kepner, 
Jeremy</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2410_210363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Performance</topic><toplevel>online_resources</toplevel><creatorcontrib>Byun, Chansup</creatorcontrib><creatorcontrib>Reuther, Albert</creatorcontrib><creatorcontrib>Mullen, Julie</creatorcontrib><creatorcontrib>Anderson, LaToya</creatorcontrib><creatorcontrib>Arcand, William</creatorcontrib><creatorcontrib>Bergeron, Bill</creatorcontrib><creatorcontrib>Bestor, David</creatorcontrib><creatorcontrib>Bonn, Alexander</creatorcontrib><creatorcontrib>Burrill, Daniel</creatorcontrib><creatorcontrib>Gadepally, Vijay</creatorcontrib><creatorcontrib>Houle, Michael</creatorcontrib><creatorcontrib>Hubbell, Matthew</creatorcontrib><creatorcontrib>Jananthan, Hayden</creatorcontrib><creatorcontrib>Jones, Michael</creatorcontrib><creatorcontrib>Luszczek, Piotr</creatorcontrib><creatorcontrib>Michaleas, Peter</creatorcontrib><creatorcontrib>Milechin, Lauren</creatorcontrib><creatorcontrib>Morales, Guillermo</creatorcontrib><creatorcontrib>Prout, Andrew</creatorcontrib><creatorcontrib>Rosa, Antonio</creatorcontrib><creatorcontrib>Yee, Charles</creatorcontrib><creatorcontrib>Kepner, Jeremy</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Byun, Chansup</au><au>Reuther, Albert</au><au>Mullen, Julie</au><au>Anderson, LaToya</au><au>Arcand, William</au><au>Bergeron, Bill</au><au>Bestor, David</au><au>Bonn, Alexander</au><au>Burrill, Daniel</au><au>Gadepally, Vijay</au><au>Houle, Michael</au><au>Hubbell, Matthew</au><au>Jananthan, Hayden</au><au>Jones, Michael</au><au>Luszczek, Piotr</au><au>Michaleas, Peter</au><au>Milechin, Lauren</au><au>Morales, 
Guillermo</au><au>Prout, Andrew</au><au>Rosa, Antonio</au><au>Yee, Charles</au><au>Kepner, Jeremy</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>LLload: An Easy-to-Use HPC Utilization Tool</atitle><date>2024-10-28</date><risdate>2024</risdate><abstract>The increasing use and cost of high performance computing (HPC) requires new easy-to-use tools to enable HPC users and HPC systems engineers to transparently understand the utilization of resources. The MIT Lincoln Laboratory Supercomputing Center (LLSC) has developed a simple command, LLload, to monitor and characterize HPC workloads. LLload plays an important role in identifying opportunities for better utilization of compute resources. LLload can be used to monitor jobs both programmatically and interactively. LLload can characterize users' jobs using various LLload options to achieve better efficiency. This information can be used to inform the user to optimize HPC workloads and improve both CPU and GPU utilization. This includes improvements using judicious oversubscription of the computing resources. Preliminary results suggest significant improvement in GPU utilization and overall throughput performance with GPU overloading in some cases. By enabling users to observe and fix incorrect job submission and/or inappropriate execution setups, LLload can increase the resource usage and improve the overall throughput performance. LLload is a light-weight, easy-to-use tool for both HPC users and HPC systems engineers to monitor HPC workloads to improve system utilization and efficiency.</abstract><doi>10.48550/arxiv.2410.21036</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2410.21036
ispartof
issn
language eng
recordid cdi_arxiv_primary_2410_21036
source arXiv.org
subjects Computer Science - Performance
title LLload: An Easy-to-Use HPC Utilization Tool
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T21%3A21%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=LLload:%20An%20Easy-to-Use%20HPC%20Utilization%20Tool&rft.au=Byun,%20Chansup&rft.date=2024-10-28&rft_id=info:doi/10.48550/arxiv.2410.21036&rft_dat=%3Carxiv_GOX%3E2410_21036%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true