Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations
In cutting-edge CPU/GPU hybrid clusters, such as Tianhe-1A, the aggregate CPU computing capability may amount to up to 1/3 of the aggregate GPU computing capability. It thus goes without saying that the CPUs and GPUs should jointly carry out the computational work. However, to effectively and simult...
Main authors: | Mei Wen, Huayou Su, Wenjie Wei, Nan Wu, Xing Cai, Chunyuan Zhang |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
container_end_page | 35 |
---|---|
container_issue | |
container_start_page | 27 |
container_title | |
container_volume | |
creator | Mei Wen ; Huayou Su ; Wenjie Wei ; Nan Wu ; Xing Cai ; Chunyuan Zhang |
description | In cutting-edge CPU/GPU hybrid clusters, such as Tianhe-1A, the aggregate CPU computing capability can reach up to one third of the aggregate GPU computing capability. The CPUs and GPUs should therefore carry out the computational work jointly. However, using both hardware components effectively and simultaneously requires great care when developing the parallel implementations. The challenges include (1) finding a balanced division of the workload between the CPU and GPU sides, and (2) hiding various overheads by overlapping computations with CPU-GPU data transfers and/or MPI communications. We study these issues in the context of real-world sedimentary basin simulations. Numerical experiments show that an appropriately devised CPU-GPU hybrid implementation can handle a global mesh resolution of 131,072 × 131,072, and that a double-precision rate of 62 TFlops is achieved using 1024 GPUs and 12288 CPU cores on Tianhe-1A. Such an extreme computing capability will be of great importance for carrying out high-resolution, continental-scale stratigraphic simulations in the future. |
doi_str_mv | 10.1109/CLUSTER.2012.37 |
format | Conference Proceeding |
isbn | 9781467324229 ; 1467324221 |
eisbn | 9780769548074 ; 0769548075 |
coden | IEEPAD |
publisher | IEEE |
date | 2012-09 |
tpages | 9 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1552-5244 |
ispartof | 2012 IEEE International Conference on Cluster Computing, 2012, p.27-35 |
issn | 1552-5244 2168-9253 |
language | eng |
recordid | cdi_ieee_primary_6337853 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Computational modeling ; CPU-GPU hybrid computing ; Dual-lithology sedimentary basin simulation ; Graphics processing unit ; Hardware ; Instruction sets ; Kernel ; Mathematical model ; MPI/OpenMP/CUDA ; Multicore processing ; Tianhe-1A Hunan Solution |
title | Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T01%3A48%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Using%201000+%20GPUs%20and%2010000+%20CPUs%20for%20Sedimentary%20Basin%20Simulations&rft.btitle=2012%20IEEE%20International%20Conference%20on%20Cluster%20Computing&rft.au=Mei%20Wen&rft.date=2012-09&rft.spage=27&rft.epage=35&rft.pages=27-35&rft.issn=1552-5244&rft.eissn=2168-9253&rft.isbn=9781467324229&rft.isbn_list=1467324221&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CLUSTER.2012.37&rft_dat=%3Cieee_6IE%3E6337853%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=0769548075&rft.eisbn_list=9780769548074&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6337853&rfr_iscdi=true |
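
The description names a balanced division of the workload between the CPU and GPU sides as the first challenge. This record does not describe how the authors divide the work, so the following is only a minimal sketch of one common approach: splitting mesh rows in proportion to throughput. The function `split_rows`, its parameter `cpu_to_gpu_ratio`, and the 4096-row subdomain are hypothetical and are not taken from the paper. Only the 1/3 ratio and the GPU and core counts come from the description.

```python
from __future__ import annotations


def split_rows(total_rows: int, cpu_to_gpu_ratio: float) -> tuple[int, int]:
    """Split mesh rows between CPU and GPU in proportion to throughput.

    cpu_to_gpu_ratio is CPU throughput divided by GPU throughput.
    This is a hypothetical parameter; 1/3 is the upper bound quoted
    in the description, not a measured value.
    """
    if total_rows < 0 or cpu_to_gpu_ratio < 0:
        raise ValueError("total_rows and cpu_to_gpu_ratio must be non-negative")
    # If the CPU side is r times as fast as the GPU side, both finish at
    # the same time when the CPU side gets r / (1 + r) of the rows.
    cpu_share = cpu_to_gpu_ratio / (1.0 + cpu_to_gpu_ratio)
    cpu_rows = round(total_rows * cpu_share)
    return cpu_rows, total_rows - cpu_rows


# Hypothetical 4096-row subdomain with the 1/3 ratio from the description.
cpu_rows, gpu_rows = split_rows(4096, 1 / 3)
print(cpu_rows, gpu_rows)

# CPU cores available per GPU in the reported run (12288 cores, 1024 GPUs).
cores_per_gpu = 12288 // 1024
print(cores_per_gpu)
```

With a ratio of 1/3 the CPU side receives one quarter of the rows, because r / (1 + r) = (1/3) / (4/3) = 1/4. A static split like this ignores the data-transfer and MPI overheads that the description lists as the second challenge, so a real implementation would likely need to tune the ratio empirically.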