High Performance Multivariate Geospatial Statistics on Manycore Systems

Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique. The latter requires the evaluation of the ex...

Detailed description

Bibliographic details
Published in: IEEE transactions on parallel and distributed systems 2021-11, Vol.32 (11), p.2719-2733
Main authors: Salvana, Mary Lai O., Abdulah, Sameh, Huang, Huang, Ltaief, Hatem, Sun, Ying, Genton, Marc G., Keyes, David E.
Format: Article
Language: eng
container_end_page 2733
container_issue 11
container_start_page 2719
container_title IEEE transactions on parallel and distributed systems
container_volume 32
creator Salvana, Mary Lai O.
Abdulah, Sameh
Huang, Huang
Ltaief, Hatem
Sun, Ying
Genton, Marc G.
Keyes, David E.
description Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique. The latter requires the evaluation of the expensive Gaussian log-likelihood function, which has impeded the adoption of multivariate geospatial models for large multivariate spatial datasets. However, this large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community as it is paramount to scale computational capability to match the growth in environmental data coming from the widespread use of different data collection technologies. In this article, we develop and deploy large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both the dense and the approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation. The algorithm quantifies prediction performance and provides a benchmark for measuring the efficiency and accuracy of several approximation techniques in multivariate spatial modeling.
doi_str_mv 10.1109/TPDS.2021.3071423
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2541467845</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9397281</ieee_id><sourcerecordid>2541467845</sourcerecordid><originalsourceid>FETCH-LOGICAL-c336t-6bce2056756340d5c8b4b340338388d7297a4fe3c3c12f073c56394ee46411503</originalsourceid><addsrcrecordid>eNo9kEFPAjEQhRujiYj-AONlE8-LnU677R4NKphAJAHPTSmzugR2sS0m_HuXQDy9d_jeTPIxdg98AMDLp8XsZT4QXMAAuQYp8IL1QCmTCzB42XUuVV4KKK_ZTYxrzkEqLntsNK6_vrMZhaoNW9d4yqb7Tap_XahdomxEbdy5VLtNNk9dxlT7mLVNNnXNwbeBsvkhJtrGW3ZVuU2ku3P22efb62I4zicfo_fh8yT3iEXKi6UnwVWhVYGSr5Q3S7nsGqJBY1ZalNrJitCjB1Fxjb4DS0kkCwmgOPbZ4-nuLrQ_e4rJrtt9aLqXVigJstBGqo6CE-VDG2Ogyu5CvXXhYIHboy979GWPvuzZV7d5OG1qIvrnSyy1MIB_eThlRA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2541467845</pqid></control><display><type>article</type><title>High Performance Multivariate Geospatial Statistics on Manycore Systems</title><source>IEEE/IET Electronic Library</source><creator>Salvana, Mary Lai O. ; Abdulah, Sameh ; Huang, Huang ; Ltaief, Hatem ; Sun, Ying ; Genton, Marc G. ; Keyes, David E.</creator><creatorcontrib>Salvana, Mary Lai O. ; Abdulah, Sameh ; Huang, Huang ; Ltaief, Hatem ; Sun, Ying ; Genton, Marc G. ; Keyes, David E.</creatorcontrib><description>Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique. The latter requires the evaluation of the expensive Gaussian log-likelihood function, which has impeded the adoption of multivariate geospatial models for large multivariate spatial datasets. 
However, this large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community as it is paramount to scale computational capability to match the growth in environmental data coming from the widespread use of different data collection technologies. In this article, we develop and deploy large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both the dense and the approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation. 
The algorithm quantifies prediction performance and provides a benchmark for measuring the efficiency and accuracy of several approximation techniques in multivariate spatial modeling.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2021.3071423</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Accuracy ; Algorithms ; Approximation ; Computational modeling ; Concurrency ; Data collection ; Datasets ; Gaussian log-likelihood ; Geospatial analysis ; geospatial statistics ; Graphics processing units ; high-performance computing ; large multivariate spatial data ; low-rank approximation ; Mathematical model ; Mathematical models ; Meteorology ; Multivariate analysis ; multivariate modeling/prediction ; Numerical models ; Parameter estimation ; Predictive models ; Robustness (mathematics) ; Schedules ; Task scheduling</subject><ispartof>IEEE transactions on parallel and distributed systems, 2021-11, Vol.32 (11), p.2719-2733</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c336t-6bce2056756340d5c8b4b340338388d7297a4fe3c3c12f073c56394ee46411503</citedby><cites>FETCH-LOGICAL-c336t-6bce2056756340d5c8b4b340338388d7297a4fe3c3c12f073c56394ee46411503</cites><orcidid>0000-0001-6467-2998 ; 0000-0002-8850-5753 ; 0000-0002-4052-7224 ; 0000-0002-5950-4698 ; 0000-0002-6897-1095 ; 0000-0003-4868-7713 ; 0000-0001-6703-4270</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9397281$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9397281$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Salvana, Mary Lai O.</creatorcontrib><creatorcontrib>Abdulah, Sameh</creatorcontrib><creatorcontrib>Huang, Huang</creatorcontrib><creatorcontrib>Ltaief, Hatem</creatorcontrib><creatorcontrib>Sun, Ying</creatorcontrib><creatorcontrib>Genton, Marc G.</creatorcontrib><creatorcontrib>Keyes, David E.</creatorcontrib><title>High Performance Multivariate Geospatial Statistics on Manycore Systems</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique. The latter requires the evaluation of the expensive Gaussian log-likelihood function, which has impeded the adoption of multivariate geospatial models for large multivariate spatial datasets. 
However, this large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community as it is paramount to scale computational capability to match the growth in environmental data coming from the widespread use of different data collection technologies. In this article, we develop and deploy large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both the dense and the approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation. 
The algorithm quantifies prediction performance and provides a benchmark for measuring the efficiency and accuracy of several approximation techniques in multivariate spatial modeling.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Approximation</subject><subject>Computational modeling</subject><subject>Concurrency</subject><subject>Data collection</subject><subject>Datasets</subject><subject>Gaussian log-likelihood</subject><subject>Geospatial analysis</subject><subject>geospatial statistics</subject><subject>Graphics processing units</subject><subject>high-performance computing</subject><subject>large multivariate spatial data</subject><subject>low-rank approximation</subject><subject>Mathematical model</subject><subject>Mathematical models</subject><subject>Meteorology</subject><subject>Multivariate analysis</subject><subject>multivariate modeling/prediction</subject><subject>Numerical models</subject><subject>Parameter estimation</subject><subject>Predictive models</subject><subject>Robustness (mathematics)</subject><subject>Schedules</subject><subject>Task scheduling</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kEFPAjEQhRujiYj-AONlE8-LnU677R4NKphAJAHPTSmzugR2sS0m_HuXQDy9d_jeTPIxdg98AMDLp8XsZT4QXMAAuQYp8IL1QCmTCzB42XUuVV4KKK_ZTYxrzkEqLntsNK6_vrMZhaoNW9d4yqb7Tap_XahdomxEbdy5VLtNNk9dxlT7mLVNNnXNwbeBsvkhJtrGW3ZVuU2ku3P22efb62I4zicfo_fh8yT3iEXKi6UnwVWhVYGSr5Q3S7nsGqJBY1ZalNrJitCjB1Fxjb4DS0kkCwmgOPbZ4-nuLrQ_e4rJrtt9aLqXVigJstBGqo6CE-VDG2Ogyu5CvXXhYIHboy979GWPvuzZV7d5OG1qIvrnSyy1MIB_eThlRA</recordid><startdate>20211101</startdate><enddate>20211101</enddate><creator>Salvana, Mary Lai O.</creator><creator>Abdulah, Sameh</creator><creator>Huang, Huang</creator><creator>Ltaief, Hatem</creator><creator>Sun, Ying</creator><creator>Genton, Marc G.</creator><creator>Keyes, David 
E.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-6467-2998</orcidid><orcidid>https://orcid.org/0000-0002-8850-5753</orcidid><orcidid>https://orcid.org/0000-0002-4052-7224</orcidid><orcidid>https://orcid.org/0000-0002-5950-4698</orcidid><orcidid>https://orcid.org/0000-0002-6897-1095</orcidid><orcidid>https://orcid.org/0000-0003-4868-7713</orcidid><orcidid>https://orcid.org/0000-0001-6703-4270</orcidid></search><sort><creationdate>20211101</creationdate><title>High Performance Multivariate Geospatial Statistics on Manycore Systems</title><author>Salvana, Mary Lai O. ; Abdulah, Sameh ; Huang, Huang ; Ltaief, Hatem ; Sun, Ying ; Genton, Marc G. ; Keyes, David E.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c336t-6bce2056756340d5c8b4b340338388d7297a4fe3c3c12f073c56394ee46411503</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Approximation</topic><topic>Computational modeling</topic><topic>Concurrency</topic><topic>Data collection</topic><topic>Datasets</topic><topic>Gaussian log-likelihood</topic><topic>Geospatial analysis</topic><topic>geospatial statistics</topic><topic>Graphics processing units</topic><topic>high-performance computing</topic><topic>large multivariate spatial data</topic><topic>low-rank approximation</topic><topic>Mathematical model</topic><topic>Mathematical models</topic><topic>Meteorology</topic><topic>Multivariate analysis</topic><topic>multivariate modeling/prediction</topic><topic>Numerical models</topic><topic>Parameter estimation</topic><topic>Predictive 
models</topic><topic>Robustness (mathematics)</topic><topic>Schedules</topic><topic>Task scheduling</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Salvana, Mary Lai O.</creatorcontrib><creatorcontrib>Abdulah, Sameh</creatorcontrib><creatorcontrib>Huang, Huang</creatorcontrib><creatorcontrib>Ltaief, Hatem</creatorcontrib><creatorcontrib>Sun, Ying</creatorcontrib><creatorcontrib>Genton, Marc G.</creatorcontrib><creatorcontrib>Keyes, David E.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) Online</collection><collection>IEEE/IET Electronic Library</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Salvana, Mary Lai O.</au><au>Abdulah, Sameh</au><au>Huang, Huang</au><au>Ltaief, Hatem</au><au>Sun, Ying</au><au>Genton, Marc G.</au><au>Keyes, David E.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High Performance Multivariate Geospatial Statistics on Manycore Systems</atitle><jtitle>IEEE transactions on parallel and distributed 
systems</jtitle><stitle>TPDS</stitle><date>2021-11-01</date><risdate>2021</risdate><volume>32</volume><issue>11</issue><spage>2719</spage><epage>2733</epage><pages>2719-2733</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>Modeling and inferring spatial relationships and predicting missing values of environmental data are some of the main tasks of geospatial statisticians. These routine tasks are accomplished using multivariate geospatial models and the cokriging technique. The latter requires the evaluation of the expensive Gaussian log-likelihood function, which has impeded the adoption of multivariate geospatial models for large multivariate spatial datasets. However, this large-scale cokriging challenge provides a fertile ground for supercomputing implementations for the geospatial statistics community as it is paramount to scale computational capability to match the growth in environmental data coming from the widespread use of different data collection technologies. In this article, we develop and deploy large-scale multivariate spatial modeling and inference on parallel hardware architectures. To tackle the increasing complexity in matrix operations and the massive concurrency in parallel systems, we leverage low-rank matrix approximation techniques with task-based programming models and schedule the asynchronous computational tasks using a dynamic runtime system. The proposed framework provides both the dense and the approximated computations of the Gaussian log-likelihood function. It demonstrates accuracy robustness and performance scalability on a variety of computer systems. Using both synthetic and real datasets, the low-rank matrix approximation shows better performance compared to exact computation, while preserving the application requirements in both parameter estimation and prediction accuracy. We also propose a novel algorithm to assess the prediction accuracy after the online parameter estimation. 
The algorithm quantifies prediction performance and provides a benchmark for measuring the efficiency and accuracy of several approximation techniques in multivariate spatial modeling.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2021.3071423</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0001-6467-2998</orcidid><orcidid>https://orcid.org/0000-0002-8850-5753</orcidid><orcidid>https://orcid.org/0000-0002-4052-7224</orcidid><orcidid>https://orcid.org/0000-0002-5950-4698</orcidid><orcidid>https://orcid.org/0000-0002-6897-1095</orcidid><orcidid>https://orcid.org/0000-0003-4868-7713</orcidid><orcidid>https://orcid.org/0000-0001-6703-4270</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1045-9219
ispartof IEEE transactions on parallel and distributed systems, 2021-11, Vol.32 (11), p.2719-2733
issn 1045-9219
1558-2183
language eng
recordid cdi_proquest_journals_2541467845
source IEEE/IET Electronic Library
subjects Accuracy
Algorithms
Approximation
Computational modeling
Concurrency
Data collection
Datasets
Gaussian log-likelihood
Geospatial analysis
geospatial statistics
Graphics processing units
high-performance computing
large multivariate spatial data
low-rank approximation
Mathematical model
Mathematical models
Meteorology
Multivariate analysis
multivariate modeling/prediction
Numerical models
Parameter estimation
Predictive models
Robustness (mathematics)
Schedules
Task scheduling
title High Performance Multivariate Geospatial Statistics on Manycore Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T17%3A03%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High%20Performance%20Multivariate%20Geospatial%20Statistics%20on%20Manycore%20Systems&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Salvana,%20Mary%20Lai%20O.&rft.date=2021-11-01&rft.volume=32&rft.issue=11&rft.spage=2719&rft.epage=2733&rft.pages=2719-2733&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2021.3071423&rft_dat=%3Cproquest_RIE%3E2541467845%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2541467845&rft_id=info:pmid/&rft_ieee_id=9397281&rfr_iscdi=true