Machine learning models to complete rainfall time series databases affected by missing or anomalous data

In recent years, artificial intelligence has spread rapidly in the geosciences, thanks to the availability of large amounts of data. In particular, the development of automatic raingauge networks provides the rainfall data that make these techniques effective, although the performance of artif...

Bibliographic details
Published in: Earth science informatics 2023-12, Vol.16 (4), p.3717-3728
Main authors: Lupi, Andrea, Luppichini, Marco, Barsanti, Michele, Bini, Monica, Giannecchini, Roberto
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 3728
container_issue 4
container_start_page 3717
container_title Earth science informatics
container_volume 16
creator Lupi, Andrea
Luppichini, Marco
Barsanti, Michele
Bini, Monica
Giannecchini, Roberto
description In recent years, artificial intelligence has spread rapidly in the geosciences, thanks to the availability of large amounts of data. In particular, the development of automatic raingauge networks provides the rainfall data that make these techniques effective, although the performance of artificial intelligence models depends on the coherence and quality of the input data. In this work, we aimed to provide machine learning models capable of predicting rainfall data from the values of the nearest raingauges at one historic time point. We also investigated the influence of anomalous input data on the prediction of rainfall data. We pursued these goals by applying machine learning models based on Linear Regression, LSTM and CNN architectures to several raingauges in Tuscany (central Italy). More than 75% of the cases show an R² higher than 0.65 and an MAE lower than 4 mm. As expected, we found a strong influence of the input data on the prediction capacity of the models. We quantified the model inaccuracy using Pearson's correlation. Measurement anomalies in time series cause major errors in deep learning models. These anomalous data may be due to several factors, such as temporary malfunctions of raingauges or weather conditions. We showed that, in both cases, the data-driven model features could highlight these situations, allowing better management of the raingauge network and rainfall databases.
doi_str_mv 10.1007/s12145-023-01122-4
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2899520290</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2899520290</sourcerecordid><originalsourceid>FETCH-LOGICAL-c363t-1b32dff89b156f986c28f7f4c959cd0e68572caf77b71672e7f3bfbb0405c5953</originalsourceid><addsrcrecordid>eNp9kEtLxDAUhYsoOIzzB1wFXFfzaJpkKYMvGHGj65CkNzORthmTzmL-vdGK7lzdw-U758CpqkuCrwnG4iYTShpeY8pqTAildXNSLYhsy6uR5PRXC3ZerXIOFjNCW0apXFS7Z-N2YQTUg0ljGLdoiB30GU0RuTjse5gAJRNGb_oeTWEAlCEFyKgzk7EmF2W8BzdBh-wRDaEUlJSYkBnjYPp4mNGL6qxEZFj93GX1dn_3un6sNy8PT-vbTe1Yy6aaWEY776WyhLdeydZR6YVvnOLKdRhayQV1xgthBWkFBeGZ9dbiBnPHFWfL6mrO3af4cYA86fd4SGOp1FQqxSmmCheKzpRLMecEXu9TGEw6aoL116h6HlWXUfX3qLopJjabcoHHLaS_6H9cn6akes4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2899520290</pqid></control><display><type>article</type><title>Machine learning models to complete rainfall time series databases affected by missing or anomalous data</title><source>Springer Nature - Complete Springer Journals</source><creator>Lupi, Andrea ; Luppichini, Marco ; Barsanti, Michele ; Bini, Monica ; Giannecchini, Roberto</creator><creatorcontrib>Lupi, Andrea ; Luppichini, Marco ; Barsanti, Michele ; Bini, Monica ; Giannecchini, Roberto</creatorcontrib><description>In recent years, artificial intelligence in geosciences is spreading more and more, thanks to the availability of a large amount of data. In particular, the development of automatic raingauges networks allows to get rainfall data and makes these techniques effective, even if the performance of artificial intelligence models is a consequence of the coherency and quality of the input data. In this work, we intended to provide machine learning models capable of predicting rainfall data starting from the values of the nearest raingauges at one historic time point. 
Moreover, we investigated the influence of the anomalous input data on the prediction of rainfall data. We pursued these goals by applying machine learning models based on Linear Regression, LSTM and CNN architectures to several raingauges in Tuscany (central Italy). More than 75% of the cases show an R2 higher than 0.65 and a MAE lower than 4 mm. As expected, we emphasized a strong influence of the input data on the prediction capacity of the models. We quantified the model inaccuracy using the Pearson's correlation. Measurement anomalies in time series cause major errors in deep learning models. These anomalous data may be due to several factors such as temporary malfunctions of raingauges or weather conditions. We showed that, in both cases, the data-driven model features could highlight these situations, allowing a better management of the raingauges network and rainfall databases.</description><identifier>ISSN: 1865-0473</identifier><identifier>EISSN: 1865-0481</identifier><identifier>DOI: 10.1007/s12145-023-01122-4</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Anomalies ; Artificial intelligence ; Deep learning ; Earth and Environmental Science ; Earth Sciences ; Earth System Sciences ; Hydrologic data ; Information Systems Applications (incl.Internet) ; Machine learning ; Modelling ; Ontology ; Rainfall ; Rainfall data ; Rainfall forecasting ; Simulation and Modeling ; Space Exploration and Astronautics ; Space Sciences (including Extraterrestrial Physics ; Time series ; Weather ; Weather conditions</subject><ispartof>Earth science informatics, 2023-12, Vol.16 (4), p.3717-3728</ispartof><rights>The Author(s) 2023</rights><rights>The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c363t-1b32dff89b156f986c28f7f4c959cd0e68572caf77b71672e7f3bfbb0405c5953</citedby><cites>FETCH-LOGICAL-c363t-1b32dff89b156f986c28f7f4c959cd0e68572caf77b71672e7f3bfbb0405c5953</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s12145-023-01122-4$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s12145-023-01122-4$$EHTML$$P50$$Gspringer$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Lupi, Andrea</creatorcontrib><creatorcontrib>Luppichini, Marco</creatorcontrib><creatorcontrib>Barsanti, Michele</creatorcontrib><creatorcontrib>Bini, Monica</creatorcontrib><creatorcontrib>Giannecchini, Roberto</creatorcontrib><title>Machine learning models to complete rainfall time series databases affected by missing or anomalous data</title><title>Earth science informatics</title><addtitle>Earth Sci Inform</addtitle><description>In recent years, artificial intelligence in geosciences is spreading more and more, thanks to the availability of a large amount of data. In particular, the development of automatic raingauges networks allows to get rainfall data and makes these techniques effective, even if the performance of artificial intelligence models is a consequence of the coherency and quality of the input data. In this work, we intended to provide machine learning models capable of predicting rainfall data starting from the values of the nearest raingauges at one historic time point. 
Moreover, we investigated the influence of the anomalous input data on the prediction of rainfall data. We pursued these goals by applying machine learning models based on Linear Regression, LSTM and CNN architectures to several raingauges in Tuscany (central Italy). More than 75% of the cases show an R2 higher than 0.65 and a MAE lower than 4 mm. As expected, we emphasized a strong influence of the input data on the prediction capacity of the models. We quantified the model inaccuracy using the Pearson's correlation. Measurement anomalies in time series cause major errors in deep learning models. These anomalous data may be due to several factors such as temporary malfunctions of raingauges or weather conditions. We showed that, in both cases, the data-driven model features could highlight these situations, allowing a better management of the raingauges network and rainfall databases.</description><subject>Anomalies</subject><subject>Artificial intelligence</subject><subject>Deep learning</subject><subject>Earth and Environmental Science</subject><subject>Earth Sciences</subject><subject>Earth System Sciences</subject><subject>Hydrologic data</subject><subject>Information Systems Applications (incl.Internet)</subject><subject>Machine learning</subject><subject>Modelling</subject><subject>Ontology</subject><subject>Rainfall</subject><subject>Rainfall data</subject><subject>Rainfall forecasting</subject><subject>Simulation and Modeling</subject><subject>Space Exploration and Astronautics</subject><subject>Space Sciences (including Extraterrestrial Physics</subject><subject>Time series</subject><subject>Weather</subject><subject>Weather 
conditions</subject><issn>1865-0473</issn><issn>1865-0481</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>BENPR</sourceid><recordid>eNp9kEtLxDAUhYsoOIzzB1wFXFfzaJpkKYMvGHGj65CkNzORthmTzmL-vdGK7lzdw-U758CpqkuCrwnG4iYTShpeY8pqTAildXNSLYhsy6uR5PRXC3ZerXIOFjNCW0apXFS7Z-N2YQTUg0ljGLdoiB30GU0RuTjse5gAJRNGb_oeTWEAlCEFyKgzk7EmF2W8BzdBh-wRDaEUlJSYkBnjYPp4mNGL6qxEZFj93GX1dn_3un6sNy8PT-vbTe1Yy6aaWEY776WyhLdeydZR6YVvnOLKdRhayQV1xgthBWkFBeGZ9dbiBnPHFWfL6mrO3af4cYA86fd4SGOp1FQqxSmmCheKzpRLMecEXu9TGEw6aoL116h6HlWXUfX3qLopJjabcoHHLaS_6H9cn6akes4</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Lupi, Andrea</creator><creator>Luppichini, Marco</creator><creator>Barsanti, Michele</creator><creator>Bini, Monica</creator><creator>Giannecchini, Roberto</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>C6C</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7TG</scope><scope>7XB</scope><scope>88I</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>BKSAR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>KL.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M2P</scope><scope>P5Z</scope><scope>P62</scope><scope>PCBAR</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PKEHL</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope></search><sort><creationdate>20231201</creationdate><title>Machine learning models to complete rainfall time series databases affected by missing or anomalous 
data</title><author>Lupi, Andrea ; Luppichini, Marco ; Barsanti, Michele ; Bini, Monica ; Giannecchini, Roberto</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c363t-1b32dff89b156f986c28f7f4c959cd0e68572caf77b71672e7f3bfbb0405c5953</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Anomalies</topic><topic>Artificial intelligence</topic><topic>Deep learning</topic><topic>Earth and Environmental Science</topic><topic>Earth Sciences</topic><topic>Earth System Sciences</topic><topic>Hydrologic data</topic><topic>Information Systems Applications (incl.Internet)</topic><topic>Machine learning</topic><topic>Modelling</topic><topic>Ontology</topic><topic>Rainfall</topic><topic>Rainfall data</topic><topic>Rainfall forecasting</topic><topic>Simulation and Modeling</topic><topic>Space Exploration and Astronautics</topic><topic>Space Sciences (including Extraterrestrial Physics</topic><topic>Time series</topic><topic>Weather</topic><topic>Weather conditions</topic><toplevel>online_resources</toplevel><creatorcontrib>Lupi, Andrea</creatorcontrib><creatorcontrib>Luppichini, Marco</creatorcontrib><creatorcontrib>Barsanti, Michele</creatorcontrib><creatorcontrib>Bini, Monica</creatorcontrib><creatorcontrib>Giannecchini, Roberto</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>Meteorological &amp; Geoastrophysical Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Science Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology 
Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Earth, Atmospheric &amp; Aquatic Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Meteorological &amp; Geoastrophysical Abstracts - Academic</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Earth, Atmospheric &amp; Aquatic Science Database</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied &amp; Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central 
Basic</collection><jtitle>Earth science informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lupi, Andrea</au><au>Luppichini, Marco</au><au>Barsanti, Michele</au><au>Bini, Monica</au><au>Giannecchini, Roberto</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Machine learning models to complete rainfall time series databases affected by missing or anomalous data</atitle><jtitle>Earth science informatics</jtitle><stitle>Earth Sci Inform</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>16</volume><issue>4</issue><spage>3717</spage><epage>3728</epage><pages>3717-3728</pages><issn>1865-0473</issn><eissn>1865-0481</eissn><abstract>In recent years, artificial intelligence in geosciences is spreading more and more, thanks to the availability of a large amount of data. In particular, the development of automatic raingauges networks allows to get rainfall data and makes these techniques effective, even if the performance of artificial intelligence models is a consequence of the coherency and quality of the input data. In this work, we intended to provide machine learning models capable of predicting rainfall data starting from the values of the nearest raingauges at one historic time point. Moreover, we investigated the influence of the anomalous input data on the prediction of rainfall data. We pursued these goals by applying machine learning models based on Linear Regression, LSTM and CNN architectures to several raingauges in Tuscany (central Italy). More than 75% of the cases show an R2 higher than 0.65 and a MAE lower than 4 mm. As expected, we emphasized a strong influence of the input data on the prediction capacity of the models. We quantified the model inaccuracy using the Pearson's correlation. Measurement anomalies in time series cause major errors in deep learning models. 
These anomalous data may be due to several factors such as temporary malfunctions of raingauges or weather conditions. We showed that, in both cases, the data-driven model features could highlight these situations, allowing a better management of the raingauges network and rainfall databases.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s12145-023-01122-4</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1865-0473
ispartof Earth science informatics, 2023-12, Vol.16 (4), p.3717-3728
issn 1865-0473
1865-0481
language eng
recordid cdi_proquest_journals_2899520290
source Springer Nature - Complete Springer Journals
subjects Anomalies
Artificial intelligence
Deep learning
Earth and Environmental Science
Earth Sciences
Earth System Sciences
Hydrologic data
Information Systems Applications (incl.Internet)
Machine learning
Modelling
Ontology
Rainfall
Rainfall data
Rainfall forecasting
Simulation and Modeling
Space Sciences (including Extraterrestrial Physics, Space Exploration and Astronautics)
Time series
Weather
Weather conditions
title Machine learning models to complete rainfall time series databases affected by missing or anomalous data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T20%3A23%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20learning%20models%20to%20complete%20rainfall%20time%20series%20databases%20affected%20by%20missing%20or%20anomalous%20data&rft.jtitle=Earth%20science%20informatics&rft.au=Lupi,%20Andrea&rft.date=2023-12-01&rft.volume=16&rft.issue=4&rft.spage=3717&rft.epage=3728&rft.pages=3717-3728&rft.issn=1865-0473&rft.eissn=1865-0481&rft_id=info:doi/10.1007/s12145-023-01122-4&rft_dat=%3Cproquest_cross%3E2899520290%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2899520290&rft_id=info:pmid/&rfr_iscdi=true