High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review

Forecasting time series has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work attempts to investigate the abilities of various existing techniques when applied for short term, high granular time series forecasting of PM2.5. More speci...

Bibliographic details
Published in: The Artificial intelligence review 2022-02, Vol.55 (2), p.1253-1287
Main authors: Das, Rituparna, Middya, Asif Iqbal, Roy, Sarbani
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1287
container_issue 2
container_start_page 1253
container_title The Artificial intelligence review
container_volume 55
creator Das, Rituparna
Middya, Asif Iqbal
Roy, Sarbani
description Time series forecasting has acquired considerable research importance and has wide application in air pollution monitoring. This work investigates the abilities of various existing techniques when applied to short-term, high-granularity time series forecasting of PM2.5. More specifically, it provides a comparative study that takes into account both popular and lesser-used models in this area. The study considers ten well-defined models: ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (Markov chain first order). A framework has been built that categorizes the models, implements them under an identical execution environment, and forecasts succeeding values. The implementation has been carried out over five data sets of real-world air pollution time series, collected from five differently located government monitoring stations over a period of 1 year (July 2018-June 2019). Rigorous statistical analysis has been performed that yields insight into the nature and variability of these time series data. Forecasting has been carried out on a short-term basis, focusing on high granularity, and three different lengths of forecast horizon (1 day, 1 week, and 1 month) have been tested. Finally, the models have been compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified with multiple datasets, show that all the models possess less error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas significant performance degradation of ARIMA is found for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to be well adapted to the shape and variability of the data. Note that performance over various lengths of high-granularity forecast horizon has been studied over multiple datasets, which adds value to this work.
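The three error measures named in the description (RMSE, MAE and MAPE) can be illustrated with a minimal Python sketch. This is not the authors' code; the function names and the sample values in `observed` and `forecast` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so callers should
    filter or guard such readings beforehand.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical PM2.5 readings and forecasts (illustrative values only).
observed = [50.0, 40.0, 80.0, 100.0]
forecast = [45.0, 44.0, 80.0, 90.0]

print(f"RMSE: {rmse(observed, forecast):.3f}")
print(f"MAE:  {mae(observed, forecast):.3f}")
print(f"MAPE: {mape(observed, forecast):.2f}%")
```

RMSE penalizes large individual errors more heavily than MAE, while MAPE expresses the error relative to the observed level, which is one reason comparative studies often report all three side by side.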
doi_str_mv 10.1007/s10462-021-09991-1
format Article
fullrecord <record><control><sourceid>proquest_sprin</sourceid><recordid>TN_cdi_proquest_journals_2627871779</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2627871779</sourcerecordid><originalsourceid>FETCH-LOGICAL-p157t-14c12dfc9eff3ebe367e1d649fae8aab9347f9d915ba6bc1dc04aad265c0dccb3</originalsourceid><addsrcrecordid>eNpFkMtOwzAQRS0EEuXxA6wssXbx2ElcL1EFFKkIFrCOJs64TZUmwXbK7xMoEquRro5m7hzGbkDOQUpzF0FmhRJSgZDWWhBwwmaQGy3MlJ-ymVSFFWqh4JxdxLiTUuYq0zNWrZrNlm8CdmOLgWNX87jtQ-KJwp6nZk88Umgoct8HchhT02147_nbi5rnHJvAh75tx4Rd4oIjd_1-wICpORAPdGjo64qdeWwjXf_NS_bx-PC-XIn169Pz8n4thqloEpA5ULV3lrzXVJEuDEFdZNYjLRArqzPjbW0hr7CoHNROZoi1KnIna-cqfcluj3uH0H-OFFO568fQTSdLVSizMGCMnSh9pOIQplco_FMgyx-Z5VFmOcksf2WWoL8BbpppAg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2627871779</pqid></control><display><type>article</type><title>High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review</title><source>SpringerLink Journals</source><creator>Das, Rituparna ; Middya, Asif Iqbal ; Roy, Sarbani</creator><creatorcontrib>Das, Rituparna ; Middya, Asif Iqbal ; Roy, Sarbani</creatorcontrib><description>Forecasting time series has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work attempts to investigate the abilities of various existing techniques when applied for short term, high granular time series forecasting of PM 2.5 . More specifically, a comparative study has been provided, taking into account both popularly used models and lesser-used models in this area. 
The study considers ten well-defined models: ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (Markov chain first order). A framework has been built that categorizes the models, implements them under an identical execution environment, and forecasts succeeding values. The implementation has been carried out over five data sets of real-world air pollution time series, collected from five differently located government monitoring stations over a period of 1 year (July 2018-June 2019). Rigorous statistical analysis has been performed that yields insight into the nature and variability of these time series data. Forecasting has been carried out on a short-term basis, focusing on high granularity, and three different lengths of forecast horizon (1 day, 1 week, and 1 month) have been tested. Finally, the models have been compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified with multiple datasets, show that all the models possess less error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas significant performance degradation of ARIMA is found for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to be well adapted to the shape and variability of the data. 
Note that the performance on various length of high granular forecast horizon have been studied over multiple datasets that give an added value to this work.</description><identifier>ISSN: 0269-2821</identifier><identifier>EISSN: 1573-7462</identifier><identifier>DOI: 10.1007/s10462-021-09991-1</identifier><language>eng</language><publisher>Dordrecht: Springer Netherlands</publisher><subject>Air monitoring ; Air pollution ; Artificial Intelligence ; Artificial neural networks ; Autoregressive models ; Comparative studies ; Computer Science ; Datasets ; Decision trees ; Deep learning ; Forecasting ; Horizon ; Learning theory ; Machine learning ; Markov chains ; Performance degradation ; Pollutants ; Pollution monitoring ; Root-mean-square errors ; Short term ; Smoothing ; Statistical analysis ; Time series</subject><ispartof>The Artificial intelligence review, 2022-02, Vol.55 (2), p.1253-1287</ispartof><rights>The Author(s), under exclusive licence to Springer Nature B.V. 2021</rights><rights>The Author(s), under exclusive licence to Springer Nature B.V. 
2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0002-7598-8266</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10462-021-09991-1$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10462-021-09991-1$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Das, Rituparna</creatorcontrib><creatorcontrib>Middya, Asif Iqbal</creatorcontrib><creatorcontrib>Roy, Sarbani</creatorcontrib><title>High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review</title><title>The Artificial intelligence review</title><addtitle>Artif Intell Rev</addtitle><description>Forecasting time series has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work attempts to investigate the abilities of various existing techniques when applied for short term, high granular time series forecasting of PM 2.5 . More specifically, a comparative study has been provided, taking into account both popularly used models and lesser-used models in this area. The study has been carried out considering ten well defined models that are ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (markov chain first order). A framework has been built that categories the models, implements them under identical execution environment and forecasts succeeding values. 
The implementation has been carried out over five data sets of real-world air pollution time series, collected from five differently located government monitoring stations over a period of 1 year (July 2018-June 2019). Rigorous statistical analysis has been performed that yields insight into the nature and variability of these time series data. Forecasting has been carried out on a short-term basis, focusing on high granularity, and three different lengths of forecast horizon (1 day, 1 week, and 1 month) have been tested. Finally, the models have been compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified with multiple datasets, show that all the models possess less error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas significant performance degradation of ARIMA is found for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to be well adapted to the shape and variability of the data. 
Note that the performance on various length of high granular forecast horizon have been studied over multiple datasets that give an added value to this work.</description><subject>Air monitoring</subject><subject>Air pollution</subject><subject>Artificial Intelligence</subject><subject>Artificial neural networks</subject><subject>Autoregressive models</subject><subject>Comparative studies</subject><subject>Computer Science</subject><subject>Datasets</subject><subject>Decision trees</subject><subject>Deep learning</subject><subject>Forecasting</subject><subject>Horizon</subject><subject>Learning theory</subject><subject>Machine learning</subject><subject>Markov chains</subject><subject>Performance degradation</subject><subject>Pollutants</subject><subject>Pollution monitoring</subject><subject>Root-mean-square errors</subject><subject>Short term</subject><subject>Smoothing</subject><subject>Statistical analysis</subject><subject>Time series</subject><issn>0269-2821</issn><issn>1573-7462</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNpFkMtOwzAQRS0EEuXxA6wssXbx2ElcL1EFFKkIFrCOJs64TZUmwXbK7xMoEquRro5m7hzGbkDOQUpzF0FmhRJSgZDWWhBwwmaQGy3MlJ-ymVSFFWqh4JxdxLiTUuYq0zNWrZrNlm8CdmOLgWNX87jtQ-KJwp6nZk88Umgoct8HchhT02147_nbi5rnHJvAh75tx4Rd4oIjd_1-wICpORAPdGjo64qdeWwjXf_NS_bx-PC-XIn169Pz8n4thqloEpA5ULV3lrzXVJEuDEFdZNYjLRArqzPjbW0hr7CoHNROZoi1KnIna-cqfcluj3uH0H-OFFO568fQTSdLVSizMGCMnSh9pOIQplco_FMgyx-Z5VFmOcksf2WWoL8BbpppAg</recordid><startdate>20220201</startdate><enddate>20220201</enddate><creator>Das, Rituparna</creator><creator>Middya, Asif Iqbal</creator><creator>Roy, Sarbani</creator><general>Springer Netherlands</general><general>Springer Nature 
B.V</general><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>CNYFK</scope><scope>DWQXO</scope><scope>E3H</scope><scope>F2A</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M1O</scope><scope>P5Z</scope><scope>P62</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PKEHL</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRQQA</scope><scope>PSYQQ</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0002-7598-8266</orcidid></search><sort><creationdate>20220201</creationdate><title>High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review</title><author>Das, Rituparna ; Middya, Asif Iqbal ; Roy, Sarbani</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p157t-14c12dfc9eff3ebe367e1d649fae8aab9347f9d915ba6bc1dc04aad265c0dccb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Air monitoring</topic><topic>Air pollution</topic><topic>Artificial Intelligence</topic><topic>Artificial neural networks</topic><topic>Autoregressive models</topic><topic>Comparative studies</topic><topic>Computer Science</topic><topic>Datasets</topic><topic>Decision trees</topic><topic>Deep learning</topic><topic>Forecasting</topic><topic>Horizon</topic><topic>Learning 
theory</topic><topic>Machine learning</topic><topic>Markov chains</topic><topic>Performance degradation</topic><topic>Pollutants</topic><topic>Pollution monitoring</topic><topic>Root-mean-square errors</topic><topic>Short term</topic><topic>Smoothing</topic><topic>Statistical analysis</topic><topic>Time series</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Das, Rituparna</creatorcontrib><creatorcontrib>Middya, Asif Iqbal</creatorcontrib><creatorcontrib>Roy, Sarbani</creatorcontrib><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Global (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Social Science Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>Library &amp; Information Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; 
Information Science Abstracts (LISA)</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Global</collection><collection>Computing Database</collection><collection>Library Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied &amp; Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest One Social Sciences</collection><collection>ProQuest One Psychology</collection><collection>ProQuest Central Basic</collection><jtitle>The Artificial intelligence review</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Das, Rituparna</au><au>Middya, Asif Iqbal</au><au>Roy, 
Sarbani</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review</atitle><jtitle>The Artificial intelligence review</jtitle><stitle>Artif Intell Rev</stitle><date>2022-02-01</date><risdate>2022</risdate><volume>55</volume><issue>2</issue><spage>1253</spage><epage>1287</epage><pages>1253-1287</pages><issn>0269-2821</issn><eissn>1573-7462</eissn><abstract>Forecasting time series has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work attempts to investigate the abilities of various existing techniques when applied for short term, high granular time series forecasting of PM 2.5 . More specifically, a comparative study has been provided, taking into account both popularly used models and lesser-used models in this area. The study has been carried out considering ten well defined models that are ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (markov chain first order). A framework has been built that categories the models, implements them under identical execution environment and forecasts succeeding values. Implementation has been carried out over five data sets of real-world air pollution time series, that are collected from five differently located government setup monitoring stations over a period of 1 year (July 2018-June 2019). Rigorous statistical analysis has been performed that yields an insight to the nature and variability of these time series data. 
Forecasting has been carried out on a short-term basis, focusing on high granularity, and three different lengths of forecast horizon (1 day, 1 week, and 1 month) have been tested. Finally, the models have been compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified with multiple datasets, show that all the models possess less error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas significant performance degradation of ARIMA is found for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to be well adapted to the shape and variability of the data. Note that performance over various lengths of high-granularity forecast horizon has been studied over multiple datasets, which adds value to this work.</abstract><cop>Dordrecht</cop><pub>Springer Netherlands</pub><doi>10.1007/s10462-021-09991-1</doi><tpages>35</tpages><orcidid>https://orcid.org/0000-0002-7598-8266</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0269-2821
ispartof The Artificial intelligence review, 2022-02, Vol.55 (2), p.1253-1287
issn 0269-2821
1573-7462
language eng
recordid cdi_proquest_journals_2627871779
source SpringerLink Journals
subjects Air monitoring
Air pollution
Artificial Intelligence
Artificial neural networks
Autoregressive models
Comparative studies
Computer Science
Datasets
Decision trees
Deep learning
Forecasting
Horizon
Learning theory
Machine learning
Markov chains
Performance degradation
Pollutants
Pollution monitoring
Root-mean-square errors
Short term
Smoothing
Statistical analysis
Time series
title High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T23%3A10%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High%20granular%20and%20short%20term%20time%20series%20forecasting%20of%20PM2.5%20air%20pollutant%20-%20a%20comparative%20review&rft.jtitle=The%20Artificial%20intelligence%20review&rft.au=Das,%20Rituparna&rft.date=2022-02-01&rft.volume=55&rft.issue=2&rft.spage=1253&rft.epage=1287&rft.pages=1253-1287&rft.issn=0269-2821&rft.eissn=1573-7462&rft_id=info:doi/10.1007/s10462-021-09991-1&rft_dat=%3Cproquest_sprin%3E2627871779%3C/proquest_sprin%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2627871779&rft_id=info:pmid/&rfr_iscdi=true