High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review

Forecasting time series has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work attempts to investigate the abilities of various existing techniques when applied to short-term, high-granularity time series forecasting of PM2.5. More speci...

Bibliographic details
Published in:	The Artificial intelligence review, 2022-02, Vol.55 (2), p.1253-1287
Main authors:	Das, Rituparna; Middya, Asif Iqbal; Roy, Sarbani
Format:	Article
Language:	eng
Subjects:	Air monitoring; Air pollution; Artificial Intelligence; Artificial neural networks; Autoregressive models; Comparative studies; Computer Science; Datasets; Decision trees; Deep learning; Forecasting; Horizon; Learning theory; Machine learning; Markov chains; Performance degradation; Pollutants; Pollution monitoring; Root-mean-square errors; Short term; Smoothing; Statistical analysis; Time series
Online access:	Full text

container_end_page	1287
container_issue	2
container_start_page	1253
container_title	The Artificial intelligence review
container_volume	55
creator	Das, Rituparna; Middya, Asif Iqbal; Roy, Sarbani
description	Forecasting time series has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work investigates the abilities of various existing techniques when applied to short-term, high-granularity time series forecasting of PM2.5. More specifically, a comparative study is provided, taking into account both popularly used and lesser-used models in this area. The study considers ten well-defined models: ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (Markov chain first order). A framework has been built that categorizes the models, implements them under an identical execution environment and forecasts succeeding values. Implementation has been carried out over five data sets of real-world air pollution time series, collected from five differently located government-established monitoring stations over a period of 1 year (July 2018-June 2019). Rigorous statistical analysis has been performed that yields insight into the nature and variability of these time series data. Forecasting has been carried out on a short-term basis, focusing on high granularity, and three different lengths of forecast horizon (1 day, 1 week, and 1 month) have been tested. Finally, the models have been compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified with multiple datasets, show that all the models possess less error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas significant performance degradation of ARIMA is found for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to adapt well to the shape and variability of the data. Note that performance over various lengths of high-granularity forecast horizon has been studied over multiple datasets, which adds value to this work.
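The three error measures named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the sample values below are hypothetical, and the formulas are the standard textbook definitions of RMSE, MAE and MAPE.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so such inputs are rejected.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical PM2.5 readings and forecasts, for illustration only.
observed = [40.0, 50.0, 80.0, 100.0]
forecast = [44.0, 45.0, 88.0, 90.0]

print(f"RMSE: {rmse(observed, forecast):.3f}")
print(f"MAE:  {mae(observed, forecast):.3f}")
print(f"MAPE: {mape(observed, forecast):.2f}%")
```

RMSE penalizes large residuals more heavily than MAE, while MAPE expresses the error relative to the observed level, which is one plausible reason a comparison would report all three.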
doi_str_mv	10.1007/s10462-021-09991-1
format	Article
fullrecord	<record><control><sourceid>proquest_sprin</sourceid><recordid>TN_cdi_proquest_journals_2627871779</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2627871779</sourcerecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2627871779</pqid></control><display><type>article</type><title>High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review</title><source>SpringerLink Journals</source><creator>Das, Rituparna ; Middya, Asif Iqbal ; Roy, Sarbani</creator><identifier>ISSN: 0269-2821</identifier><identifier>EISSN: 1573-7462</identifier><identifier>DOI: 10.1007/s10462-021-09991-1</identifier><language>eng</language><publisher>Dordrecht: Springer Netherlands</publisher><ispartof>The Artificial intelligence review, 2022-02, Vol.55 (2), p.1253-1287</ispartof><rights>The Author(s), under exclusive licence to Springer Nature B.V. 2021</rights><lds50>peer_reviewed</lds50><orcidid>0000-0002-7598-8266</orcidid></display><links><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10462-021-09991-1$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10462-021-09991-1$$EHTML$$P50$$Gspringer$$H</linktohtml></links><addata><au>Das, Rituparna</au><au>Middya, Asif Iqbal</au><au>Roy, Sarbani</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review</atitle><jtitle>The Artificial intelligence review</jtitle><stitle>Artif Intell Rev</stitle><date>2022-02-01</date><risdate>2022</risdate><volume>55</volume><issue>2</issue><spage>1253</spage><epage>1287</epage><pages>1253-1287</pages><issn>0269-2821</issn><eissn>1573-7462</eissn><cop>Dordrecht</cop><pub>Springer Netherlands</pub><doi>10.1007/s10462-021-09991-1</doi><tpages>35</tpages><orcidid>https://orcid.org/0000-0002-7598-8266</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0269-2821
ispartof	The Artificial intelligence review, 2022-02, Vol.55 (2), p.1253-1287
issn	0269-2821 1573-7462
language	eng
recordid	cdi_proquest_journals_2627871779
source	SpringerLink Journals
subjects	Air monitoring; Air pollution; Artificial Intelligence; Artificial neural networks; Autoregressive models; Comparative studies; Computer Science; Datasets; Decision trees; Deep learning; Forecasting; Horizon; Learning theory; Machine learning; Markov chains; Performance degradation; Pollutants; Pollution monitoring; Root-mean-square errors; Short term; Smoothing; Statistical analysis; Time series
title	High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T23%3A10%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_sprin&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High%20granular%20and%20short%20term%20time%20series%20forecasting%20of%20PM2.5%20air%20pollutant%20-%20a%20comparative%20review&rft.jtitle=The%20Artificial%20intelligence%20review&rft.au=Das,%20Rituparna&rft.date=2022-02-01&rft.volume=55&rft.issue=2&rft.spage=1253&rft.epage=1287&rft.pages=1253-1287&rft.issn=0269-2821&rft.eissn=1573-7462&rft_id=info:doi/10.1007/s10462-021-09991-1&rft_dat=%3Cproquest_sprin%3E2627871779%3C/proquest_sprin%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2627871779&rft_id=info:pmid/&rfr_iscdi=true