A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection

Air pollution is considered one of the biggest threats for the ecological system and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of Machine Learning techniques justifies the application of statistical approaches for...

Bibliographic details
Published in: The Science of the total environment 2020-05, Vol.715, p.136991-136991, Article 136991
Main authors: Masmoudi, Sahar; Elghazel, Haytham; Taieb, Dalila; Yazar, Orhan; Kallel, Amjad
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 136991
container_issue
container_start_page 136991
container_title The Science of the total environment
container_volume 715
creator Masmoudi, Sahar
Elghazel, Haytham
Taieb, Dalila
Yazar, Orhan
Kallel, Amjad
description Air pollution is considered one of the biggest threats for the ecological system and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of Machine Learning techniques justifies the application of statistical approaches for environmental modeling, especially in air quality forecasting. In this context, we propose a novel feature ranking method, termed as Ensemble of Regressor Chains-guided Feature Ranking (ERCFR) to forecast multiple air pollutants simultaneously over two cities. This approach is based on a combination of one of the most powerful ensemble methods for Multi-Target Regression problems (Ensemble of Regressor Chains) and the Random Forest permutation importance measure. Thus, feature selection allowed the model to obtain the best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach compared to other state-of-the-art methods, although some cautions have to be considered to improve the runtime performance and to decrease its sensitivity over extreme and outlier values. • Forecasting multiple air pollutant concentrations simultaneously. • The combination of Multi-Target Regression method and the Random Forest paradigm. • The proposed method ensures better performance in air quality forecast.
doi_str_mv 10.1016/j.scitotenv.2020.136991
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2353584101</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0048969720305015</els_id><sourcerecordid>2353584101</sourcerecordid><originalsourceid>FETCH-LOGICAL-c371t-a8c7896cad38b8867e1e8c41e97f011991d99affaf55ac97bc689c46c9d8b5b03</originalsourceid><addsrcrecordid>eNqFkc9u1DAQxi0Eots_rwC-wSVbO9nE9nFVFahUiQucrYkzWbw49mI7i3gA3ruOUnrFF0sz3zej-X2EvOdsyxnvbo_bZGwOGf15W7O6VJtOKf6KbLgUquKs7l6TDWM7WalOiQtymdKRlSckf0sumprtOBNqQ_7u6QTmh_VYOYTorT_QMcKEv0P8SccQ6SniYE1eGtPssj05pGBLPTg3Z_A5faAmeIM-R8g2-ETPFlZtlSEeMNOIh4gplSYFP9ARIc8RaUKHZrFckzcjuIQ3z_8V-f7p_tvdl-rx6-eHu_1jZRrBcwXSCKk6A0Mjeyk7gRyl2XFUYmScFwCDUjCOMLYtGCV600lldp1Rg-zbnjVX5OM69xTDrxlT1pNNBp0Dj2FOum7appUFDS9SsUpNDClFHPUp2gniH82ZXjLQR_2SgV4y0GsGxfnuecncTzi8-P5BL4L9KsBy6tliXAZhITjYWHjoIdj_LnkCE_GgzA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2353584101</pqid></control><display><type>article</type><title>A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Masmoudi, Sahar ; Elghazel, Haytham ; Taieb, Dalila ; Yazar, Orhan ; Kallel, Amjad</creator><creatorcontrib>Masmoudi, Sahar ; Elghazel, Haytham ; Taieb, Dalila ; Yazar, Orhan ; Kallel, Amjad</creatorcontrib><description>Air pollution is considered one of the biggest threats for the ecological system and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of Machine Learning techniques justifies the application of statistical approaches for environmental modeling, especially in air quality forecasting. 
In this context, we propose a novel feature ranking method, termed as Ensemble of Regressor Chains-guided Feature Ranking (ERCFR) to forecast multiple air pollutants simultaneously over two cities. This approach is based on a combination of one of the most powerful ensemble methods for Multi-Target Regression problems (Ensemble of Regressor Chains) and the Random Forest permutation importance measure. Thus, feature selection allowed the model to obtain the best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach compared to other state-of-the-art methods, although some cautions have to be considered to improve the runtime performance and to decrease its sensitivity over extreme and outlier values. [Display omitted] •Forecasting multiple air pollutant concentrations simultaneously.•The combination of Multi-Target Regression method and the Random Forest paradigm.•The proposed method ensures better performance in air quality forecast.</description><identifier>ISSN: 0048-9697</identifier><identifier>EISSN: 1879-1026</identifier><identifier>DOI: 10.1016/j.scitotenv.2020.136991</identifier><identifier>PMID: 32041079</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>Air pollution ; Feature ranking ; Forecasting ; Machine learning ; Multi-target regression (MTR)</subject><ispartof>The Science of the total environment, 2020-05, Vol.715, p.136991-136991, Article 136991</ispartof><rights>2020 Elsevier B.V.</rights><rights>Copyright © 2020 Elsevier B.V. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c371t-a8c7896cad38b8867e1e8c41e97f011991d99affaf55ac97bc689c46c9d8b5b03</citedby><cites>FETCH-LOGICAL-c371t-a8c7896cad38b8867e1e8c41e97f011991d99affaf55ac97bc689c46c9d8b5b03</cites><orcidid>0000-0003-0167-0228</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.scitotenv.2020.136991$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32041079$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Masmoudi, Sahar</creatorcontrib><creatorcontrib>Elghazel, Haytham</creatorcontrib><creatorcontrib>Taieb, Dalila</creatorcontrib><creatorcontrib>Yazar, Orhan</creatorcontrib><creatorcontrib>Kallel, Amjad</creatorcontrib><title>A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection</title><title>The Science of the total environment</title><addtitle>Sci Total Environ</addtitle><description>Air pollution is considered one of the biggest threats for the ecological system and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of Machine Learning techniques justifies the application of statistical approaches for environmental modeling, especially in air quality forecasting. In this context, we propose a novel feature ranking method, termed as Ensemble of Regressor Chains-guided Feature Ranking (ERCFR) to forecast multiple air pollutants simultaneously over two cities. 
This approach is based on a combination of one of the most powerful ensemble methods for Multi-Target Regression problems (Ensemble of Regressor Chains) and the Random Forest permutation importance measure. Thus, feature selection allowed the model to obtain the best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach compared to other state-of-the-art methods, although some cautions have to be considered to improve the runtime performance and to decrease its sensitivity over extreme and outlier values. [Display omitted] •Forecasting multiple air pollutant concentrations simultaneously.•The combination of Multi-Target Regression method and the Random Forest paradigm.•The proposed method ensures better performance in air quality forecast.</description><subject>Air pollution</subject><subject>Feature ranking</subject><subject>Forecasting</subject><subject>Machine learning</subject><subject>Multi-target regression (MTR)</subject><issn>0048-9697</issn><issn>1879-1026</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNqFkc9u1DAQxi0Eots_rwC-wSVbO9nE9nFVFahUiQucrYkzWbw49mI7i3gA3ruOUnrFF0sz3zej-X2EvOdsyxnvbo_bZGwOGf15W7O6VJtOKf6KbLgUquKs7l6TDWM7WalOiQtymdKRlSckf0sumprtOBNqQ_7u6QTmh_VYOYTorT_QMcKEv0P8SccQ6SniYE1eGtPssj05pGBLPTg3Z_A5faAmeIM-R8g2-ETPFlZtlSEeMNOIh4gplSYFP9ARIc8RaUKHZrFckzcjuIQ3z_8V-f7p_tvdl-rx6-eHu_1jZRrBcwXSCKk6A0Mjeyk7gRyl2XFUYmScFwCDUjCOMLYtGCV600lldp1Rg-zbnjVX5OM69xTDrxlT1pNNBp0Dj2FOum7appUFDS9SsUpNDClFHPUp2gniH82ZXjLQR_2SgV4y0GsGxfnuecncTzi8-P5BL4L9KsBy6tliXAZhITjYWHjoIdj_LnkCE_GgzA</recordid><startdate>20200501</startdate><enddate>20200501</enddate><creator>Masmoudi, Sahar</creator><creator>Elghazel, Haytham</creator><creator>Taieb, Dalila</creator><creator>Yazar, Orhan</creator><creator>Kallel, Amjad</creator><general>Elsevier 
B.V</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-0167-0228</orcidid></search><sort><creationdate>20200501</creationdate><title>A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection</title><author>Masmoudi, Sahar ; Elghazel, Haytham ; Taieb, Dalila ; Yazar, Orhan ; Kallel, Amjad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c371t-a8c7896cad38b8867e1e8c41e97f011991d99affaf55ac97bc689c46c9d8b5b03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Air pollution</topic><topic>Feature ranking</topic><topic>Forecasting</topic><topic>Machine learning</topic><topic>Multi-target regression (MTR)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Masmoudi, Sahar</creatorcontrib><creatorcontrib>Elghazel, Haytham</creatorcontrib><creatorcontrib>Taieb, Dalila</creatorcontrib><creatorcontrib>Yazar, Orhan</creatorcontrib><creatorcontrib>Kallel, Amjad</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>The Science of the total environment</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Masmoudi, Sahar</au><au>Elghazel, Haytham</au><au>Taieb, Dalila</au><au>Yazar, Orhan</au><au>Kallel, Amjad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection</atitle><jtitle>The Science of the total environment</jtitle><addtitle>Sci Total 
Environ</addtitle><date>2020-05-01</date><risdate>2020</risdate><volume>715</volume><spage>136991</spage><epage>136991</epage><pages>136991-136991</pages><artnum>136991</artnum><issn>0048-9697</issn><eissn>1879-1026</eissn><abstract>Air pollution is considered one of the biggest threats for the ecological system and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of Machine Learning techniques justifies the application of statistical approaches for environmental modeling, especially in air quality forecasting. In this context, we propose a novel feature ranking method, termed as Ensemble of Regressor Chains-guided Feature Ranking (ERCFR) to forecast multiple air pollutants simultaneously over two cities. This approach is based on a combination of one of the most powerful ensemble methods for Multi-Target Regression problems (Ensemble of Regressor Chains) and the Random Forest permutation importance measure. Thus, feature selection allowed the model to obtain the best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach compared to other state-of-the-art methods, although some cautions have to be considered to improve the runtime performance and to decrease its sensitivity over extreme and outlier values. [Display omitted] •Forecasting multiple air pollutant concentrations simultaneously.•The combination of Multi-Target Regression method and the Random Forest paradigm.•The proposed method ensures better performance in air quality forecast.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>32041079</pmid><doi>10.1016/j.scitotenv.2020.136991</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0003-0167-0228</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0048-9697
ispartof The Science of the total environment, 2020-05, Vol.715, p.136991-136991, Article 136991
issn 0048-9697
1879-1026
language eng
recordid cdi_proquest_miscellaneous_2353584101
source Elsevier ScienceDirect Journals Complete
subjects Air pollution
Feature ranking
Forecasting
Machine learning
Multi-target regression (MTR)
title A machine-learning framework for predicting multiple air pollutants' concentrations via multi-target regression and feature selection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T11%3A54%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20machine-learning%20framework%20for%20predicting%20multiple%20air%20pollutants'%20concentrations%20via%20multi-target%20regression%20and%20feature%20selection&rft.jtitle=The%20Science%20of%20the%20total%20environment&rft.au=Masmoudi,%20Sahar&rft.date=2020-05-01&rft.volume=715&rft.spage=136991&rft.epage=136991&rft.pages=136991-136991&rft.artnum=136991&rft.issn=0048-9697&rft.eissn=1879-1026&rft_id=info:doi/10.1016/j.scitotenv.2020.136991&rft_dat=%3Cproquest_cross%3E2353584101%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2353584101&rft_id=info:pmid/32041079&rft_els_id=S0048969720305015&rfr_iscdi=true