Comparative analysis of machine learning models for predicting PM2.5 concentrations using meteorological and chemical indicators

Bibliographic details
Published in:	Journal of atmospheric and solar-terrestrial physics 2024-10, Vol.263, p.106338, Article 106338
Main authors:	Haseeb, Muhammad; Tahir, Zainab; Mahmood, Syed Amer; Arif, Hania; Almutairi, Khalid F.; Soufan, Walid; Tariq, Aqil
Format:	Article
Language:	English
Subjects:	Air quality; ANN; GPR; PM concentration prediction; SVM
Online access:	Full text

Description
Abstract:	Air pollution significantly impacts human health, causing numerous premature deaths, particularly with the rise in PM2.5 concentrations. Therefore, comparing different machine learning (ML) models for predicting PM2.5 concentration is crucial. This research focuses on six ML models: Linear Regression (LR), Regression Tree (RT), Support Vector Machine (SVM), Ensemble Regression (ERT), Gaussian Process Regression (GPR), and Artificial Neural Networks (ANN). Trained on six years of data (July 2015–December 2021) with optimized hyperparameters, the models consider eight meteorological and chemical indicators as PM2.5 predictors, including temperature, relative humidity, air pressure, O3, SO2, NO2, dew point, and wind speed. Model efficiency is assessed using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Correlation Coefficient (R), and Coefficient of Determination (R2) values. The models achieve R2 and RMSE values as follows: LR (0.72, 13.52), RT (0.8, 12.156), SVM (0.82, 10.28), ERT (0.81, 11.87), GPR (0.94, 7.65), and ANN (0.99, 2.36). These metrics indicate the superior performance of ANN, with its R2 value approaching 1 and the lowest RMSE compared to other models. The results highlight the effectiveness of ANN, particularly the model with three hidden layers, in predicting PM2.5 concentration. Utilizing ML models for this purpose is crucial for understanding and mitigating the impacts on human health and the environment, with ANN emerging as a promising tool for various investigations.
• This research focuses on six ML models: LR, RT, SVM, ERT, GPR, and ANN.
• Trained on six years of data with optimized hyperparameters.
• Considered eight meteorological and chemical indicators as PM2.5 predictors.
• The results of ANN with three hidden layers in predicting PM2.5 concentration.
ISSN:	1364-6826
DOI:	10.1016/j.jastp.2024.106338
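
The abstract evaluates the models with MSE, RMSE, R, and R2. As an illustration only, the sketch below shows how these four metrics are conventionally computed and how the reported (R2, RMSE) pairs can be ranked. It is not the authors' code. The function names `regression_metrics` and `rank_by_rmse` and the toy observed/predicted values are hypothetical. The `REPORTED` table is copied from the abstract, and R2 is assumed to use the common definition 1 - SS_res/SS_tot, which the record does not confirm.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, Pearson R, and R2 for paired observations and predictions."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if ss_tot == 0 or var_p == 0:
        raise ValueError("R and R2 are undefined for constant sequences")
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R": cov / math.sqrt(ss_tot * var_p),
        "R2": 1.0 - ss_res / ss_tot,  # assumed definition, see lead-in
    }


# (R2, RMSE) pairs exactly as reported in the abstract.
REPORTED = {
    "LR": (0.72, 13.52),
    "RT": (0.8, 12.156),
    "SVM": (0.82, 10.28),
    "ERT": (0.81, 11.87),
    "GPR": (0.94, 7.65),
    "ANN": (0.99, 2.36),
}


def rank_by_rmse(results):
    """Return model names ordered from lowest to highest RMSE."""
    return sorted(results, key=lambda name: results[name][1])


if __name__ == "__main__":
    # Made-up toy values, not data from the study.
    observed = [35.0, 50.0, 42.0, 61.0, 28.0]
    predicted = [33.0, 52.0, 45.0, 58.0, 30.0]
    print(regression_metrics(observed, predicted))
    print(rank_by_rmse(REPORTED))
```

Ranking the reported values by RMSE places ANN first and LR last, consistent with the abstract's conclusion. Note that ordering by R2 would swap SVM and ERT relative to RT only marginally, since their reported R2 values (0.82, 0.81, 0.8) are very close.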