Predicting PM2.5 levels and exceedance days using machine learning methods
Published in: | Atmospheric Environment (1994), 2024-04, Vol. 323, p. 120396, Article 120396 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Abstract: | Machine learning methods are increasingly used in air quality research to investigate the relationship between air pollutant levels, emissions, and meteorological changes over time. This research supports both scientific investigation and policy assessment and development. However, studies comparing the performance of different machine learning methods are lacking. To address this gap, this paper employed various machine learning techniques, including decision tree, random forest (RF), support vector machine (SVM), support vector regression (SVR), k-nearest neighbor, neural network, and Gaussian process regression, to predict daily average PM2.5 levels and the number of days with PM2.5 exceedance in the South Coast Air Basin of California from 2000 to 2019. The models were trained using meteorological factors, estimated emissions, and large-scale climate indices as inputs. The SVR model demonstrated the highest predictive accuracy for PM2.5 levels, and the SVM model gave the most accurate results for predicting the number of days with PM2.5 exceedances. Conversely, the decision tree model was the least accurate. The results also showed that emissions have a greater impact than meteorological factors on PM2.5 levels over time, though meteorology is responsible for daily variability. The most important meteorological factors were surface relative humidity and relative humidity at 850 mbar, which are related to partitioning, cloud cover, and wet deposition. We conducted sensitivity tests on model responses to emissions and meteorological factors. The PM2.5 predicted by RF and SVR showed large correlations with emissions in the early period (2000–2010). However, the changes were minimal in more recent years (2011–2019), implying a bias in the machine learning models, which consistently predict minimum PM2.5 levels at a baseline.
• SVR is more accurate for daily PM2.5, particularly for predicting PM2.5 exceedances.
• Surface RH was the most important meteorological factor for PM2.5 prediction.
• The impact of emissions on PM2.5 was significant before 2010 but reduced thereafter.
• ML models predict the past better than the future; all the ML models are limited at extremes. |
---|---|
ISSN: | 1352-2310, 1873-2844 |
DOI: | 10.1016/j.atmosenv.2024.120396 |
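
The workflow summarized in the abstract (train a regressor on meteorological and emissions inputs, predict daily PM2.5, then count exceedance days) can be illustrated with a minimal sketch. This is not the authors' code. It uses a hand-written k-nearest-neighbor regressor, one of the method families named in the abstract, on synthetic data. The two input features, the data generator, all function names, and the 35 µg/m³ exceedance threshold are assumptions made for illustration; the record does not state which threshold or feature set the paper uses.

```python
import math
import random

# Assumed exceedance threshold in ug/m3 (illustrative; not stated in the record).
EXCEEDANCE_THRESHOLD = 35.0


def make_synthetic_days(n_days, seed=0):
    """Generate hypothetical (features, pm25) pairs; these are not real observations."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        rh = rng.uniform(20.0, 90.0)        # surface relative humidity, percent
        emissions = rng.uniform(0.5, 2.0)   # made-up relative emissions index
        pm25 = 5.0 + 12.0 * emissions + 0.15 * rh + rng.gauss(0.0, 2.0)
        # Rescale humidity so both features have a comparable range.
        days.append(((rh / 100.0, emissions), pm25))
    return days


def knn_predict(train, x, k=5):
    """Average the PM2.5 of the k training days closest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def count_exceedances(values, threshold=EXCEEDANCE_THRESHOLD):
    """Count days whose PM2.5 value is strictly above the threshold."""
    return sum(1 for v in values if v > threshold)


data = make_synthetic_days(400)
train, test = data[:300], data[300:]   # simple hold-out split
observed = [y for _, y in test]
predicted = [knn_predict(train, x) for x, _ in test]

print(f"RMSE: {rmse(observed, predicted):.2f} ug/m3")
print(f"Observed exceedance days: {count_exceedances(observed)}")
print(f"Predicted exceedance days: {count_exceedances(predicted)}")
```

Because a nearest-neighbor average smooths toward typical values, the predicted exceedance count will often be lower than the observed one in a sketch like this, which loosely mirrors the highlight that the ML models are limited at extremes. Comparing several model families, as the paper does, would mean swapping `knn_predict` for other regressors and reporting the same two metrics for each.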