A multi-resolution ensemble model of three decision-tree-based algorithms to predict daily NO₂ concentration in France 2005-2022
Published in: | Environmental research 2024-05, p.119241 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | Understanding and managing the health effects of nitrogen dioxide (NO₂) requires high-resolution spatiotemporal exposure maps. Here, we developed a multi-stage, multi-resolution ensemble model that predicts daily NO₂ concentration across continental France from 2005 to 2022. Innovations of this work include the computation of daily predictions at a 200 m resolution in large urban areas and the use of a spatiotemporal blocking procedure to avoid data leakage and ensure fair performance estimation. Predictions were obtained after three cascading stages of modeling: (1) predicting NO₂ total column density from the Ozone Monitoring Instrument satellite; (2) predicting daily NO₂ concentrations at a 1 km spatial resolution using a large set of potential predictors such as predictions obtained from stage 1, land-cover and road traffic data; and (3) predicting residuals from stage 2 models at a 200 m resolution in large urban areas. The latter two stages used a generalized additive model to ensemble predictions of three decision-tree algorithms (random forest, extreme gradient boosting and categorical boosting). Cross-validated performances of our ensemble models were overall very good, with a ten-fold cross-validated R² of 0.83 for the 1 km model and of 0.69 for the 200 m model. All three basis learners participated in the ensemble predictions to various degrees depending on time and space. In sum, our multi-stage approach was able to predict daily NO₂ concentrations with a relatively low error. Ensembling the predictions maximizes the chance of obtaining accurate values if one basis learner fails in a specific area or at a particular time, by relying on the other learners. To the best of our knowledge, this is the first study aiming to predict NO₂ concentrations in France with such a high spatiotemporal resolution, large spatial extent, and long temporal coverage. Exposure estimates are available to investigate NO₂ health effects in epidemiological studies. |
---|---|
ISSN: | 1096-0953 |
DOI: | 10.1016/j.envres.2024.119241 |
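
The abstract mentions a spatiotemporal blocking procedure used to avoid data leakage during cross-validation, but this record does not describe how it works. The sketch below shows one common way such blocking can be done: observations are grouped into space-time blocks, and each block is assigned whole to a single fold, so that nearby measurements never sit on both sides of a train/test split. The function names, the 50 km and 30 day block sizes, and the round-robin assignment are illustrative assumptions, not the authors' procedure.

```python
from collections import defaultdict


def block_id(x_km, y_km, day, block_km=50.0, block_days=30):
    """Return the (column, row, time window) block containing one observation.

    Block sizes are hypothetical defaults chosen for illustration only.
    """
    return (int(x_km // block_km), int(y_km // block_km), day // block_days)


def assign_folds(observations, n_folds=10, block_km=50.0, block_days=30):
    """Map each observation index to a fold so that no block spans two folds.

    `observations` is a sequence of (x_km, y_km, day) tuples, where `day`
    is an integer day index.
    """
    blocks = defaultdict(list)
    for idx, (x_km, y_km, day) in enumerate(observations):
        blocks[block_id(x_km, y_km, day, block_km, block_days)].append(idx)

    folds = {}
    # Sorting the block keys makes the assignment deterministic;
    # blocks are then dealt to folds in round-robin order.
    for k, key in enumerate(sorted(blocks)):
        for idx in blocks[key]:
            folds[idx] = k % n_folds
    return folds


if __name__ == "__main__":
    obs = [(1.0, 2.0, 0), (10.0, 20.0, 5), (120.0, 3.0, 0), (1.0, 2.0, 45)]
    print(assign_folds(obs, n_folds=2))
```

In this toy example the first two observations fall in the same 50 km, 30 day block and therefore always share a fold, whereas a plain random split could separate them and inflate the cross-validated R².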