Random forest-based modeling of stream nutrients at national level in a data-scarce region

Bibliographic details
Published in: The Science of the total environment 2022-09, Vol.840, p.156613-156613, Article 156613
Main authors: Virro, Holger; Kmoch, Alexander; Vainu, Marko; Uuemaa, Evelyn
Format: Article
Language: English
Description
Abstract: Nutrient runoff from agricultural production is one of the main causes of water quality deterioration in river systems and coastal waters. Water quality modeling can provide insight into water quality issues and thereby support effective mitigation efforts. Process-based nutrient models are very complex, requiring many input parameters and computationally expensive calibration. Recently, machine learning (ML) approaches have been shown to achieve an accuracy comparable to that of process-based models and even to outperform them when describing nonlinear relationships. We used observations from 242 Estonian catchments, amounting to 469 yearly total nitrogen (TN) and 470 yearly total phosphorus (TP) measurements covering the period 2016–2020, to train random forest (RF) models for predicting annual N and P concentrations. We used a total of 82 predictor variables, including land cover, soil, climate and topography parameters, and applied a feature selection strategy to reduce the number of dependent features in the models. The SHAP method was used to identify the most relevant predictors. The performance of our models is comparable to that of previous process-based models used in the Baltic region, with the TN and TP models achieving R² scores of 0.83 and 0.52, respectively. However, because the input data used in our models are easier to obtain, the models offer superior applicability in areas where data availability is insufficient for process-based approaches. The models therefore enable robust estimation of nutrient losses at national level and capture the spatial variability of nutrient runoff, which in turn can support decision-making for regional water management plans.
• We created RF models for predicting annual TN and TP concentrations in Estonia.
• The models achieved a performance comparable to existing process-based approaches.
• Data used as predictors in RF models is easier to obtain than in process-based models.
• The models offer superior scalability and reusability to process-based ones.
ISSN: 0048-9697, 1879-1026
DOI: 10.1016/j.scitotenv.2022.156613
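
The abstract mentions two steps that are easy to illustrate in isolation: reducing the number of dependent predictors before training, and scoring a model with R². The record does not say which feature selection strategy the authors used, so the sketch below substitutes a simple greedy correlation filter as an assumption. The function names, the 0.9 threshold, the predictor names and the data are all hypothetical and synthetic, not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # treat a constant column as uncorrelated
    return cov / (sx * sy)


def drop_dependent_features(columns, threshold=0.9):
    """Greedily keep a feature only if |r| with every kept feature <= threshold.

    `columns` maps feature name -> list of values. The 0.9 threshold is an
    illustrative assumption, not a value from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic predictors for 50 hypothetical catchments (made-up values).
rng = random.Random(0)
arable = [rng.uniform(0, 100) for _ in range(50)]
columns = {
    "arable_pct": arable,
    # Near-duplicate of arable_pct, so the filter should drop it.
    "cropland_pct": [a * 0.98 + rng.uniform(-1, 1) for a in arable],
    "mean_slope": [rng.uniform(0, 10) for _ in range(50)],
}
selected = drop_dependent_features(columns)
print(selected)
```

In a full workflow the retained columns would then be passed to a random forest regressor, and `r2_score` would be evaluated on held-out catchments instead of the training data.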