Improving groundwater nitrate concentration prediction using local ensemble of machine learning models

Bibliographic details
Published in: Journal of environmental management, 2023-11, Vol. 345, p. 118782, Article 118782
Main authors: Mahboobi, Hojjatollah; Shakiba, Alireza; Mirbagheri, Babak
Format: Article
Language: eng
Description
Summary: Groundwater is one of the world's most important water resources and is increasingly exposed to contamination. Because nitrate is a common groundwater pollutant with negative effects on human health, predicting its concentration is particularly important. Ensemble machine learning (ML) algorithms have been widely employed for nitrate concentration prediction in groundwater. However, existing ensemble models often overlook spatial variation because they combine ML models with conventional methods such as averaging. The objective of this study is to enhance the spatial accuracy of groundwater nitrate concentration prediction by integrating the outputs of ML models using a local approach that accounts for spatial variation. Initially, three widely used ML models, random forest regression (RFR), k-nearest neighbor (KNN), and support vector regression (SVR), were employed to predict groundwater nitrate concentration in the Qom aquifer in Iran. Subsequently, the outputs of these models were integrated using geographically weighted regression (GWR) as a local model. The findings demonstrated that the ensemble of ML models using GWR achieved the highest performance (R² = 0.75 and RMSE = 9.38 mg/l) compared with an ensemble model using averaging (R² = 0.68 and RMSE = 10.56 mg/l), as well as the individual models RFR (R² = 0.70 and RMSE = 10.16 mg/l), SVR (R² = 0.59 and RMSE = 11.95 mg/l), and KNN (R² = 0.57 and RMSE = 12.19 mg/l). The resulting prediction map revealed that groundwater nitrate contamination is predominantly concentrated in urban areas located in the northwestern regions of the study area. The insights gained from this study have practical implications for managers, assisting them in preventing nitrate pollution in groundwater and formulating strategies to improve water quality.
• Groundwater nitrate concentration was estimated using heterogeneous ensemble method.
• Three machine learning models including RFR, KNN, and SVR were used for ensembling.
• Ensemble of machine learning models was implemented using averaging and GWR.
• Ensemble by GWR performed better than ensemble by averaging model.
• Using GWR for ensembling can improve the performance of spatial prediction.
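
The summary contrasts two ways of combining base-model outputs: plain averaging and a local combination by GWR. This record does not describe the authors' implementation, so the following is only a minimal sketch of the general idea. It assumes a Gaussian distance kernel with a fixed bandwidth, and it uses synthetic data in which two hypothetical base models are each accurate in only one half of the study area. The names `gwr_ensemble` and `rmse`, the bandwidth value, and all data are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def gwr_ensemble(coords_fit, preds_fit, y_fit, coords_new, preds_new, bandwidth):
    """Combine base-model predictions with a hand-rolled GWR (illustrative only).

    For each new location, a weighted least-squares regression of the observed
    values on the base-model predictions is fitted, with calibration samples
    weighted by a Gaussian kernel of their distance to that location.
    """
    # Design matrices with an intercept column.
    X_fit = np.column_stack([np.ones(len(preds_fit)), preds_fit])
    X_new = np.column_stack([np.ones(len(preds_new)), preds_new])
    out = np.empty(len(coords_new))
    for i, (loc, x_row) in enumerate(zip(coords_new, X_new)):
        dist = np.linalg.norm(coords_fit - loc, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)  # assumed Gaussian kernel
        # Weighted least squares: scale rows by sqrt(w), then solve with lstsq.
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X_fit * sw[:, None], y_fit * sw, rcond=None)
        out[i] = x_row @ beta  # local coefficients applied to the local inputs
    return out


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in for observed values and base-model outputs (not real data).
rng = np.random.default_rng(0)
n = 600
coords = rng.uniform(0.0, 1.0, size=(n, 2))
y = 40.0 + 20.0 * np.sin(3.0 * coords[:, 0]) + 10.0 * coords[:, 1] + rng.normal(0.0, 8.0, n)

# Hypothetical base models: A is accurate in the west, B in the east.
west = coords[:, 0] < 0.5
pred_a = y + rng.normal(0.0, 1.0, n) * np.where(west, 1.0, 10.0)
pred_b = y + rng.normal(0.0, 1.0, n) * np.where(west, 10.0, 1.0)
preds = np.column_stack([pred_a, pred_b])

# Calibrate the combiner on the first 400 points, evaluate on the remaining 200.
fit, new = slice(0, 400), slice(400, n)
avg_pred = preds[new].mean(axis=1)
gwr_pred = gwr_ensemble(coords[fit], preds[fit], y[fit],
                        coords[new], preds[new], bandwidth=0.15)

print("RMSE, averaging ensemble:", rmse(y[new], avg_pred))
print("RMSE, GWR ensemble:      ", rmse(y[new], gwr_pred))
```

In a real workflow the base-model predictions used to calibrate the combiner would normally come from held-out or cross-validated predictions, and the bandwidth would be selected from the data rather than fixed by hand. A dedicated GWR package could replace the hand-written loop.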
ISSN: 0301-4797, 1095-8630
DOI: 10.1016/j.jenvman.2023.118782