Predictive modeling and analysis of key drivers of groundwater nitrate pollution based on machine learning


Bibliographic details
Published in: Journal of Hydrology (Amsterdam), 2023-09, Vol. 624, p. 129934, Article 129934
Main authors: Deng, Yuandong; Ye, Xueyan; Du, Xinqiang
Format: Article
Language: English
Description
Summary:
• The correlations among groundwater chemical parameters were analyzed using the SOM-Spearman method.
• The model inputs were determined using the SOM-Spearman method.
• The PCR model was more interpretable, while the RBF ANN model was more accurate.
• Nitrification, denitrification, and DNRA were the internal factors driving the change in nitrate concentration.
• Human activities, rainfall, and land use were external factors driving the change in nitrate concentration.

Nitrate contamination of shallow groundwater in regions of intensive agriculture is a prevalent global environmental issue affecting food security, human health, and water ecology. Developing a prediction model for groundwater nitrate contamination is crucial for protecting groundwater resources. Machine learning models offer the potential to predict contamination, but studies often fail to adequately screen input features in complex non-linear environments before modeling. Additionally, there is a tendency to overlook an interpretable description of the relationship between major water chemistry parameters and nitrate concentration after modeling, which hinders scientific decision-making by water resource managers. In this study, a dataset was compiled from hydrochemical test results for 316 groundwater samples collected between 2011 and 2015 in intensive agricultural areas of Northeast China. Prior to modeling, the dataset was grouped based on land use type, vadose zone type, and thickness. Based on this grouping, self-organizing maps (SOM) and Spearman's coefficients were employed to identify the correlations between water chemical parameters and nitrate concentration (NO₃⁻-N), which provided a logical basis for selecting the key input variables.

A radial basis function artificial neural network (RBF ANN) prediction model and a principal components regression (PCR) model were constructed using the dataset, and a particle swarm optimization algorithm was applied to determine the optimal parameter combination for the RBF ANN. After modeling, the advantages and disadvantages of the RBF ANN and PCR models were examined and discussed, and the primary factors affecting nitrate concentrations in groundwater were analyzed using the PCR model. The results revealed that the RBF ANN model showed greater accuracy, while the PCR model offered better interpretability. Therefore, the integration of the two models is advantageous for nitrate predicti
ISSN: 0022-1694
EISSN: 1879-2707
DOI: 10.1016/j.jhydrol.2023.129934
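
The summary describes screening candidate model inputs by their Spearman correlation with nitrate concentration. The sketch below illustrates only that rank-correlation step, and it is not the authors' implementation: the SOM component is omitted, the function names (`average_ranks`, `spearman`, `screen_inputs`) and the 0.5 cutoff are assumptions made for illustration, and the sample values are invented rather than taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def screen_inputs(features, target, threshold=0.5):
    """Keep features whose |rho| with the target meets the (assumed) cutoff."""
    scores = {name: spearman(col, target) for name, col in features.items()}
    return {name: rho for name, rho in scores.items() if abs(rho) >= threshold}


# Invented example values, for illustration only (not data from the study).
nitrate = [1.2, 3.4, 2.2, 8.9, 5.1, 0.7]
features = {
    "chloride": [10.0, 25.0, 18.0, 60.0, 33.0, 9.0],  # rises with nitrate
    "ph": [7.1, 7.4, 6.9, 7.2, 7.0, 7.3],             # no clear monotone trend
}
selected = screen_inputs(features, nitrate)
```

With these invented values, `selected` keeps `chloride` and drops `ph`. A rank-based coefficient suits this kind of screening because it captures monotone but non-linear relationships; the cutoff itself is a modeling choice and would need to be justified for real data.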