Predicting redox‐sensitive contaminant concentrations in groundwater using random forest classification


Bibliographic details
Published in: Water Resources Research, 2017-08, Vol. 53 (8), p. 7316-7331
Main authors: Tesoriero, Anthony J., Gronberg, Jo Ann, Juckem, Paul F., Miller, Matthew P., Austin, Brian P.
Format: Article
Language: English
Description
Abstract: Machine learning techniques were applied to a large (n > 10,000) compliance monitoring database to predict the occurrence of several redox‐active constituents in groundwater across a large watershed. Specifically, random forest classification was used to determine the probabilities of detecting elevated concentrations of nitrate, iron, and arsenic in the Fox, Wolf, Peshtigo, and surrounding watersheds in northeastern Wisconsin. Random forest classification is well suited to describe the nonlinear relationships observed among several explanatory variables and the predicted probabilities of elevated concentrations of nitrate, iron, and arsenic. Maps of the probability of elevated nitrate, iron, and arsenic can be used to assess groundwater vulnerability and the vulnerability of streams to contaminants derived from groundwater. Processes responsible for elevated concentrations are elucidated using partial dependence plots. For example, an increase in the probability of elevated iron and arsenic occurred when well depths coincided with the glacial/bedrock interface, suggesting a bedrock source for these constituents. Furthermore, groundwater in contact with Ordovician bedrock has a higher likelihood of elevated iron concentrations, which supports the hypothesis that groundwater liberates iron from a sulfide‐bearing secondary cement horizon of Ordovician age. Application of machine learning techniques to existing compliance monitoring data offers an opportunity to broadly assess aquifer and stream vulnerability at regional and national scales and to better understand geochemical processes responsible for observed conditions.
Key Points:
- Random forest classification is well suited for predicting elevated concentrations of nitrate, iron, and arsenic in groundwater
- Processes responsible for contaminant occurrence have been elucidated using partial dependence plots
- Application of random forest classification to existing compliance monitoring data can produce aquifer and stream vulnerability assessments
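
The workflow summarized above (a random forest classifier that outputs the probability of an elevated concentration, followed by a partial dependence curve for one predictor) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' code or data: the predictor names (`depth_rel_interface`, `ag_fraction`), the labeling rule, and all hyperparameters are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Hypothetical predictors (synthetic values, not from the study):
# well depth relative to the glacial/bedrock interface (m; 0 = at the interface)
depth_rel_interface = rng.uniform(-60.0, 60.0, n)
# fraction of agricultural land near the well (an assumed extra predictor)
ag_fraction = rng.uniform(0.0, 1.0, n)

# Synthetic labeling rule: "elevated" is more likely when the well bottom
# sits within 10 m of the interface. This only mimics the kind of nonlinear
# pattern the abstract describes; it is not a result from the paper.
p_true = np.where(np.abs(depth_rel_interface) < 10.0, 0.7, 0.1)
elevated = (rng.random(n) < p_true).astype(int)

X = np.column_stack([depth_rel_interface, ag_fraction])
X_train, X_test, y_train, y_test = train_test_split(
    X, elevated, test_size=0.25, random_state=0
)

# Random forest classifier; hyperparameters are illustrative choices.
model = RandomForestClassifier(
    n_estimators=200, min_samples_leaf=5, random_state=0
)
model.fit(X_train, y_train)

# Predicted probability of the "elevated" class for each held-out well.
prob_elevated = model.predict_proba(X_test)[:, 1]

# Partial dependence of the predicted probability on feature 0
# (depth relative to the interface), averaged over the other predictor.
pd_result = partial_dependence(
    model, X_train, features=[0], grid_resolution=20, kind="average"
)
pd_curve = pd_result["average"][0]
```

In a sketch like this, `prob_elevated` plays the role of the mapped probabilities, and plotting `pd_curve` against the depth grid would show whether the modeled probability rises near the interface, which is the kind of reading the abstract attributes to partial dependence plots.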
ISSN: 0043-1397, 1944-7973
DOI: 10.1002/2016WR020197