Data from: Evaluation of parametric and nonparametric machine-learning techniques for prediction of saturated and near-saturated hydraulic conductivity
Main authors: | , , |
---|---|
Format: | Dataset |
Language: | eng |
Online access: | Order full text |
Summary: | Parametric and nonparametric supervised machine learning techniques were
used to estimate saturated and near-saturated hydraulic conductivities
(Ks, K10) from easily measurable soil properties, including the name of the
pedological horizon (HOR), soil texture (sand, silt, and clay),
organic matter (OM), bulk density (BD), and water contents (θpF1, θpF2,
θpF3, and θpF4.2) measured at four different matric heads (-10, -100,
-1000, and -15848 cm). Using a stepwise linear model (SWLM) and the Lasso
regression as parametric methods, with 316 data points for training and 135
for testing, four pedotransfer functions (PTFs) were obtained in which, for
both methods, water contents play a more important role than the other
variables. SWLM performed better than Lasso in the testing phase
for log(Ks) and log(K10) prediction, with RMSE of 0.666 and 0.551 cm d⁻¹
and R² of 0.26 and 0.65, respectively. Nonparametric supervised machine learning methods
trained and tested with a similar data set significantly improved the
accuracy of Ks prediction, with R² of 0.52, 0.36, and 0.53 for Gaussian
process regression (GPR), support vector machine (SVM), and ensemble (ENS)
methods, respectively, in the testing stage. These methods also explained 74.9, 66.7, and
72.5% of the variation of log(K10), respectively. Bootstrapping validated the
strong performance of the nonparametric techniques. The feature selection
capability of GPR indicated that, instead of a model with all
predictors, HOR, silt, θpF1, and θpF3 are sufficient for predicting
log(Ks), while for log(K10), HOR, silt, and OM predict as accurately as the
comprehensive model with all variables. |
---|---|
DOI: | 10.5061/dryad.ph0b6k8 |
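
The summary reports model skill as RMSE and R² on log-transformed conductivities. As a rough illustration of how such scores can be computed, here is a minimal Python sketch. It rests on several assumptions that the record does not confirm: the logarithm is taken as base 10, R² is taken as the coefficient of determination (1 - SS_res / SS_tot) rather than a squared correlation, and the conductivity values are invented placeholders, not values from the dataset.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical Ks values in cm/d, invented for illustration only;
# they are NOT taken from the dataset described in this record.
ks_observed = [1.0, 10.0, 100.0, 1000.0]
ks_predicted = [1.0, 100.0, 100.0, 1000.0]

# Scores are computed on log-transformed values (base 10 is an assumption).
log_obs = [math.log10(v) for v in ks_observed]
log_pred = [math.log10(v) for v in ks_predicted]

print("RMSE of log10(Ks):", rmse(log_obs, log_pred))
print("R2 of log10(Ks):", r_squared(log_obs, log_pred))
```

With real data, `ks_observed` and `ks_predicted` would hold the measured and model-predicted conductivities of the held-out testing samples.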