Statistics for sample splitting for the calibration and validation of hydrological models

Bibliographic details
Published in: Stochastic environmental research and risk assessment 2018-11, Vol. 32 (11), p. 3099-3116
Main authors: Liu, Dedi; Guo, Shenglian; Wang, Zhaoli; Liu, Pan; Yu, Xixuan; Zhao, Qin; Zou, Hui
Format: Article
Language: English
Description
Abstract: Hydrological models have been widely applied in flood forecasting, water resource management and other environmental sciences. Most hydrological models calibrate and validate parameters with available records. However, the first step of hydrological simulation is always to quantitatively and objectively split samples for use in calibration and validation. In this paper, we propose a framework to address this issue by combining a hierarchical scheme for systematic testing of hydrological models, applied through a trial and error method, with hypothesis testing to check the statistical significance of goodness-of-fit indices. That is, the framework evaluates the performance of a hydrological model using sample splitting for calibration and validation, and assesses the statistical significance of the Nash–Sutcliffe efficiency index (E_f), which is commonly used to assess the performance of hydrological models. A sample splitting scheme is judged acceptable if the E_f values exceed the threshold of hypothesis testing. In line with the requirements of the hierarchical scheme for systematic testing of hydrological models, cross calibration and validation help to increase the reliability of the splitting scheme and reduce the effective range of sample sizes for both calibration and validation. The threshold of E_f is shown to depend on the significance level, the evaluation criteria (both regarded as the population), the distribution type, and the sample size. The performance rating of E_f is largely dependent on the evaluation criteria. Three distribution types, based on an approximately standard normal distribution, a chi-square distribution, and a bootstrap method, are used to investigate their effects on the thresholds at two commonly used significance levels. The highest threshold comes from the bootstrap method, the middle one from the approximately standard normal distribution, and the lowest from the chi-square distribution. The smaller the sample size, the higher the threshold values. Sample splitting was improved by providing more records. In addition, outliers with a large bias between the simulation and the observation can affect the sample values of E_f, and hence the output of the sample splitting scheme. Physical hydrological processes and the purpose of the model should be carefully considered when assessing outliers. The proposed framework in this paper cannot guarantee the bes
ISSN: 1436-3240, 1436-3259
DOI: 10.1007/s00477-018-1539-8
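
As a rough illustration of the idea described in the abstract, the following Python sketch computes the Nash–Sutcliffe efficiency E_f and uses a percentile bootstrap to judge whether the calibration and validation samples of a split clear an evaluation criterion. This record does not give the paper's test statistics, so the sketch should not be read as the authors' procedure: the function names, the criterion of 0.5, the one-sided percentile lower bound, and the independent resampling of (observed, simulated) pairs are all assumptions made for illustration.

```python
import random


def nash_sutcliffe(obs, sim):
    """Return E_f = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(obs) != len(sim) or len(obs) == 0:
        raise ValueError("obs and sim must be non-empty and of equal length")
    mean_obs = sum(obs) / len(obs)
    denom = sum((o - mean_obs) ** 2 for o in obs)
    if denom == 0.0:
        raise ValueError("observations are constant; E_f is undefined")
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / denom


def bootstrap_lower_bound(obs, sim, alpha=0.05, n_boot=1000, seed=0):
    """One-sided percentile-bootstrap lower bound on E_f.

    Assumption for illustration: (obs, sim) pairs are resampled
    independently, which ignores the serial correlation that is
    typical of streamflow records.
    """
    rng = random.Random(seed)
    n = len(obs)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        b_obs = [obs[i] for i in idx]
        b_sim = [sim[i] for i in idx]
        try:
            values.append(nash_sutcliffe(b_obs, b_sim))
        except ValueError:
            continue  # skip degenerate resamples with constant observations
    if not values:
        raise ValueError("no usable bootstrap resamples")
    values.sort()
    k = min(int(alpha * len(values)), len(values) - 1)  # empirical alpha-quantile
    return values[k]


def split_is_acceptable(periods, criterion=0.5, alpha=0.05):
    """Accept a split if every period's lower bound reaches the criterion.

    periods: iterable of (obs, sim) pairs, e.g. calibration and validation.
    criterion: hypothetical evaluation criterion, not taken from the paper.
    """
    return all(
        bootstrap_lower_bound(o, s, alpha=alpha) >= criterion
        for o, s in periods
    )


if __name__ == "__main__":
    # Synthetic data only: skewed "observed" flows plus noisy "simulations".
    rng = random.Random(42)
    obs = [rng.gammavariate(2.0, 5.0) for _ in range(120)]
    sim = [o + rng.gauss(0.0, 1.0) for o in obs]
    cal = (obs[:80], sim[:80])   # one candidate split: 80 records to calibrate
    val = (obs[80:], sim[80:])   # and the remaining 40 to validate
    print("E_f calibration:", nash_sutcliffe(*cal))
    print("E_f validation: ", nash_sutcliffe(*val))
    print("split acceptable:", split_is_acceptable([cal, val]))
```

In this sketch, a period with fewer records tends to produce a wider bootstrap distribution of E_f, so its lower bound drops and a higher sample E_f is needed to pass. That behaviour is at least consistent with the abstract's remark that smaller samples lead to higher thresholds, although the thresholds reported in the paper are derived differently.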