Parametric versus nonparametric machine learning modelling for conditional density estimation of natural events: Application to harmful algal blooms


Bibliographic details
Published in: Ecological Modelling, 2023-08, Vol. 482, p. 110415, Article 110415
Main authors: Ratté-Fortin, Claudie, Plante, Jean-François, Rousseau, Alain N., Chokmani, Karem
Format: Article
Language: English
Description
Summary:
• Climate or physiographic patterns often induce non-stationarity for extreme values.
• pML methods require subjective decisions, which could lead to biases.
• npML methods are compared to a commonly used pML method to estimate conditional density.
• npML should be prioritized if there is no a priori knowledge of the non-stationary form.
• Random forest for CDE is a promising tool when fitted with several covariates.

Besides the complex effect of global warming on extreme events, spatiotemporal variability of natural phenomena often carries the legacy of anthropogenic activities. Moreover, any feedback induced by these activities on climate brings additional complexity when modelling natural events. For extreme values, climate or physiographic patterns often induce non-stationarity, or long-term changes. In this context, parametric models may become inadequate given the complexity of the studied phenomena and their systematic changes through space and time. In this paper, we assess the use and resulting efficiency of nonparametric machine learning (npML) methods to estimate and predict extreme values associated with natural events. These npML methods are compared to a commonly used parametric machine learning (pML) approach, the nonstationary frequency analysis model. We use a historical database compiling the frequency of harmful algal blooms (HAB) in Québec, Canada. Results show that a 19-covariate RFCDE model leads to the best mean estimate among the considered models. However, for low and high quantiles, the 4-covariate RFCDE model provides better agreement between observed and simulated bloom frequencies. The models may be used to assess the effects of climate change and anthropogenic developments on the frequency of HAB. They may also be leveraged to measure the efficiency of mitigation scenarios and to identify priority areas for restoration plan strategies. Finally, recommendations are made regarding the estimation of the conditional density to predict extreme values associated with natural events.
ISSN: 0304-3800, 1872-7026
DOI: 10.1016/j.ecolmodel.2023.110415