Prediction and analysis of net ecosystem carbon exchange based on gradient boosting regression and random forest

Bibliographic details
Published in: Applied Energy, 2020-03, Vol. 262, p. 114566, Article 114566
Main authors: Cai, Jianchao; Xu, Kai; Zhu, Yanhui; Hu, Fang; Li, Liuhuan
Format: Article
Language: English
Description
Summary:
• Gradient boosting regression and random forest are combined to analyse net ecosystem carbon exchange.
• The model considers 22 environmental variables, more than comparable studies.
• Extrema are used to analyse the importance of the corresponding variables.

Carbon balance is essential to keeping ecosystems sustainable and healthy. Net ecosystem carbon exchange (NEE), which is affected to varying degrees by many meteorological variables, helps to gauge the balance of the carbon cycle between living organisms and the atmosphere. In this study, NEE data were collected from two flux measurement sites. A gradient boosting regression algorithm was used to predict NEE from the meteorological and flux data of site UK-Gri. During training, k-fold cross-validation was applied to avoid overfitting, and a random forest algorithm was used to identify the variables that most influence NEE. The four most important variables were found to be global radiation, photosynthetically active radiation, minimum soil temperature, and latent heat. To verify its performance, the regression model was compared with three state-of-the-art prediction models: support vector machine, stochastic gradient descent, and Bayesian ridge. The experimental results show that the regression model outperforms the other three, giving a higher R-squared and lower mean absolute error and root mean squared error. To verify the model's generalization ability, data from the second flux site, NL-Loo, were used, together with hybrid data combining the two sites. The results show that the model also performs well on the hybrid data. In practical terms, the gradient boosting regression model offers many tunable hyperparameters and loss functions, which make it more flexible and accurate than the other three models.
This study has conclusively demonstrated, for the first time, that the combination of gradient boosting regression and random forest models is a valuable tool for predicting NEE effectively and for reliably identifying the variables that most influence it. The methodologies could be useful in research on ecosystem stability evaluation, environmental restoration, trend analysis of climate change, and global warming monitoring.
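The workflow described in the summary (gradient boosting regression evaluated with k-fold cross-validation, a comparison against three baseline regressors on R-squared, MAE and RMSE, and a random forest ranking of variable importance) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn, uses synthetic data from `make_regression` in place of the UK-Gri and NL-Loo measurements, and leaves every hyperparameter at a default or an arbitrary value. Because the synthetic target is linear, the model ranking it produces says nothing about the results reported in the article.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import BayesianRidge, SGDRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Assumption: synthetic stand-in for the flux data. Only the feature count (22)
# comes from the summary; the values are NOT real measurements.
X, y = make_regression(n_samples=300, n_features=22, n_informative=4,
                       noise=10.0, random_state=0)

# The four model families named in the summary, with default settings.
# SVR and SGD are scale-sensitive, so they are wrapped with a scaler.
models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "svm": make_pipeline(StandardScaler(), SVR()),
    "sgd": make_pipeline(StandardScaler(), SGDRegressor(random_state=0)),
    "bayesian_ridge": BayesianRidge(),
}

# 5-fold cross-validation; the number of folds is an arbitrary choice here.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, model in models.items():
    # Out-of-fold predictions: each sample is predicted by a model
    # that never saw it during training.
    pred = cross_val_predict(model, X, y, cv=cv)
    scores[name] = {
        "r2": r2_score(y, pred),
        "mae": mean_absolute_error(y, pred),
        "rmse": float(np.sqrt(mean_squared_error(y, pred))),
    }

# Random forest fitted on all samples, used only to rank variable importance.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
top4 = np.argsort(forest.feature_importances_)[::-1][:4]

for name, s in scores.items():
    print(f"{name}: R2={s['r2']:.3f} MAE={s['mae']:.2f} RMSE={s['rmse']:.2f}")
print("Indices of the four most important features:", top4.tolist())
```

With real flux data, `X` would hold the 22 environmental variables and `y` the measured NEE; the column indices in `top4` would then map back to variable names.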
ISSN:0306-2619
1872-9118
DOI:10.1016/j.apenergy.2020.114566