Predicting microbial extracellular electron transfer activity in paddy soils with soil physicochemical properties using machine learning

Bibliographic details
Published in: Science China. Technological sciences 2024, Vol.67 (1), p.259-270
Main authors: Ou, JiaJun, Luo, XiaoShan, Liu, JunYang, Huang, LinYan, Zhou, LiHua, Yuan, Yong
Format: Article
Language: English
Online access: Full text
Description
Abstract: Soil extracellular electron transfer (EET) is a pivotal biological process in soil, yet predictive models for it are lacking. Here, a machine learning model was developed to predict soil EET, using soil physicochemical properties as independent input variables and EET capability, expressed as current density (j_max) and Coulombic charge (C_out), as dependent output variables. An autoencoder ensemble stacking (AES) model was built for this purpose, integrating support vector machine, multilayer perceptron, extreme gradient boosting, and light gradient boosting machine algorithms as the stacking algorithms. Under 10-fold cross-validation, the AES model achieved average test R² values of 0.83 for j_max and 0.84 for C_out, surpassing the single machine learning (ML) models and the basic ensemble model. Partial dependence plots (PDPs), SHapley Additive exPlanations (SHAP) values, and SHAP decision plots were used to quantify the impact and contribution of the input variables on the AES model’s predictions of j_max and C_out. According to the SHAP analysis of the AES model, total carbon (TC) was the descriptor most strongly associated with j_max, while total organic carbon (TOC) was the most relevant descriptor for C_out. Treating the prediction of j_max and C_out as a multitask ML problem allowed the AES model to benefit from information shared across the input variables, enhancing its overall generalizability. This study provides a feasible tool for predicting soil EET from soil physicochemical properties and advances understanding of the relationship between those properties and EET capability.
ISSN: 1674-7321, 1869-1900
DOI:10.1007/s11431-023-2537-y