Predicting crop root concentration factors of organic contaminants with machine learning models

Bibliographic details
Published in: Journal of hazardous materials 2022-02, Vol.424 (Pt B), p.127437-127437, Article 127437
Main authors: Gao, Feng; Shen, Yike; Brett Sallach, J.; Li, Hui; Zhang, Wei; Li, Yuanbo; Liu, Cun
Format: Article
Language: English
Description
Abstract: Accurate prediction of uptake and accumulation of organic contaminants by crops from soils is essential to assessing human exposure via the food chain. However, traditional empirical or mechanistic models frequently show variable performance due to complex interactions among contaminants, soils, and plants. Thus, in this study different machine learning algorithms were compared and applied to predict root concentration factors (RCFs) based on a dataset comprising 57 chemicals and 11 crops, followed by comparison with a traditional linear regression model as the benchmark. The RCF patterns and predictions were investigated by unsupervised t-distributed stochastic neighbor embedding and four supervised machine learning models, namely Random Forest, Gradient Boosting Regression Tree, Fully Connected Neural Network, and Support Vector Regression, based on 15 property descriptors. The Fully Connected Neural Network demonstrated superior prediction performance for RCFs (R² = 0.79, mean absolute error [MAE] = 0.22) over the other machine learning models (R² = 0.68–0.76, MAE = 0.23–0.26). All four machine learning models performed better than the traditional linear regression model (R² = 0.62, MAE = 0.29). Four key property descriptors were identified in predicting RCFs. Specifically, increasing root lipid content and decreasing soil organic matter content increased RCFs, while increasing excess molar refractivity and molecular volume of contaminants decreased RCFs. These results show that machine learning models can improve prediction accuracy by learning nonlinear relationships between RCFs and properties of contaminants, soils, and plants.
Highlights:
• FCNN model achieved the best prediction performance for RCFs.
• Machine learning models performed better than traditional linear regression model.
• Machine learning can identify important property descriptors for predicting RCFs.
• Machine learning can learn complex relationships in contaminant-soil-plant systems.
ISSN: 0304-3894; 1873-3336
DOI: 10.1016/j.jhazmat.2021.127437
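
Note on the reported metrics: the abstract ranks the models by R² and mean absolute error (MAE). The sketch below shows how these two metrics are conventionally computed from observed and predicted values. It is only an illustration under stated assumptions: the function names and the sample numbers are made up for this note and are not taken from the study, whose exact evaluation procedure is not described in this record.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration only; these are NOT data from the study.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 1.5, 3.0]

r2 = r_squared(observed, predicted)
mae = mean_absolute_error(observed, predicted)
print(f"R2 = {r2:.2f}, MAE = {mae:.2f}")  # prints "R2 = 0.90, MAE = 0.25"
```

Read this way, a higher R² together with a lower MAE, as reported for the Fully Connected Neural Network, means the predictions explain more of the variance in the observed values and deviate less from them on average.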