Machine learning models with innovative outlier detection techniques for predicting heavy metal contamination in soils

Bibliographic details
Published in: Journal of hazardous materials 2025-01, Vol. 481, p. 136536, Article 136536
Main authors: Proshad, Ram; Asharaful Abedin Asha, S.M.; Tan, Rong; Lu, Yineng; Abedin, Md Anwarul; Ding, Zihao; Zhang, Shuangting; Li, Ziyi; Chen, Geng; Zhao, Zhuanjun
Format: Article
Language: English
Online access: Full text
Description
Summary: Machine learning (ML) models for accurately predicting heavy metals with inconsistent outputs have improved owing to dataset outliers, which influence model reliability and accuracy. A comprehensive technique that combines machine learning and advanced statistical methods was applied to assess the effects of data outliers on ML models. Ten ML models with three outlier detection methods predicted Cr, Ni, Cd, and Pb in Narayanganj soils. XGBoost with density-based spatial clustering of applications with noise (DBSCAN) improved model efficacy (R²). The R² of Cr, Ni, Cd, and Pb was considerably enhanced by 11.11 %, 6.33 %, 14.47 %, and 5.68 %, respectively, indicating that outliers affected the model's heavy metal (HM) prediction. Soil factors affected Cr (80 %), Ni (72.61 %), Cd (53.35 %), and Pb (63.47 %) concentrations based on feature importance. Contamination factor prediction showed considerable contamination for Cr, Ni, and Cd. LISA revealed Cd (55.4 %), Cr (49.3 %), and Pb (47.3 %) as the significant pollutants (p 
ISSN:0304-3894
1873-3336
DOI:10.1016/j.jhazmat.2024.136536
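
Note on the contamination factor: the summary reports "considerable contamination" but this record does not give the formula. The following is a minimal sketch, assuming the widely used definition (measured concentration divided by a background concentration) and the usual four-class scale (below 1 low, 1 to below 3 moderate, 3 to below 6 considerable, 6 and above very high). The record does not confirm that the study uses exactly this scheme. The function names and the sample values are hypothetical and are not data from the article.

```python
def contamination_factor(measured, background):
    """Return CF = measured / background (both in the same units, e.g. mg/kg)."""
    if background <= 0:
        raise ValueError("background concentration must be positive")
    return measured / background


def contamination_class(cf):
    """Map a contamination factor to the assumed four-class scale."""
    if cf < 1:
        return "low"
    if cf < 3:
        return "moderate"
    if cf < 6:
        return "considerable"
    return "very high"


# Hypothetical illustration values in mg/kg, not taken from the study.
cf = contamination_factor(measured=120.0, background=30.0)
print(cf, contamination_class(cf))
```

Under these assumptions, a metal reported as showing considerable contamination would have a predicted factor of at least 3 and less than 6.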