Machine learning models with innovative outlier detection techniques for predicting heavy metal contamination in soils
Published in: | Journal of hazardous materials 2025-01, Vol.481, p.136536, Article 136536 |
---|---|
Main authors: | , , , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Machine learning (ML) models for predicting heavy metals (HMs) can produce inconsistent outputs owing to dataset outliers, which influence model reliability and accuracy. A comprehensive technique that combines machine learning and advanced statistical methods was applied to assess the effects of data outliers on ML models. Ten ML models with three outlier detection methods were used to predict Cr, Ni, Cd, and Pb in Narayanganj soils. XGBoost with density-based spatial clustering of applications with noise (DBSCAN) improved model efficacy (R²). The R² values for Cr, Ni, Cd, and Pb were considerably enhanced, by 11.11 %, 6.33 %, 14.47 %, and 5.68 %, respectively, indicating that outliers affected the model's HM prediction. Based on feature importance, soil factors affected Cr (80 %), Ni (72.61 %), Cd (53.35 %), and Pb (63.47 %) concentrations. Contamination factor prediction showed considerable contamination for Cr, Ni, and Cd. Local indicators of spatial association (LISA) revealed Cd (55.4 %), Cr (49.3 %), and Pb (47.3 %) as the significant pollutants (p |
---|---|
ISSN: | 0304-3894, 1873-3336 |
DOI: | 10.1016/j.jhazmat.2024.136536 |
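The abstract credits DBSCAN-based outlier removal with the improved R² values. The sketch below illustrates only the general idea of flagging DBSCAN noise points before model fitting. It is not the authors' pipeline: the function name `dbscan_noise_mask`, the `eps` and `min_samples` values, and the sample rows are all hypothetical, and the record does not state the settings used in the study. In practice the features would normally be standardized first, and the cleaned rows would then be passed to a regressor such as XGBoost.

```python
import math


def dbscan_noise_mask(points, eps, min_samples):
    """Return a list of booleans, True where DBSCAN would label a point as noise.

    Hypothetical helper for illustration; it computes only the noise labels,
    not the cluster assignments.
    """
    n = len(points)
    # Each neighbourhood includes the point itself, as in the usual definition.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_samples points within eps.
    is_core = [len(nb) >= min_samples for nb in neighbours]
    # Noise: not a core point and no core point within eps (so not a border point).
    return [
        not is_core[i] and not any(is_core[j] for j in neighbours[i])
        for i in range(n)
    ]


# Made-up values, NOT data from the study: (soil pH, Cr in mg/kg).
samples = [(6.1, 40.0), (6.2, 41.0), (6.0, 39.5), (6.3, 40.5), (6.1, 95.0)]

# eps and min_samples are assumed values chosen only for this toy example.
noise = dbscan_noise_mask(samples, eps=2.0, min_samples=3)
cleaned = [row for row, flagged in zip(samples, noise) if not flagged]
```

Here the first four rows lie within `eps` of one another and form a dense group, while the last row has no neighbours and is dropped from `cleaned`.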