Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy

Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life pro...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of soils and sediments 2024-11, Vol.24 (11), p.3668-3683
Hauptverfasser: Qi, Chongchong, Zhou, Min, Chen, Qiusong, Hu, Tao
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Materials and methods Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations. Graphical abstract
ISSN:1439-0108
1614-7480
DOI:10.1007/s11368-024-03914-7