Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy

Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life pro...

Detailed description

Bibliographic details
Published in: Journal of soils and sediments 2024-11, Vol.24 (11), p.3668-3683
Main authors: Qi, Chongchong; Zhou, Min; Chen, Qiusong; Hu, Tao
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 3683
container_issue 11
container_start_page 3668
container_title Journal of soils and sediments
container_volume 24
creator Qi, Chongchong
Zhou, Min
Chen, Qiusong
Hu, Tao
description Purpose: Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Materials and methods: Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions: The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions: This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations.
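The abstract names three generic building blocks: spectral smoothing, a derivative transform, and evaluation by area under the curve (AUC). The sketch below illustrates each in plain Python. It is not the authors' pipeline. The record does not say which smoothing filter, derivative order, or band spacing was used, so the centered moving average, the forward-difference first derivative, the 0.5 nm step, and all function names here are assumptions chosen for illustration only.

```python
from typing import List, Sequence


def moving_average(spectrum: Sequence[float], window: int = 3) -> List[float]:
    """Smooth a spectrum with a centered moving average (assumed filter).

    The window is truncated at the edges so the output keeps the input length.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(spectrum: Sequence[float], step_nm: float = 0.5) -> List[float]:
    """Approximate d(reflectance)/d(wavelength) with forward differences.

    step_nm is a hypothetical band spacing; the output is one element shorter.
    """
    return [(b - a) / step_nm for a, b in zip(spectrum, spectrum[1:])]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy reflectance values, not data from the study.
    raw = [0.30, 0.32, 0.31, 0.35, 0.40]
    features = first_derivative(moving_average(raw, window=3))
    print(len(features))  # one derivative value per adjacent band pair

    # Toy class labels (1 = high Mn, 0 = low Mn) and classifier scores.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In a real workflow the derivative features would feed a tree-based classifier such as the XGBoost, LightGBM, or random forest models named in the abstract, and the AUC would be computed on held-out samples.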
doi_str_mv 10.1007/s11368-024-03914-7
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_3154156947</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3154156947</sourcerecordid><originalsourceid>FETCH-LOGICAL-c233t-92901ec7fbaadc65adebd3a76719075e11cb196b6e48ea2bcbf411c007b2ccd23</originalsourceid><addsrcrecordid>eNp9kU1v3CAQhq2qlZqm_QM9IfXSCw0Y29jHKuqXlKiX9IwGPN4QsbBhvJH23h_e2WykSDn0xMA88wh4m-ajVl-0UvaCtDbDKFXbSWUm3Un7qjnTw7HoRvWa685MUmk1vm3eEd0pZSy3z5q_NxVReiCcxRbCbcwoEkLNMW_EtsyYSCylCsy3kANDCeoGJQVIKKjEJK6zCAmI4hIDrLFk4Q8i5hU3lbdseYgUfUKZWcuNpUJlD-0wrLVQKLvD--bNAonww9N63vz5_u3m8qe8-v3j1-XXKxlaY1Y5tZPSGOziAeYw9DCjnw3YwepJ2R61Dl5Pgx-wGxFaH_zS8Rn_j29DmFtz3nw-eXe13O-RVreNFDAlyFj25IzuO90PU2cZ_fQCvSv7mvl2TBk12X4cDFPtiQr8Eqq4uF2NW6gHp5U7BuNOwTgOxj0G445qcxoihvMG67P6P1P_ABN7k7k</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3130975863</pqid></control><display><type>article</type><title>Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy</title><source>Springer Nature - Complete Springer Journals</source><creator>Qi, Chongchong ; Zhou, Min ; Chen, Qiusong ; Hu, Tao</creator><creatorcontrib>Qi, Chongchong ; Zhou, Min ; Chen, Qiusong ; Hu, Tao</creatorcontrib><description>Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. 
Materials and methods Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations. 
Graphical abstract</description><identifier>ISSN: 1439-0108</identifier><identifier>EISSN: 1614-7480</identifier><identifier>DOI: 10.1007/s11368-024-03914-7</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Algorithms ; Band spectra ; Classification ; clay ; Clay minerals ; cost effectiveness ; Earth and Environmental Science ; Environment ; Environmental Physics ; Heavy metals ; Infrared spectroscopy ; Learning algorithms ; Machine learning ; Manganese ; Near infrared radiation ; Nondestructive testing ; Organic compounds ; Organic soils ; prediction ; Predictions ; Preprocessing ; Sec 5 • Soil and Landscape Ecology • Research Article ; Soil ; Soil classification ; Soil investigations ; Soil pollution ; Soil Science &amp; Conservation ; Soils ; Spectral bands ; Spectrum analysis</subject><ispartof>Journal of soils and sediments, 2024-11, Vol.24 (11), p.3668-3683</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c233t-92901ec7fbaadc65adebd3a76719075e11cb196b6e48ea2bcbf411c007b2ccd23</cites><orcidid>0000-0001-5189-1614</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11368-024-03914-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11368-024-03914-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,778,782,27907,27908,41471,42540,51302</link.rule.ids></links><search><creatorcontrib>Qi, Chongchong</creatorcontrib><creatorcontrib>Zhou, Min</creatorcontrib><creatorcontrib>Chen, Qiusong</creatorcontrib><creatorcontrib>Hu, Tao</creatorcontrib><title>Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy</title><title>Journal of soils and sediments</title><addtitle>J Soils Sediments</addtitle><description>Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. 
This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Materials and methods Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations. 
Graphical abstract</description><subject>Algorithms</subject><subject>Band spectra</subject><subject>Classification</subject><subject>clay</subject><subject>Clay minerals</subject><subject>cost effectiveness</subject><subject>Earth and Environmental Science</subject><subject>Environment</subject><subject>Environmental Physics</subject><subject>Heavy metals</subject><subject>Infrared spectroscopy</subject><subject>Learning algorithms</subject><subject>Machine learning</subject><subject>Manganese</subject><subject>Near infrared radiation</subject><subject>Nondestructive testing</subject><subject>Organic compounds</subject><subject>Organic soils</subject><subject>prediction</subject><subject>Predictions</subject><subject>Preprocessing</subject><subject>Sec 5 • Soil and Landscape Ecology • Research Article</subject><subject>Soil</subject><subject>Soil classification</subject><subject>Soil investigations</subject><subject>Soil pollution</subject><subject>Soil Science &amp; Conservation</subject><subject>Soils</subject><subject>Spectral bands</subject><subject>Spectrum analysis</subject><issn>1439-0108</issn><issn>1614-7480</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kU1v3CAQhq2qlZqm_QM9IfXSCw0Y29jHKuqXlKiX9IwGPN4QsbBhvJH23h_e2WykSDn0xMA88wh4m-ajVl-0UvaCtDbDKFXbSWUm3Un7qjnTw7HoRvWa685MUmk1vm3eEd0pZSy3z5q_NxVReiCcxRbCbcwoEkLNMW_EtsyYSCylCsy3kANDCeoGJQVIKKjEJK6zCAmI4hIDrLFk4Q8i5hU3lbdseYgUfUKZWcuNpUJlD-0wrLVQKLvD--bNAonww9N63vz5_u3m8qe8-v3j1-XXKxlaY1Y5tZPSGOziAeYw9DCjnw3YwepJ2R61Dl5Pgx-wGxFaH_zS8Rn_j29DmFtz3nw-eXe13O-RVreNFDAlyFj25IzuO90PU2cZ_fQCvSv7mvl2TBk12X4cDFPtiQr8Eqq4uF2NW6gHp5U7BuNOwTgOxj0G445qcxoihvMG67P6P1P_ABN7k7k</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>Qi, Chongchong</creator><creator>Zhou, Min</creator><creator>Chen, Qiusong</creator><creator>Hu, Tao</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7ST</scope><scope>7UA</scope><scope>C1K</scope><scope>F1W</scope><scope>H96</scope><scope>H97</scope><scope>L.G</scope><scope>SOI</scope><scope>7S9</scope><scope>L.6</scope><orcidid>https://orcid.org/0000-0001-5189-1614</orcidid></search><sort><creationdate>20241101</creationdate><title>Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy</title><author>Qi, Chongchong ; Zhou, Min ; Chen, Qiusong ; Hu, Tao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c233t-92901ec7fbaadc65adebd3a76719075e11cb196b6e48ea2bcbf411c007b2ccd23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Band spectra</topic><topic>Classification</topic><topic>clay</topic><topic>Clay minerals</topic><topic>cost effectiveness</topic><topic>Earth and Environmental Science</topic><topic>Environment</topic><topic>Environmental Physics</topic><topic>Heavy metals</topic><topic>Infrared spectroscopy</topic><topic>Learning algorithms</topic><topic>Machine learning</topic><topic>Manganese</topic><topic>Near infrared radiation</topic><topic>Nondestructive testing</topic><topic>Organic compounds</topic><topic>Organic soils</topic><topic>prediction</topic><topic>Predictions</topic><topic>Preprocessing</topic><topic>Sec 5 • Soil and Landscape Ecology • Research Article</topic><topic>Soil</topic><topic>Soil classification</topic><topic>Soil investigations</topic><topic>Soil pollution</topic><topic>Soil Science &amp; Conservation</topic><topic>Soils</topic><topic>Spectral bands</topic><topic>Spectrum analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Qi, Chongchong</creatorcontrib><creatorcontrib>Zhou, Min</creatorcontrib><creatorcontrib>Chen, 
Qiusong</creatorcontrib><creatorcontrib>Hu, Tao</creatorcontrib><collection>CrossRef</collection><collection>Environment Abstracts</collection><collection>Water Resources Abstracts</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy &amp; Non-Living Resources</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 3: Aquatic Pollution &amp; Environmental Quality</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) Professional</collection><collection>Environment Abstracts</collection><collection>AGRICOLA</collection><collection>AGRICOLA - Academic</collection><jtitle>Journal of soils and sediments</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Qi, Chongchong</au><au>Zhou, Min</au><au>Chen, Qiusong</au><au>Hu, Tao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy</atitle><jtitle>Journal of soils and sediments</jtitle><stitle>J Soils Sediments</stitle><date>2024-11-01</date><risdate>2024</risdate><volume>24</volume><issue>11</issue><spage>3668</spage><epage>3683</epage><pages>3668-3683</pages><issn>1439-0108</issn><eissn>1614-7480</eissn><abstract>Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. 
This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management. Materials and methods Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model. Results and discussions The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds. Conclusions This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations. Graphical abstract</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s11368-024-03914-7</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0001-5189-1614</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1439-0108
ispartof Journal of soils and sediments, 2024-11, Vol.24 (11), p.3668-3683
issn 1439-0108
1614-7480
language eng
recordid cdi_proquest_miscellaneous_3154156947
source Springer Nature - Complete Springer Journals
subjects Algorithms
Band spectra
Classification
clay
Clay minerals
cost effectiveness
Earth and Environmental Science
Environment
Environmental Physics
Heavy metals
Infrared spectroscopy
Learning algorithms
Machine learning
Manganese
Near infrared radiation
Nondestructive testing
Organic compounds
Organic soils
prediction
Predictions
Preprocessing
Sec 5 • Soil and Landscape Ecology • Research Article
Soil
Soil classification
Soil investigations
Soil pollution
Soil Science & Conservation
Soils
Spectral bands
Spectrum analysis
title Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T12%3A05%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Tree-based%20machine%20learning%20models%20for%20enhanced%20large-scale%20soil%20Mn%20classification%20by%20integrating%20visible-near%20infrared%20spectroscopy&rft.jtitle=Journal%20of%20soils%20and%20sediments&rft.au=Qi,%20Chongchong&rft.date=2024-11-01&rft.volume=24&rft.issue=11&rft.spage=3668&rft.epage=3683&rft.pages=3668-3683&rft.issn=1439-0108&rft.eissn=1614-7480&rft_id=info:doi/10.1007/s11368-024-03914-7&rft_dat=%3Cproquest_cross%3E3154156947%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3130975863&rft_id=info:pmid/&rfr_iscdi=true