Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy
Purpose Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life pro...
Published in: | Journal of soils and sediments 2024-11, Vol.24 (11), p.3668-3683 |
---|---|
Main authors: | Qi, Chongchong; Zhou, Min; Chen, Qiusong; Hu, Tao |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 3683 |
---|---|
container_issue | 11 |
container_start_page | 3668 |
container_title | Journal of soils and sediments |
container_volume | 24 |
creator | Qi, Chongchong; Zhou, Min; Chen, Qiusong; Hu, Tao |
description | Purpose
Given the growing concern over soil heavy metal contamination, there is an increasing need for affordable and precise soil heavy metal information. In particular, efficient and cost-effective methods for detecting soil manganese (Mn), a heavy metal element that is also essential for life processes, hold significant importance. This study employs tree-based machine learning (ML) algorithms with visible-near infrared (VNIR) spectroscopy to enable rapid, non-destructive soil Mn prediction at the continental scale, introducing a novel ML framework with significant implications for soil Mn management.
Materials and methods
Soil spectra were obtained using VNIR spectroscopy and preprocessed using a combination of smoothing and derivative techniques. Three tree-based ML models were constructed for soil Mn prediction: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and random forest (RF). The spectral bands sensitive to soil Mn were then investigated using the optimal tree-based model.
Results and discussions
The most appropriate preprocessing methods for different tree-based models varied. The XGBoost model performed best, with an area under the curve value of 0.918. The most important bands for the XGBoost model’s soil Mn classification were 1408–1410.5 nm, 2323.5–2325.5 nm, and 2144–2147.5 nm. The main mechanism for the prediction using these bands is the covariant effect with spectrally active substances such as clay minerals, water, and organic compounds.
Conclusions
This study demonstrates that the XGBoost model, when combined with appropriate preprocessing methods, is efficient for predicting soil Mn content. The sensitive spectral bands provide critical insights into the Mn-spectral feature correlations. |
doi_str_mv | 10.1007/s11368-024-03914-7 |
format | Article |
publisher | Berlin/Heidelberg: Springer Berlin Heidelberg |
rights | The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. |
orcidid | https://orcid.org/0000-0001-5189-1614 |
date | 2024-11-01 |
tpages | 16 |
fulltext | fulltext |
identifier | ISSN: 1439-0108 |
ispartof | Journal of soils and sediments, 2024-11, Vol.24 (11), p.3668-3683 |
issn | 1439-0108; 1614-7480 |
language | eng |
recordid | cdi_proquest_miscellaneous_3154156947 |
source | Springer Nature - Complete Springer Journals |
subjects | Algorithms; Band spectra; Classification; clay; Clay minerals; cost effectiveness; Earth and Environmental Science; Environment; Environmental Physics; Heavy metals; Infrared spectroscopy; Learning algorithms; Machine learning; Manganese; Near infrared radiation; Nondestructive testing; Organic compounds; Organic soils; prediction; Predictions; Preprocessing; Sec 5 • Soil and Landscape Ecology • Research Article; Soil; Soil classification; Soil investigations; Soil pollution; Soil Science & Conservation; Soils; Spectral bands; Spectrum analysis |
title | Tree-based machine learning models for enhanced large-scale soil Mn classification by integrating visible-near infrared spectroscopy |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T12%3A05%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Tree-based%20machine%20learning%20models%20for%20enhanced%20large-scale%20soil%20Mn%20classification%20by%20integrating%20visible-near%20infrared%20spectroscopy&rft.jtitle=Journal%20of%20soils%20and%20sediments&rft.au=Qi,%20Chongchong&rft.date=2024-11-01&rft.volume=24&rft.issue=11&rft.spage=3668&rft.epage=3683&rft.pages=3668-3683&rft.issn=1439-0108&rft.eissn=1614-7480&rft_id=info:doi/10.1007/s11368-024-03914-7&rft_dat=%3Cproquest_cross%3E3154156947%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3130975863&rft_id=info:pmid/&rfr_iscdi=true |
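
The description in this record mentions preprocessing spectra with "a combination of smoothing and derivative techniques" and scoring classifiers by area under the curve, without naming the specific methods. The following is a minimal sketch of those two ideas, not the authors' pipeline. It assumes a centered moving average as the smoother, central differences as the derivative, and a 0.5 nm band spacing (a guess based on the band limits quoted in the abstract). All function names are hypothetical.

```python
from typing import List, Sequence


def moving_average(spectrum: Sequence[float], window: int = 3) -> List[float]:
    """Smooth a spectrum with a centered moving average (odd window)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        # Shrink the window at the edges instead of padding the spectrum.
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(spectrum: Sequence[float], step_nm: float = 0.5) -> List[float]:
    """First derivative: central differences inside, one-sided at the ends."""
    n = len(spectrum)
    if n < 2:
        raise ValueError("need at least two points")
    deriv = [(spectrum[1] - spectrum[0]) / step_nm]
    for i in range(1, n - 1):
        deriv.append((spectrum[i + 1] - spectrum[i - 1]) / (2 * step_nm))
    deriv.append((spectrum[-1] - spectrum[-2]) / step_nm)
    return deriv


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up reflectance values, purely for illustration.
    reflectance = [0.30, 0.32, 0.31, 0.35, 0.40]
    features = first_derivative(moving_average(reflectance, window=3))
    print(features)
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice the preprocessed values would be passed as features to a tree-based classifier such as XGBoost, LightGBM, or random forest, and the classifier's predicted probabilities would be the `scores` given to `roc_auc`.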