Mapping soil arsenic pollution at a brownfield site using satellite hyperspectral imagery and machine learning

Heavy metal contamination is ubiquitous in brownfields. Traditional site investigation employs geostatistical interpolation methods (GIMs) to predict the distribution of soil pollutants after soil sampling and chemical analysis. However, the heterogeneity of soil pollution in brownfields makes the a...

Bibliographic details
Published in: The Science of the total environment 2023-01, Vol.857, p.159387-159387, Article 159387
Main authors: Jia, Xiyue, Hou, Deyi
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 159387
container_issue
container_start_page 159387
container_title The Science of the total environment
container_volume 857
creator Jia, Xiyue
Hou, Deyi
description Heavy metal contamination is ubiquitous in brownfields. Traditional site investigation employs geostatistical interpolation methods (GIMs) to predict the distribution of soil pollutants after soil sampling and chemical analysis. However, the heterogeneity of soil pollution in brownfields invalidates the assumptions of GIMs and undermines the accuracy of soil investigation. In the present study, a satellite hyperspectral image processing and machine learning method was developed to map arsenic pollution at a brownfield site. To eliminate the noise caused by atmospheric factors and increase the efficiency of spectral data, 1.3 million spectral indexes (SIs) were constructed, and 1171 of them were selected for their high correlations with soil arsenic. Five machine learning methods, i.e., Random forest (RF), ExtraTrees, Adaptive Boosting, Extreme Gradient Trees, and Gradient Descent Boosting Trees (GDB), were built to predict soil arsenic. The RF method gave the best performance (r = 0.78), reducing prediction errors by 30 % compared with traditional GIMs. RF also maintained relatively high accuracy (r = 0.56) when the sampling grid spacing increased to 100 m, which was higher than that of GIMs under a 50 m sampling grid (r = 0.42), showing that the proposed method can provide more accurate results with fewer sampling points and therefore at lower investigation cost. The results indicated that the second derivative was the most efficient preprocessing method for removing spectral noise and that normalized difference (ND) was the most reliable spectral index construction strategy. Based on uncertainty analysis, the heterogeneity of soil arsenic distribution was considered the most influential factor causing prediction errors. This study demonstrates that machine learning based on satellite visible and near-infrared reflectance spectroscopy (VNIR) is a promising approach to map soil arsenic contamination at brownfield sites with high accuracy and low cost.
• 1.3 million spectral indexes were constructed.
• RF was the best model (r = 0.78).
• RF reduces prediction error by 30 % compared to Kriging.
• Normalized difference was the most effective spectral index construction strategy.
• Satellite hyperspectral imagery can be used to monitor soil pollution in industrial sites.
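The abstract names normalized difference (ND) as the index construction strategy and says indexes were kept for their high correlation with soil arsenic, but it gives no formula or code. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the common ND form (Ri - Rj) / (Ri + Rj) for two band reflectances and Pearson's r as the correlation measure; the function names, the threshold, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def normalized_difference(r_i, r_j):
    """Assumed ND form for two band reflectances: (Ri - Rj) / (Ri + Rj)."""
    denom = r_i + r_j
    return (r_i - r_j) / denom if denom != 0 else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_nd_indexes(spectra, arsenic, threshold):
    """Return {(i, j): r} for band pairs whose ND index has |r| >= threshold.

    spectra: one list of band reflectances per soil sample.
    arsenic: measured arsenic value per sample, in the same order.
    """
    n_bands = len(spectra[0])
    selected = {}
    for i, j in combinations(range(n_bands), 2):
        index_values = [normalized_difference(s[i], s[j]) for s in spectra]
        r = pearson_r(index_values, arsenic)
        if abs(r) >= threshold:
            selected[(i, j)] = r
    return selected


# Made-up toy data: 4 samples x 3 bands, with matching arsenic values.
spectra = [
    [0.10, 0.30, 0.20],
    [0.12, 0.28, 0.20],
    [0.14, 0.26, 0.20],
    [0.16, 0.24, 0.20],
]
arsenic = [10.0, 20.0, 30.0, 40.0]
selected = select_nd_indexes(spectra, arsenic, threshold=0.9)
```

Enumerating every band pair is what makes the number of candidate indexes grow quadratically with the band count, which is why a correlation filter of this kind is applied before any model is trained.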
doi_str_mv 10.1016/j.scitotenv.2022.159387
format Article
publisher Elsevier B.V
rights 2022 Elsevier B.V.
creationdate 2023-01-20
toplevel peer_reviewed
online_resources
fulltext fulltext
identifier ISSN: 0048-9697
ispartof The Science of the total environment, 2023-01, Vol.857, p.159387-159387, Article 159387
issn 0048-9697
1879-1026
language eng
recordid cdi_proquest_miscellaneous_2725201045
source Elsevier ScienceDirect Journals
subjects Arsenic contamination
Machine learning
Remote sensing
Satellite hyperspectral imagery
Soil pollution
title Mapping soil arsenic pollution at a brownfield site using satellite hyperspectral imagery and machine learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T02%3A46%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mapping%20soil%20arsenic%20pollution%20at%20a%20brownfield%20site%20using%20satellite%20hyperspectral%20imagery%20and%20machine%20learning&rft.jtitle=The%20Science%20of%20the%20total%20environment&rft.au=Jia,%20Xiyue&rft.date=2023-01-20&rft.volume=857&rft.spage=159387&rft.epage=159387&rft.pages=159387-159387&rft.artnum=159387&rft.issn=0048-9697&rft.eissn=1879-1026&rft_id=info:doi/10.1016/j.scitotenv.2022.159387&rft_dat=%3Cproquest_cross%3E2725201045%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2725201045&rft_id=info:pmid/&rft_els_id=S0048969722064865&rfr_iscdi=true