Mapping soil arsenic pollution at a brownfield site using satellite hyperspectral imagery and machine learning
Published in: | The Science of the total environment 2023-01, Vol.857, p.159387-159387, Article 159387 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Heavy metal contamination is ubiquitous in brownfields. Traditional site investigation employs geostatistical interpolation methods (GIMs) to predict the distribution of soil pollutants after soil sampling and chemical analysis. However, the heterogeneity of soil pollution in brownfields violates the assumptions of GIMs and undermines the accuracy of soil investigation. In the present study, a satellite hyperspectral image processing and machine learning method was developed to map arsenic pollution at a brownfield site. To eliminate the noise caused by atmospheric factors and increase the efficiency of spectral data, 1.3 million spectral indexes (SIs) were constructed, and 1171 of them were selected for their high correlations with soil arsenic. Five machine learning models, i.e., Random Forest (RF), ExtraTrees, Adaptive Boosting, Extreme Gradient Trees, and Gradient Descent Boosting Trees (GDB), were built to predict soil arsenic. The RF model performed best (r = 0.78), reducing prediction error by 30 % compared with traditional GIMs. RF also maintained relatively high accuracy (r = 0.56) when the sampling grid spacing was increased to 100 m, which was higher than that of GIMs under a 50 m sampling grid (r = 0.42), showing that the proposed method can provide more accurate results with fewer sampling points and therefore at lower investigation cost. The results indicated that the second derivative was the most efficient preprocessing method for removing spectral noise and that normalized difference (ND) was the most reliable spectral index construction strategy. Based on uncertainty analysis, the heterogeneity of soil arsenic distribution was considered the most influential factor causing prediction errors.
This study demonstrates that machine learning based on satellite visible and near-infrared reflectance spectroscopy (VNIR) is a promising approach to map soil arsenic contamination at brownfield sites with high accuracy and low cost.
• 1.3 million spectral indexes were constructed.
• RF was the best model (r = 0.78).
• RF reduces prediction error by 30 % compared to Kriging.
• Normalized difference was the most effective spectral index construction strategy.
• Satellite hyperspectral imagery can be used to monitor soil pollution in industrial sites. |
---|---|
ISSN: | 0048-9697 1879-1026 |
DOI: | 10.1016/j.scitotenv.2022.159387 |
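
The abstract names three steps: second-derivative preprocessing, construction of normalized difference (ND) spectral indexes from band pairs, and selection of indexes by their correlation with soil arsenic. The following is a minimal Python sketch of those ideas, not the authors' implementation. The function names, the finite-difference derivative, the use of Pearson correlation, the threshold value, and the toy reflectance and arsenic values are all assumptions made for illustration; the record does not state how the authors chained preprocessing and index construction, so the helpers are kept separate.

```python
from itertools import combinations
from math import sqrt


def second_derivative(spectrum):
    """Central finite-difference second derivative of a spectrum.

    Assumes evenly spaced bands; the two edge bands are dropped.
    """
    return [spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]


def normalized_difference(a, b):
    """ND index of two band values: (a - b) / (a + b); 0.0 if the sum is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def pearson_r(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y) if var_x > 0 and var_y > 0 else 0.0


def select_nd_indices(spectra, target, threshold):
    """Build every pairwise ND index and keep those with |r| >= threshold.

    Returns (band_i, band_j, r) tuples sorted by decreasing |r|.
    Swapping the two bands only flips the sign of ND, so unordered
    pairs are enough when ranking by |r|.
    """
    n_bands = len(spectra[0])
    selected = []
    for i, j in combinations(range(n_bands), 2):
        values = [normalized_difference(s[i], s[j]) for s in spectra]
        r = pearson_r(values, target)
        if abs(r) >= threshold:
            selected.append((i, j, r))
    return sorted(selected, key=lambda item: -abs(item[2]))


# Toy data (hypothetical): three soil samples, three bands each,
# with made-up arsenic concentrations.
toy_spectra = [
    [0.3, 0.1, 0.2],
    [0.1, 0.1, 0.2],
    [0.1, 0.3, 0.2],
]
toy_arsenic = [30.0, 20.0, 10.0]

selected = select_nd_indices(toy_spectra, toy_arsenic, threshold=0.9)
for band_i, band_j, r in selected:
    print(f"ND(band {band_i}, band {band_j}): r = {r:.2f}")
```

With a real hyperspectral image the number of band pairs grows quadratically with the band count, which is one plausible way a candidate pool on the order of the 1.3 million indexes mentioned in the abstract could arise once several preprocessing and construction strategies are combined. The selected indexes would then serve as input features for a regression model such as a random forest.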