Cloud detection for Landsat imagery by combining the random forest and superpixels extracted via energy-driven sampling segmentation approaches

Bibliographic details
Published in: Remote sensing of environment 2020-10, Vol. 248, p. 112005, Article 112005
Main authors: Wei, Jing; Huang, Wei; Li, Zhanqing; Sun, Lin; Zhu, Xiaolin; Yuan, Qiangqiang; Liu, Lei; Cribb, Maureen
Format: Article
Language: English
Description
Summary: A primary challenge in cloud detection is posed by highly mixed scenes filled with broken and thin clouds over inhomogeneous land. To tackle this challenge, we developed a new algorithm called the Random-Forest-based cloud mask (RFmask), which can improve the accuracy of cloud identification from Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), and Operational Land Imager and Thermal Infrared Sensor (OLI/TIRS) images. For the development and validation of the algorithm, we first use stratified sampling to pre-select cloudy and clear-sky pixels, forming a prior-pixel database according to land use cover around the world. Next, we select typical spectral channels and calculate spectral indices based on the spectral reflection characteristics of different land cover types, using the top-of-atmosphere reflectance and brightness temperature. These are then used as inputs to train the RF model and establish a preliminary cloud detection model. Finally, the Super-pixels Extracted via Energy-Driven Sampling (SEEDS) segmentation approach is applied to re-process the preliminary classification results and obtain the final cloud detection results. The RFmask detection results are evaluated against the globally distributed United States Geological Survey (USGS) cloud-cover assessment validation products. The average overall accuracy of RFmask cloud detection reaches 93.8% (Kappa coefficient = 0.77), with an omission error of 12.0% and a commission error of 7.4%. The RFmask algorithm is able to identify broken and thin clouds over both dark and bright surfaces. The new model generally outperforms the other methods compared here, especially over these challenging scenes. The RFmask algorithm is not only accurate but also computationally efficient. It is potentially useful for a variety of applications of Landsat data, especially for monitoring land cover and land-use changes.
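The summary does not spell out how the SEEDS superpixels are used to re-process the preliminary Random Forest result. One plausible reading, shown here purely as an illustrative sketch, is a per-superpixel vote: each segment is relabelled as cloud when most of its pixels were flagged as cloud. The function name `refine_with_superpixels`, the `threshold` parameter, and the toy arrays are assumptions for illustration and are not taken from the article; the superpixel ids are assumed to come from any SEEDS implementation.

```python
from collections import defaultdict


def refine_with_superpixels(cloud_mask, segments, threshold=0.5):
    """Relabel each superpixel by the fraction of its pixels flagged as cloud.

    cloud_mask: 2-D list of 0/1 per-pixel labels (e.g. a preliminary RF output).
    segments:   2-D list of integer superpixel ids with the same shape.
    threshold:  hypothetical cut-off; a segment becomes cloud when its
                cloudy fraction exceeds this value (an assumption, not
                a value from the article).
    """
    # segment id -> [number of cloudy pixels, total number of pixels]
    counts = defaultdict(lambda: [0, 0])
    for mask_row, seg_row in zip(cloud_mask, segments):
        for label, seg_id in zip(mask_row, seg_row):
            counts[seg_id][0] += label
            counts[seg_id][1] += 1

    # Decide one label per segment from its cloudy fraction.
    segment_label = {
        seg_id: int(cloudy / total > threshold)
        for seg_id, (cloudy, total) in counts.items()
    }
    # Paint the segment-level decision back onto every pixel.
    return [[segment_label[seg_id] for seg_id in seg_row] for seg_row in segments]


# Toy 3x4 scene: two superpixels (left half = 0, right half = 1).
preliminary = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
segments = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
refined = refine_with_superpixels(preliminary, segments)
```

In this toy scene the isolated clear pixel inside the mostly cloudy left segment and the isolated cloudy pixel inside the mostly clear right segment are both overruled by their segments, which is the kind of salt-and-pepper clean-up a superpixel step is typically used for.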
• A cloud detection algorithm combining the Random Forest and SEEDS segmentation is proposed for Landsat imagery.
• The overall accuracy of the RFmask algorithm reaches 93.8% (Kappa coefficient = 0.77).
• The RFmask algorithm works well in detecting broken and thin clouds over both dark and bright surfaces.
• The RFmask algorithm is accurate, computationally efficient, and useful for various remote sensing applications.
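The accuracy figures quoted above (overall accuracy, Kappa coefficient, omission error, commission error) can all be derived from a binary cloud/clear confusion matrix. The sketch below uses the conventional remote-sensing definitions, with cloud as the positive class; the article may define the error rates slightly differently, and the function name `cloud_mask_metrics` and the example counts are made up for illustration.

```python
def cloud_mask_metrics(tp, fp, fn, tn):
    """Compute common accuracy metrics for a binary cloud mask.

    tp: cloud pixels correctly detected     fp: clear pixels labelled cloud
    fn: cloud pixels missed                 tn: clear pixels correctly detected
    """
    n = tp + fp + fn + tn
    overall_accuracy = (tp + tn) / n

    # Chance agreement expected from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (overall_accuracy - expected) / (1 - expected)

    # Omission: share of reference cloud pixels that were missed.
    omission_error = fn / (tp + fn)
    # Commission: share of predicted cloud pixels that are actually clear.
    commission_error = fp / (tp + fp)

    return {
        "overall_accuracy": overall_accuracy,
        "kappa": kappa,
        "omission_error": omission_error,
        "commission_error": commission_error,
    }


# Hypothetical counts, not data from the article.
metrics = cloud_mask_metrics(tp=40, fp=5, fn=10, tn=45)
```

Kappa discounts the agreement that would occur by chance, which is why it can sit well below the overall accuracy when one class dominates the scene.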
ISSN: 0034-4257, 1879-0704
DOI: 10.1016/j.rse.2020.112005