EFFECT OF DATA QUALITY ON WATER BODY SEGMENTATION WITH DEEPLABV3+ ALGORITHM

Bibliographic details
Published in:	International archives of the photogrammetry, remote sensing and spatial information sciences, 2023-09, Vol.XLVIII-M-3-2023, p.81-85
Main authors:	Edpuganti, A., Akshaya, P., Gouthami, J., Sajith Variyar, V. V., Sowmya, V., Sivanpillai, R.
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Color imagery; Image quality; Image segmentation; Labels; Machine learning; Model accuracy; Open data; Open source software; Quality assessment; Training; Water bodies; Water quality
Online access:	Full text

Description
Abstract:	Training Deep Learning (DL) algorithms for segmenting features requires hundreds to thousands of input images and corresponding labels. Generating thousands of input images and labels requires considerable resources and time. Hence, it is common practice to use open-source imagery and labels available online. Most of these open-source data have little or no metadata describing their quality or suitability, which makes them problematic for training or evaluating DL models. This study evaluated the effect of data quality on training DeepLabV3+, using Sentinel-2A/B RGB images and labels obtained from Kaggle. We generated subsets of 256 × 256 pixels, and 10% of these images (802) were set aside for testing. First, we trained and validated the DeepLabV3+ model with the remaining images. Second, we removed images with incorrect labels and trained another DeepLabV3+ network. Finally, we trained a third DeepLabV3+ network after removing images with turbid water or floating vegetation. All three trained models were evaluated on the test images, and accuracy metrics were calculated for each. As the quality of the input images improved, the accuracy of the predicted masks increased from 92.8% for the first model to 94.3% for the second. The third model's accuracy was 96.4%, demonstrating the network's ability to better learn and predict water bodies when the input data had fewer class variations. Based on the results, we recommend assessing the quality of open-source data for incorrect labels and variations in the target class prior to training DeepLabV3+ or any other DL network.
ISSN:	2194-9034 1682-1750
DOI:	10.5194/isprs-archives-XLVIII-M-3-2023-81-2023
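
The data preparation outlined in the abstract (256 × 256 subsets, a 10% test hold-out, three progressively filtered training sets, and an accuracy score for the predicted masks) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the record does not say whether tiles overlap, how the test images were chosen, whether the two removal steps were cumulative, or which accuracy metrics were used. The function names and the `label_ok` and `clear_water` flags are hypothetical, and plain pixel accuracy stands in for the unspecified metrics.

```python
import random

TILE = 256  # tile edge in pixels, as stated in the abstract


def tile_origins(height, width, tile=TILE):
    """Top-left (row, col) corners of non-overlapping tiles that fit fully.

    Non-overlapping tiling is an assumption; edge remainders are dropped.
    """
    return [(r, c)
            for r in range(0, height - tile + 1, tile)
            for c in range(0, width - tile + 1, tile)]


def holdout_split(items, test_fraction=0.10, seed=0):
    """Shuffle a copy of `items` and set aside `test_fraction` for testing.

    A seeded random split is an assumption; the abstract only gives the 10%.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def training_subsets(records):
    """Build the three training sets, assuming the removals are cumulative.

    Each record is a dict with two hypothetical manual-review flags:
    "label_ok" (the mask matches the image) and "clear_water"
    (no turbid water or floating vegetation).
    """
    all_images = list(records)
    correct_labels = [r for r in all_images if r["label_ok"]]
    clear_only = [r for r in correct_labels if r["clear_water"]]
    return all_images, correct_labels, clear_only


def pixel_accuracy(pred, truth):
    """Fraction of pixels where the predicted mask equals the label mask."""
    pairs = [(p, t)
             for pred_row, truth_row in zip(pred, truth)
             for p, t in zip(pred_row, truth_row)]
    return sum(p == t for p, t in pairs) / len(pairs)
```

For example, `holdout_split(range(8020))` sets aside 802 items, matching the test-set size in the abstract; the total of 8020 is a hypothetical figure chosen only to make the arithmetic visible. The same held-out test set would then be scored against each of the three models trained on the outputs of `training_subsets`.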