EyeOnWater training dataset for assessing the inclusion of water images
Format: Dataset
Language: English
Abstract
Training dataset
The EyeOnWater app is designed to assess the ocean's water quality using images captured by citizen scientists. To provide an extra check on whether an image meets the criteria for inclusion in the app, a YOLOv8 image-classification model is employed. Every uploaded picture is assessed with this model, and if the model deems a water image unsuitable, it is excluded from the app's online database. Training this model requires a dataset containing a large pool of varied images. The training dataset includes 12,357 'good' and 10,019 'bad' water quality images that were submitted to the EyeOnWater app.
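The record ships data, not code or model weights, but a minimal sketch of this filtering step, assuming the Ultralytics Python API and a hypothetical trained checkpoint (eyeonwater_cls.pt) and image path, could look like this:

```python
from ultralytics import YOLO

# Hypothetical trained checkpoint; the record describes the training data,
# not a released model file.
model = YOLO("eyeonwater_cls.pt")

result = model("upload.jpg")[0]          # classify one uploaded image
label = result.names[result.probs.top1]  # 'good' or 'bad'
confidence = float(result.probs.top1conf)

# Images the model deems unsuitable are excluded from the online database.
if label == "bad":
    print(f"excluded (confidence {confidence:.2f})")
else:
    print(f"included (confidence {confidence:.2f})")
```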
Technical details
Data preprocessing
To create a larger training dataset, the set of original images (1,700 in total) is augmented by rotating, displacing, flipping, and shearing them, using the following settings (a code sketch follows the list):
Maximum rotation of 45 degrees in both directions
Maximum displacement of 20% of the width or height
Horizontal and vertical flips
Maximum shear of 20% of the width
Pixel range of 10 units
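A minimal sketch of such an augmentation pipeline, assuming Keras' ImageDataGenerator; the record does not name the library actually used, and the mapping of "pixel range" to channel_shift_range and of the shear setting to shear_range are assumptions:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=45,       # max rotation of 45 degrees in both directions
    width_shift_range=0.2,   # max displacement of 20% of the width
    height_shift_range=0.2,  # max displacement of 20% of the height
    horizontal_flip=True,
    vertical_flip=True,
    shear_range=0.2,         # assumed reading of "max shear of 20% of the width"
    channel_shift_range=10,  # assumed reading of "pixel range of 10 units"
    fill_mode="nearest",
)

# Stream augmented 256x256 images from hypothetical 'good'/'bad' subfolders.
flow = datagen.flow_from_directory(
    "original_images",
    target_size=(256, 256),
    class_mode="binary",
)
```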
Data splitting
Of the dataset, 80% is used for training, 10% for validation, and 10% for prediction (testing).
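As an illustration, the sketch below splits per-class image folders 80/10/10 into the train/val/test directory layout that YOLOv8 classification consumes; all directory names and the random seed are assumptions:

```python
import random
import shutil
from pathlib import Path

random.seed(42)  # reproducible split; the seed is an arbitrary choice

SRC = Path("augmented_images")  # hypothetical folder with 'good'/'bad' subfolders
DST = Path("dataset")           # layout consumed by YOLOv8 classification

for cls in ("good", "bad"):
    files = sorted((SRC / cls).glob("*.jpg"))
    random.shuffle(files)
    n = len(files)
    splits = {
        "train": files[: int(0.8 * n)],
        "val": files[int(0.8 * n) : int(0.9 * n)],
        "test": files[int(0.9 * n) :],
    }
    for split, items in splits.items():
        out = DST / split / cls
        out.mkdir(parents=True, exist_ok=True)
        for f in items:
            shutil.copy(f, out / f.name)
```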
Classes, labels and annotations
The training dataset contains two classes with the labels 'good' and 'bad'. The 'good' images are water images suited to determining the water quality using the Forel-Ule scale. The 'bad' images may, for example, show too much surface reflection, a visible bottom, obstructing objects, or no water at all.
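With that folder layout in place, training the classifier could look like the sketch below; the model size, epoch count, and other hyperparameters are illustrative assumptions, as the record does not state the training configuration:

```python
from ultralytics import YOLO

# Start from a pretrained YOLOv8 classification checkpoint (size assumed).
model = YOLO("yolov8n-cls.pt")

# 'dataset' contains train/val/test folders, each with 'good' and 'bad'
# subfolders (see the splitting sketch above).
model.train(data="dataset", epochs=50, imgsz=256)

metrics = model.val()  # top-1 accuracy on the validation split
```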
Parameters
From the images, the water quality can be obtained by comparing the water color to the 21 colors of the Forel-Ule scale.
Parameter: http://vocab.nerc.ac.uk/collection/P01/current/CLFORULE/
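As a rough sketch of such a comparison, the snippet below matches a mean water color against a table of 21 reference colors by nearest Euclidean distance in RGB. The reference values here are placeholders, and the RGB-distance approach is a simplification; in practice, published FU color tables are used, and the FU class is commonly derived via the hue angle in CIE chromaticity space:

```python
import numpy as np

# Placeholder table of the 21 Forel-Ule reference colors (RGB). These are
# NOT the real values; replace them with a published FU color table.
FU_RGB = np.zeros((21, 3))

def forel_ule_class(water_pixels: np.ndarray) -> int:
    """Return the FU class (1-21) whose reference color is closest, by
    Euclidean distance in RGB, to the mean color of the given water
    pixels (an array of shape (N, 3))."""
    mean_color = water_pixels.reshape(-1, 3).mean(axis=0)
    distances = np.linalg.norm(FU_RGB - mean_color, axis=1)
    return int(distances.argmin()) + 1
```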
Data sources
The images are taken by citizen scientists, often with a smartphone.
Data quality
As the images are taken with smartphones, the image quality can be low. In addition, the images are taken outdoors, in unconfined spaces, so poor lighting, reflections, and other problems can occur. The images therefore need to be checked before they can be included in the app.
Image resolution
Larger images are resized to 256 px by 256 px; smaller images are excluded from the training dataset.
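A sketch of this resize-or-exclude rule, assuming Pillow; the resampling filter, function name, and directory handling are illustrative choices:

```python
from pathlib import Path
from PIL import Image

TARGET = 256  # images smaller than this in either dimension are excluded

def prepare_image(path: Path, out_dir: Path) -> bool:
    """Resize a sufficiently large image to 256x256 and save it; return
    False when the image is too small and therefore excluded."""
    with Image.open(path) as img:
        if img.width < TARGET or img.height < TARGET:
            return False
        img.resize((TARGET, TARGET), Image.LANCZOS).save(out_dir / path.name)
        return True
```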
Spatial coverage
Images are taken on a global scale.
Contact information
For more information on the training dataset and/or the app, you can contact tjerk@maris.nl.
DOI: 10.5281/zenodo.10777440