Segmentation masks of ZooScan images focusing on images with several objects separated by a human operator
Main authors: | , , , , , , , , , , |
---|---|
Format: | Dataset |
Language: | eng |
Subjects: | |
Summary: | The first step in many image analysis tasks is to segment the objects of interest from the full image, and ZooScan images are no exception. The ZooScan is a waterproof flatbed scanner dedicated to digitizing zooplankton samples containing organisms of 300 µm and larger. The jar of plankton is poured onto the scanning window, the objects are physically separated as well as possible, and the image is acquired. After background subtraction, the full grayscale image is segmented using a simple grey-intensity threshold, and each segmented object is measured (area, transparency, etc.). These segments, usually called "vignettes", are then classified taxonomically, often with the help of machine learning based on the measurements. The measurements also make it possible to estimate the size and volume of each object.
Despite the operators' care, some of the 1000 to 2000 vignettes typically detected on a single scan frequently contain more than one object, which biases the measurements and the subsequent quantification of plankton concentration and biovolume. To avoid this, operators go back to the initial full frame and digitally separate touching objects by drawing white lines between them. This dataset contains ~14k vignettes with objects separated by white lines and ~5k vignettes of single, correctly detected objects, together with the binary masks of all of them. It can be used to train deep learning segmentation models, such as semantic, instance, or panoptic segmenters. All images were acquired with a ZooScan from samples collected with a WP2 net at various locations around the world during the Tara Oceans cruise.
## Data preprocessing
ZooProcess first subtracts the background from the full ZooScan image. Contiguous regions are then detected with a connected-component algorithm that uses 8-connectivity, i.e. pixels that touch only along a diagonal also count as neighbours. Finally, the pre-processed (background-subtracted) scan and the mask resulting from manual separation with white lines are cropped to the detected regions of interest.
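The region detection and cropping described above can be illustrated with a small, dependency-free sketch. It is not the ZooProcess implementation: the function names `bounding_boxes` and `crop` are hypothetical, the mask is assumed to be a list of rows in which non-zero pixels belong to objects, and a library routine such as `scipy.ndimage.label` with a 3×3 structuring element of ones would give the same 8-connected labelling in practice.

```python
from collections import deque


def bounding_boxes(mask):
    """Return one (top, left, bottom, right) box, inclusive, per 8-connected region.

    Assumption: `mask` is a list of equally long rows; non-zero means object.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    # 8-connectivity: the four edge neighbours plus the four diagonal ones.
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited object pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Cut the inclusive bounding box out of a list-of-rows image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy mask: two pixels touching only diagonally, plus one isolated pixel.
toy_mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
toy_boxes = bounding_boxes(toy_mask)
```

In the toy mask, the two diagonally touching pixels end up in the same region, which is the behaviour the text describes; with 4-connectivity they would be two separate regions.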
## Data splitting
The dataset is split into approximately 70% training, 15% validation, and 15% test sets.
## Classes, labels and annotations
All splits are organised the same way: an images directory containing grayscale PNG images of objects, and a masks directory containing the binary PNG masks of the objects to be detected.
When a binary mask contains only one region, the object is a single plankter.
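The layout described above suggests a simple way to pair each image with its mask. The sketch below rests on assumptions that the record does not confirm: that the directories are literally named `images` and `masks`, and that a mask shares the file name of its image. The helper name `pair_images_and_masks` is hypothetical.

```python
from pathlib import Path


def pair_images_and_masks(split_dir):
    """Return (image_path, mask_path) tuples for one split directory.

    Assumptions: sub-directories are named "images" and "masks", and a mask
    has the same file name as the image it belongs to.
    """
    split_dir = Path(split_dir)
    images = {p.name: p for p in (split_dir / "images").glob("*.png")}
    masks = {p.name: p for p in (split_dir / "masks").glob("*.png")}
    # Keep only names present in both directories, in a stable order.
    shared = sorted(images.keys() & masks.keys())
    return [(images[name], masks[name]) for name in shared]
```

Each returned pair can then be loaded with any PNG reader; counting the connected regions in the mask tells whether the vignette holds a single plankter or several separated objects.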
## Parameters
The dataset does not contain or allow the computation of any |
---|---|
DOI: | 10.17882/99663 |