Autonomous Temporal Pseudo-Labeling for Fish Detection

Bibliographic details
Published in:	Applied Sciences, 2022-06, Vol. 12 (12), p. 5910
Main authors:	Veiga, Ricardo J. M., Ochoa, Iñigo E., Belackova, Adela, Bentes, Luís, Silva, João P., Semião, Jorge, Rodrigues, João M. F.
Format:	Article
Language:	English
Subjects:	Algorithms; Annotations; Artificial intelligence; Automation; Classification; Datasets; environmental monitoring; Fauna; Fish; fish detection; Human error; Human performance; Labeling; Labels; Marine animals; Marine fauna; Marine fish; marine fishes; object detection; Oceans; pseudo-labeling; underwater video

Description
Abstract:	The first major step in training an object detection model on classes that differ from those in the available datasets is gathering meaningful and properly annotated data. This recurring task determines the duration of a project and, more importantly, the quality of the resulting models. The obstacle is amplified when the data available for the new classes are scarce or incompatible, as in the case of fish detection in the open sea. This issue was tackled using a mixed and reversed approach: a network was initialized with a noisy dataset of the same category as our classes (fish), although recorded in different scenarios and conditions (fish from Australian marine fauna), and the target footage for the application (fish from Portuguese marine fauna; Atlantic Ocean) was gathered without annotations. Using the temporal information of the detected objects, together with augmentation techniques during later training, it was possible to generate highly accurate labels from the targeted footage. Furthermore, the data selection method retained samples of each unique situation while filtering out repetitive data, which would otherwise bias the training process. The results validate the proposed method of automating the labeling process, which uses the final application directly as the source of training data. The presented method achieved a mean average precision of 93.11% on our own data and 73.61% on unseen data, an increase of 24.65% and 25.53%, respectively, over the baseline of the noisy dataset.
ISSN:	2076-3417
DOI:	10.3390/app12125910
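
Illustrative note: the abstract states that temporal information of the detected objects was used to generate labels, but this record does not give the procedure. The sketch below is only one plausible reading, not the authors' implementation: it assumes detections are plain bounding boxes per video frame, links them across consecutive frames by greedy IoU matching, and keeps a box as a pseudo-label only if its track persists for a minimum number of frames. The names `iou`, `pseudo_labels`, `iou_thr`, and `min_len` are hypothetical and chosen for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pseudo_labels(frames: List[List[Box]], iou_thr: float = 0.5,
                  min_len: int = 3) -> List[List[Box]]:
    """Keep only boxes whose track spans at least `min_len` consecutive frames.

    Assumption for illustration: a detection that persists over time is
    more likely to be a real object than a one-frame false positive.
    """
    tracks = []  # every track ever started: list of (frame_index, box)
    active = []  # tracks that were extended or started in the previous frame
    for t, boxes in enumerate(frames):
        extended = []
        unmatched = list(boxes)
        for track in active:
            last_box = track[-1][1]
            # Greedy linking: take the best-overlapping unmatched box.
            best = max(unmatched, key=lambda b: iou(last_box, b), default=None)
            if best is not None and iou(last_box, best) >= iou_thr:
                track.append((t, best))
                unmatched.remove(best)
                extended.append(track)
        # Boxes that matched no active track start new tracks.
        for box in unmatched:
            track = [(t, box)]
            tracks.append(track)
            extended.append(track)
        active = extended

    kept: List[List[Box]] = [[] for _ in frames]
    for track in tracks:
        if len(track) >= min_len:
            for t, box in track:
                kept[t].append(box)
    return kept


if __name__ == "__main__":
    # One slowly moving box across three frames plus a one-frame outlier.
    demo = [
        [(0, 0, 10, 10)],
        [(1, 0, 11, 10), (50, 50, 60, 60)],
        [(2, 0, 12, 10)],
    ]
    print(pseudo_labels(demo))
```

In the demo, the box that drifts slowly across all three frames is kept in every frame, while the isolated box in the second frame is dropped. The selection of unique situations and the filtering of repetitive samples mentioned in the abstract are not covered by this sketch.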