One-Shot Learning-Based Animal Video Segmentation

Full description

Bibliographic details
Published in: IEEE Transactions on Industrial Informatics, 2022-06, Vol. 18 (6), pp. 3799-3807
Main authors: Xue, Tengfei; Qiao, Yongliang; Kong, He; Su, Daobilige; Pan, Shirui; Rafique, Khalid; Sukkarieh, Salah
Format: Article
Language: English
Description
Abstract: Deep learning-based video segmentation methods can offer good performance after being trained on large-scale pixel-labeled datasets. However, pixel-wise manual labeling of animal images is challenging and time consuming due to irregular contours and motion blur. To achieve a desirable tradeoff between accuracy and speed, a novel one-shot learning-based approach is proposed in this article to segment animal videos with only one labeled frame. The proposed approach consists of three main modules: guidance frame selection uses "BubbleNet" to choose one frame for manual labeling, which leverages the fine-tuning effect of the single labeled frame; an Xception-based fully convolutional network produces dense predictions using depthwise separable convolutions based on that single labeled frame; and postprocessing removes outliers and sharpens object contours via two submodules, test-time augmentation and a conditional random field. Extensive experiments were conducted on the DAVIS 2016 animal dataset. The proposed video segmentation approach achieved a mean intersection-over-union score of 89.5% on the DAVIS 2016 animal dataset with less run time, outperforming the state-of-the-art methods OSVOS and OSMN. The proposed one-shot learning-based approach achieves real-time, automatic segmentation of animals with only one labeled video frame, and can potentially serve as a baseline for intelligent perception-based monitoring of animals and other domain-specific applications. The source code, datasets, and pretrained weights for this work are publicly available online at https://github.com/tengfeixue-victor/One-Shot-Animal-Video-Segmentation.
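The postprocessing step described in the abstract combines test-time augmentation with a conditional random field. As a minimal sketch of the test-time-augmentation idea alone (averaging soft predictions over a horizontal flip and its inverse before thresholding), one might write something like the following, where `predict` is a hypothetical stand-in for the paper's Xception-based segmentation network:

```python
import numpy as np

def predict(image):
    # Hypothetical stand-in for the segmentation network's forward pass.
    # A simple intensity threshold keeps this sketch runnable; the paper
    # uses an Xception-based fully convolutional network here instead.
    return (image > 0.5).astype(np.float32)

def tta_segment(image, threshold=0.5):
    """Test-time augmentation: average the prediction on the original
    frame with the un-flipped prediction on a horizontally flipped copy,
    then binarize the averaged probability map."""
    probs = predict(image)
    flipped = predict(image[:, ::-1])[:, ::-1]  # undo the flip on the mask
    mean_prob = (probs + flipped) / 2.0
    return (mean_prob >= threshold).astype(np.uint8)

# Tiny synthetic "frame": a bright square on a dark background.
frame = np.zeros((8, 8), dtype=np.float32)
frame[2:6, 2:6] = 0.9
mask = tta_segment(frame)
```

Averaging over augmented views tends to suppress isolated false positives, which is why it pairs naturally with a CRF pass for contour sharpening.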
ISSN:1551-3203
1941-0050
DOI:10.1109/TII.2021.3117020