Occlusion-Resistant Instance Segmentation of Piglets in Farrowing Pens Using Center Clustering Network
Saved in:
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Computer vision enables the development of new approaches to monitor the
behavior, health, and welfare of animals. Instance segmentation is a
high-precision method in computer vision for detecting individual animals of
interest. This method can be used for in-depth analysis of animals, such as
examining their subtle interactive behaviors, from videos and images. However,
existing deep-learning-based instance segmentation methods have been mostly
developed based on public datasets, which largely omit heavy occlusion
problems; therefore, these methods have limitations in real-world applications
involving object occlusions, such as farrowing pen systems used on pig farms in
which the farrowing crates often occlude the sow and piglets. In this paper, we
adapt a Center Clustering Network, originally designed for counting, to perform
instance segmentation, dubbed CClusnet-Inseg. Specifically, CClusnet-Inseg uses
each pixel to predict object centers and traces these centers to form masks from
the clustering results; it consists of a network that predicts a segmentation map
and a center offset vector map, the Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) algorithm, and the Centers-to-Mask (C2M) and
Remain-Centers-to-Mask (RC2M) algorithms (a minimal sketch of this pipeline
appears after this record). In all, 4,600 images were extracted from six videos
collected from three closed and three half-open farrowing crates to train and
validate our method. CClusnet-Inseg achieves a mean average precision (mAP) of
84.1 and outperforms all other methods compared in this study. We conduct
comprehensive ablation studies to demonstrate the advantages and effectiveness
of core modules of our method. In addition, we apply CClusnet-Inseg to
multi-object tracking for animal monitoring; the predicted object center, which
is produced jointly with the mask, could serve as an occlusion-resistant
representation of an object's location. |
---|---|
DOI: | 10.48550/arxiv.2206.01942 |
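
The pipeline described in the abstract (per-pixel center prediction, DBSCAN clustering of the predicted centers, and mask formation via C2M/RC2M) can be illustrated with a short sketch. The code below is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name `masks_from_center_votes`, the map shapes, the `eps`/`min_samples` values, and the nearest-center fallback standing in for RC2M are all guesses.

```python
# Sketch of the center-clustering idea from the abstract: every foreground
# pixel votes for its object's center, DBSCAN groups the votes, and pixels
# are traced back to clusters to form instance masks. Shapes, thresholds,
# and the RC2M stand-in are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN


def masks_from_center_votes(seg, offsets, eps=5.0, min_samples=20):
    """seg:     (H, W) binary map, 1 = piglet pixel (assumed input format).
    offsets: (2, H, W) per-pixel offset (dy, dx) to the object's center.
    Returns an (H, W) instance-label map (0 = background) and the
    estimated per-instance centers."""
    ys, xs = np.nonzero(seg)                       # foreground pixel coordinates
    votes = np.stack([ys + offsets[0, ys, xs],     # each pixel's predicted center
                      xs + offsets[1, ys, xs]], axis=1)

    # Cluster the center votes; each cluster is treated as one piglet.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(votes).labels_
    n_clusters = labels.max() + 1                  # DBSCAN labels noise as -1

    # Centers-to-Mask idea (as described): a pixel inherits the label of the
    # cluster that its predicted center falls into.
    instance_map = np.zeros(seg.shape, dtype=np.int32)
    instance_map[ys, xs] = labels + 1              # noise votes stay 0 for now

    centers = (np.stack([votes[labels == k].mean(axis=0)
                         for k in range(n_clusters)])
               if n_clusters else np.empty((0, 2)))

    # Rough stand-in for RC2M: pixels whose votes DBSCAN marked as noise are
    # attached to the nearest cluster center. The paper's RC2M rule may differ.
    noise = labels == -1
    if noise.any() and n_clusters:
        d = np.linalg.norm(votes[noise, None, :] - centers[None], axis=2)
        instance_map[ys[noise], xs[noise]] = d.argmin(axis=1) + 1

    return instance_map, centers
```

For the multi-object tracking application mentioned in the abstract, the per-frame center estimates returned by such a routine could be associated across frames (for example, by nearest-neighbor matching); the record does not specify the authors' association scheme, so that step is not sketched here.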