Video Region Annotation with Sparse Bounding Boxes

Bibliographic Details
Published in: International Journal of Computer Vision, 2023-03, Vol. 131 (3), pp. 717-731
Authors: Xu, Yuzheng; Wu, Yang; binti Zuraimi, Nur Sabrina; Nobuhara, Shohei; Nishino, Ko
Format: Article
Language: English
Online access: Full text
Description
Abstract: Video analysis has been moving towards more detailed interpretation (e.g., segmentation) with encouraging progress. These tasks, however, increasingly rely on training data that is densely annotated in both space and time. Since such annotation is labor-intensive, few densely annotated video datasets with detailed region boundaries exist. This work aims to resolve this dilemma by learning to automatically generate region boundaries for all frames of a video from sparsely annotated bounding boxes of target regions. We achieve this with a Volumetric Graph Convolutional Network (VGCN), which learns to iteratively find keypoints on the region boundaries using the spatio-temporal volume of surrounding appearance and motion. We show that the global optimization of VGCN leads to more accurate annotation that generalizes better. Experimental results using three recent datasets (two real and one synthetic), including ablation studies, demonstrate the effectiveness and superiority of our method.
ISSN: 0920-5691, 1573-1405
DOI: 10.1007/s11263-022-01719-0
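
The abstract describes iteratively refining boundary keypoints with a graph convolutional network over a spatio-temporal feature volume. The sketch below is only a minimal illustration of that general idea, not the authors' VGCN: the nearest-neighbour feature lookup, the single mean-aggregation graph-convolution layer, the cycle adjacency over keypoints, the random weights, and all function names (`sample_features`, `graph_conv`, `refine_keypoints`) are assumptions made for this toy example.

```python
# Illustrative sketch (not the paper's code): iterative boundary-keypoint
# refinement with a simple graph convolution over a spatio-temporal volume.
import numpy as np

def sample_features(volume, points, t):
    """Nearest-neighbour feature lookup at (x, y) keypoints in frame t.

    volume: (T, H, W, C) spatio-temporal feature volume (appearance/motion).
    points: (N, 2) keypoint coordinates in pixels.
    """
    T, H, W, C = volume.shape
    xs = np.clip(np.round(points[:, 0]).astype(int), 0, W - 1)
    ys = np.clip(np.round(points[:, 1]).astype(int), 0, H - 1)
    return volume[t, ys, xs, :]                      # (N, C)

def graph_conv(feats, adj, weight):
    """One graph-convolution layer: mean-aggregate neighbours, project, ReLU."""
    deg = adj.sum(axis=1, keepdims=True)             # node degrees
    agg = (adj @ feats) / np.maximum(deg, 1)         # mean over neighbours
    return np.maximum(agg @ weight, 0.0)

def refine_keypoints(points, volume, t, w_gcn, w_out, steps=3):
    """Iteratively move boundary keypoints by GCN-predicted 2-D offsets."""
    n = len(points)
    # Cycle adjacency: each keypoint is linked to its two boundary neighbours.
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i - 1) % n] = adj[i, (i + 1) % n] = 1.0
    for _ in range(steps):
        feats = sample_features(volume, points, t)   # (N, C)
        hidden = graph_conv(feats, adj, w_gcn)       # (N, D)
        points = points + hidden @ w_out             # per-node (dx, dy) offset
    return points

# Toy usage with random weights (in practice these would be learned).
rng = np.random.default_rng(0)
vol = rng.standard_normal((5, 64, 64, 8))            # 5 frames, 8-channel features
init = np.array([[20.0, 20.0], [40.0, 20.0], [40.0, 40.0], [20.0, 40.0]])
w1 = rng.standard_normal((8, 16)) * 0.1
w2 = rng.standard_normal((16, 2)) * 0.1
print(refine_keypoints(init, vol, t=2, w_gcn=w1, w_out=w2))
```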