Hybrid Tracker with Pixel and Instance for Video Panoptic Segmentation
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Video Panoptic Segmentation (VPS) aims to generate coherent panoptic
segmentation and track the identities of all pixels across video frames.
Existing methods predominantly rely on trained instance embeddings to keep
panoptic segmentation consistent. However, they inevitably struggle with
small objects, instances that look similar but have different identities,
occlusion, and strong instance contour deformations. To address
these problems, we present HybridTracker, a lightweight joint tracking
model that attempts to overcome the limitations of a single tracker.
HybridTracker runs a pixel tracker and an instance tracker in parallel to obtain
association matrices, which are fused into a matching matrix. In the
instance tracker, we design a differentiable matching layer that ensures the
stability of inter-frame matching. In the pixel tracker, we compute the Dice
coefficient of the same instance across frames given the estimated
optical flow, forming the Intersection over Union (IoU) matrix. We additionally
propose mutual check and temporal consistency constraints during inference to
handle occlusion and contour deformation. Comprehensive
experiments show that HybridTracker outperforms
state-of-the-art methods on the Cityscapes-VPS and VIPER datasets. |
---|---|
DOI: | 10.48550/arxiv.2203.01217 |
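
The summary describes a pixel-level association matrix built from Dice coefficients, its fusion with an instance-level matrix, and a mutual check at inference. The following is a minimal sketch of that idea and not the authors' implementation. It assumes the previous-frame masks have already been warped to the current frame with optical flow, that fusion is a plain weighted average, and that a fixed score threshold is used. The function names, the `weight` and `min_score` parameters, and the toy data are all hypothetical.

```python
import numpy as np

def dice_matrix(prev_masks, curr_masks, eps=1e-6):
    """Pairwise Dice coefficient between two stacks of binary masks.

    prev_masks: (M, H, W), assumed already warped to the current frame.
    curr_masks: (N, H, W).
    Returns an (M, N) matrix of scores in [0, 1].
    """
    p = prev_masks.reshape(len(prev_masks), -1).astype(np.float64)
    c = curr_masks.reshape(len(curr_masks), -1).astype(np.float64)
    intersection = p @ c.T                          # (M, N) overlap in pixels
    sizes = p.sum(axis=1)[:, None] + c.sum(axis=1)[None, :]
    return 2.0 * intersection / (sizes + eps)

def fuse(pixel_score, instance_score, weight=0.5):
    """Weighted average of two association matrices (an assumed fusion rule)."""
    return weight * pixel_score + (1.0 - weight) * instance_score

def mutual_check(score, min_score=0.5):
    """Keep (i, j) only if each is the other's best match and the score is high enough."""
    best_col = score.argmax(axis=1)   # best current instance for each previous one
    best_row = score.argmax(axis=0)   # best previous instance for each current one
    return [
        (i, int(j))
        for i, j in enumerate(best_col)
        if best_row[j] == i and score[i, j] >= min_score
    ]

# Toy data: two 2x2 squares on a 4x4 grid, listed in swapped order in the current frame.
prev_masks = np.zeros((2, 4, 4), dtype=bool)
prev_masks[0, :2, :2] = True
prev_masks[1, 2:, 2:] = True
curr_masks = prev_masks[::-1].copy()

pixel_score = dice_matrix(prev_masks, curr_masks)
instance_score = np.array([[0.1, 0.9], [0.8, 0.2]])  # made-up embedding similarities
matches = mutual_check(fuse(pixel_score, instance_score))
print(matches)
```

Requiring agreement in both directions means an instance with no reciprocal best match stays unmatched, which is one simple way to avoid forcing an identity onto an occluded or newly appearing object.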