Reliable and Dynamic Appearance Modeling and Label Consistency Enforcing for Fast and Coherent Video Object Segmentation With the Bilateral Grid

Bibliographic details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2020-12, Vol. 30 (12), p. 4781-4795
Main authors: Gui, Yan; Tian, Ying; Zeng, Dao-Jian; Xie, Zhi-Feng; Cai, Yi-Yu
Format: Article
Language: English
Description
Abstract: We propose a novel optimization framework for video object segmentation, given initial annotations of the objects in the keyframes of an input video sequence. In this work, video data is represented by a Markov Random Field model, and segmentation is achieved by finding the label assignment that corresponds to the minimum graph cut. More specifically, we first create a bilateral representation of the input video sequence, which reduces the size of the graph that the min-cut must operate on. We then introduce dynamic appearance models to learn the segmentation likelihoods, and we measure the reliability of these likelihoods to identify false likelihoods that may cause segmentation errors. The model thus accurately describes how the object's appearance changes over time. Furthermore, we augment spatial and temporal connections using a soft higher-order potential, ensuring long-range label consistency in the segmentation. Through an ablation study, we provide extensive analysis and evaluation of the influence of each component of the framework. Experiments on three benchmark datasets (DAVIS 2016, YouTube-Objects and SegTrack v2) show that our method achieves competitive performance compared with state-of-the-art methods while running an order of magnitude faster.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2019.2961267
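
Illustrative sketch: The abstract states that a bilateral representation of the video reduces the size of the graph the min-cut operates on. The Python sketch below illustrates that general idea only and is not the authors' implementation: pixels are binned into a coarse grid over time, position, and color, so a graph cut would need one node per occupied cell instead of one per pixel. The function names, the nearest-cell binning, and the cell sizes (`cell_t`, `cell_xy`, `cell_color`) are assumptions chosen for illustration. The appearance models, the higher-order potential, and the min-cut itself are not shown.

```python
import numpy as np

def splat_to_bilateral_grid(video, cell_t=2, cell_xy=8, cell_color=32):
    """Bin every pixel of a (T, H, W, 3) uint8 video into a 6-D grid cell.

    Returns (vertex_ids, counts): vertex_ids has shape (T, H, W) and gives the
    index of the occupied cell each pixel falls into; counts[i] is the number
    of pixels in occupied cell i. Nearest-cell binning is an assumption here.
    """
    T, H, W, _ = video.shape
    t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
    color = video.astype(np.int64) // cell_color
    # Grid coordinates per pixel: (time, row, column, channel 0, 1, 2).
    coords = (t // cell_t, y // cell_xy, x // cell_xy,
              color[..., 0], color[..., 1], color[..., 2])
    n_color = -(-256 // cell_color)  # ceiling division
    dims = (-(-T // cell_t), -(-H // cell_xy), -(-W // cell_xy),
            n_color, n_color, n_color)
    flat = np.ravel_multi_index([c.ravel() for c in coords], dims)
    # Keep only occupied cells; 'inverse' maps each pixel to its cell index.
    _, inverse, counts = np.unique(flat, return_inverse=True, return_counts=True)
    return inverse.reshape(T, H, W), counts

def slice_labels(vertex_labels, vertex_ids):
    """Copy one label per occupied cell back to every pixel in that cell."""
    return np.asarray(vertex_labels)[vertex_ids]

# Usage: a random 4-frame clip; the graph shrinks from pixels to occupied cells.
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
ids, counts = splat_to_bilateral_grid(clip)
mask = slice_labels(np.zeros(len(counts), dtype=np.uint8), ids)
print(clip.shape[0] * clip.shape[1] * clip.shape[2], "pixels ->", len(counts), "grid vertices")
```

Coarser cells shrink the graph further but let pixels that differ in color or position share a label, so the cell sizes trade speed against boundary accuracy.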