ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation
Format: | Article |
---|---|
Language: | English |
Online access: | Order full text |
Summary: | We introduce ReConvNet, a recurrent convolutional architecture for
semi-supervised video object segmentation that can quickly adapt its features
at inference time to focus on any specific object of interest. Generalizing to
objects never observed during training is known to be hard for supervised
approaches, which would need to be retrained. To tackle this problem, we
propose a more efficient solution that learns spatio-temporal features which
self-adapt to the object of interest via conditional affine transformations.
The approach is simple, can be trained end-to-end, and does not necessarily
require extra training steps at inference time. Our method is competitive on
DAVIS2016 with state-of-the-art approaches that use online fine-tuning, and
outperforms them on DAVIS2017. ReConvNet also shows promising results on the
DAVIS Challenge 2018, placing 10th. |
---|---|
DOI: | 10.48550/arxiv.1806.05510 |
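
The conditional affine transformation described in the summary can be sketched in a few lines. This is a minimal illustration, not the paper's code: the function names, shapes, and fixed coefficients below are our own assumptions. In practice the per-channel scale (`gamma`) and shift (`beta`) would be predicted by a small network conditioned on the first-frame object mask; here they are hard-coded to show the mechanism.

```python
import numpy as np

def affine_modulate(features, gamma, beta):
    """FiLM-style conditional affine modulation: each feature channel c
    is scaled by gamma[c] and shifted by beta[c].
    Shapes: features (C, H, W); gamma, beta (C,)."""
    return gamma[:, None, None] * features + beta[:, None, None]

# Hypothetical conditioning coefficients (would normally be predicted
# from the object of interest, e.g. the annotated first frame).
feats = np.ones((2, 4, 4))          # toy feature map with 2 channels
gamma = np.array([2.0, 0.5])        # per-channel scale
beta = np.array([1.0, -1.0])        # per-channel shift
out = affine_modulate(feats, gamma, beta)
```

Because the modulation is just a per-channel scale and shift, it adds almost no parameters or compute, which is what lets the network adapt to a new object at inference time without fine-tuning.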