Video Object Segmentation Using Kernelized Memory Network With Multiple Kernels
Semi-supervised video object segmentation (VOS) is to predict the segment of a target object in a video when a ground truth segmentation mask for the target is given in the first frame. Recently, space-time memory networks (STM) have received significant attention as a promising approach for semi-su...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on pattern analysis and machine intelligence 2023-02, Vol.45 (2), p.2595-2612 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Semi-supervised video object segmentation (VOS) is to predict the segment of a target object in a video when a ground truth segmentation mask for the target is given in the first frame. Recently, space-time memory networks (STM) have received significant attention as a promising approach for semi-supervised VOS. However, an important point has been overlooked in applying STM to VOS: The solution (=STM) is non-local, but the problem (=VOS) is predominantly local. To solve this mismatch between STM and VOS, we propose new VOS networks called kernelized memory network (KMN) and KMN with multiple kernels (KMN^{M} M ). Our networks conduct not only Query-to-Memory matching but also Memory-to-Query matching. In Memory-to-Query matching, a kernel is employed to reduce the degree of non-localness of the STM. In addition, we present a Hide-and-Seek strategy in pre-training to handle occlusions effectively. The proposed networks surpass the state-of-the-art results on standard benchmarks by a significant margin (+4% in \mathcal {J_{M}} JM on DAVIS 2017 test-dev set). The runtimes of our proposed KMN and KMN^{M} M on DAVIS 2016 validation set are 0.12 and 0.13 seconds per frame, respectively, and the two networks have similar computation times to STM. |
---|---|
ISSN: | 0162-8828 1939-3539 2160-9292 |
DOI: | 10.1109/TPAMI.2022.3163375 |