Salient object detection in egocentric videos


Bibliographic details
Published in: IET Image Processing 2024-06, Vol.18 (8), p.2028-2037
Main authors: Zhang, Hao; Liang, Haoran; Zhao, Xing; Liu, Jian; Liang, Ronghua
Format: Article
Language: English
Description
Abstract: Most research on video salient object detection (VSOD) has focused on third-person videos. This focus overlooks the requirements of first-person tasks such as autonomous driving and robot vision. To bridge this gap, a novel dataset and a camera-movement-based VSOD model, CaMSD, designed specifically for egocentric videos, are introduced. First, the SalEgo dataset, comprising 17,400 fully annotated frames for video salient object detection, is presented. Second, a computational model is proposed that incorporates a camera movement module designed to emulate the patterns observed when humans view videos. Additionally, a saliency enhancement module based on the Squeeze-and-Excitation block is incorporated so that, when attention switches between salient objects, a single salient object is segmented precisely rather than two objects simultaneously. Experimental results show that the approach outperforms other state-of-the-art methods in egocentric video salient object detection tasks. The dataset and code are available at https://github.com/hzhang1999/SalEgo.
ISSN: 1751-9659, 1751-9667
DOI: 10.1049/ipr2.13080
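
Note on the Squeeze-and-Excitation block: the abstract states that the saliency enhancement module builds on this block but gives no implementation details. The following is a minimal sketch of a generic Squeeze-and-Excitation channel gate, not the CaMSD module itself. The function name se_block, the nested-list feature layout (channels x height x width), and the weight values in the demo are assumptions made purely for illustration; a real model would use a tensor library and learned weights.

```python
import math


def se_block(features, w1, b1, w2, b2):
    """Generic Squeeze-and-Excitation gate (illustrative, not the paper's code).

    features: list of C channels, each an H x W list of floats.
    w1, b1:   first fully connected layer, shape (C_reduced x C) and (C_reduced).
    w2, b2:   second fully connected layer, shape (C x C_reduced) and (C).
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]

    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
        for row, b in zip(w1, b1)
    ]

    # Excitation, step 2: fully connected layer followed by a sigmoid,
    # producing one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
        for row, b in zip(w2, b2)
    ]

    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(features, gates)]
    return scaled, gates


if __name__ == "__main__":
    # Two 2x2 channels and hand-picked (hypothetical) weights.
    demo = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
    _, demo_gates = se_block(demo, [[1.0, 0.0]], [0.0], [[10.0], [-10.0]], [0.0, 0.0])
    print(demo_gates)
```

With the demo weights, the first channel receives a gate close to 1 and the second a gate close to 0, which shows how such a gate can emphasize some feature channels and suppress others. How CaMSD uses this mechanism to favor a single salient object is described in the article itself.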