3D Layout encoding network for spatial-aware 3D saliency modelling

Bibliographic details
Published in: IET Computer Vision 2019-08, Vol. 13 (5), p. 480-488
Main authors: Yuan, Jing; Cao, Yang; Kang, Yu; Song, Weiguo; Yin, Zhongcheng; Ba, Rui; Ma, Qing
Format: Article
Language: English
Description
Summary: Three-dimensional (3D) [red, green and blue (RGB) + depth] saliency modelling can benefit popular 3D multimedia applications. However, depth images produced by existing 3D devices are often of low quality, e.g. containing noise and holes. In this study, rather than relying on features or predictions derived directly from single depth images, the authors propose to encode deep layout features to facilitate spatial-aware saliency prediction. Specifically, they first generate coarse depth-induced saliency cues that disregard depth details. Then, to leverage the information in the high-quality RGB image, they embed both low-level and high-level deep RGB features to refine the final prediction. In this way, they take both bottom-up and top-down cues, together with spatial layout, into account and achieve better saliency modelling results. Experiments on five public datasets show the superiority of the proposed method.
ISSN: 1751-9632, 1751-9640
DOI: 10.1049/iet-cvi.2018.5591