Attentive and context-aware deep network for saliency prediction on omni-directional images

Bibliographic details
Published in: Digital signal processing, 2022-01, Vol. 120, p. 103289, Article 103289
Main authors: Qing, Chunmei; Zhu, Huansheng; Xing, Xiaofen; Chen, Dongwen; Jin, Jianxiu
Format: Article
Language: English
Online access: Full text
Description
Abstract: Understanding the visual attention of observers viewing omni-directional images has attracted growing interest with the rise of virtual reality applications. In this paper, we propose a novel attentive and context-aware network for saliency prediction on omni-directional images, named ACSalNet. To address the insufficient receptive fields of high-level features, a Deformable Attention Bottleneck (DAB) is first proposed to strengthen the high-level feature extractor and focus the model's limited receptive field on key areas. Then, to reduce the semantic gap between features at different levels and to introduce context-aware information, we design a Context-aware Feature Pyramid Module (CFPM). In the testing phase, to reduce the error that arises from predicting directly on equirectangular images while retaining their integrity, a novel projection method called Multiple Sphere Rotation (MSR) is proposed. Extensive experiments show that the proposed method outperforms state-of-the-art models under different evaluation metrics on public saliency benchmarks.
ISSN: 1051-2004, 1095-4333
DOI: 10.1016/j.dsp.2021.103289
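
The abstract names Multiple Sphere Rotation (MSR) as a test-time projection method but gives no details. The sketch below is only one plausible reading, stated as an assumption: rotate the sphere several times, run the predictor on each rotated equirectangular image, rotate each prediction back, and average. The function names (`rotation_matrix`, `rotate_equirect`, `predict_with_rotations`), the yaw/pitch parameterisation, the nearest-neighbour resampling, and the `predict` callable are all hypothetical and are not taken from the paper.

```python
import numpy as np


def rotation_matrix(yaw, pitch):
    """Rotation about the z axis by `yaw`, then about the y axis by `pitch` (radians)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    return ry @ rz


def rotate_equirect(img, rot):
    """Resample an equirectangular map so its sphere content is rotated by `rot`.

    Uses nearest-neighbour lookup; `img` has shape (H, W) or (H, W, C).
    """
    h, w = img.shape[:2]
    # Pixel centres: longitude in [-pi, pi), latitude in (-pi/2, pi/2).
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit direction vector of every output pixel.
    xyz = np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )
    # Inverse rotation: row-vector product v @ rot equals rot.T applied to v.
    src = xyz @ rot
    src_lon = np.arctan2(src[..., 1], src[..., 0])
    src_lat = np.arcsin(np.clip(src[..., 2], -1.0, 1.0))
    col = np.floor((src_lon + np.pi) / (2.0 * np.pi) * w).astype(int) % w
    row = np.floor((np.pi / 2.0 - src_lat) / np.pi * h).astype(int)
    row = np.clip(row, 0, h - 1)
    return img[row, col]


def predict_with_rotations(img, predict, rotations):
    """Average saliency maps predicted on several rotated copies of `img`.

    `predict` is a stand-in for any model that maps an equirectangular
    image to an (H, W) saliency map.
    """
    acc = np.zeros(img.shape[:2], dtype=np.float64)
    for rot in rotations:
        sal = predict(rotate_equirect(img, rot))
        # Undo the rotation (the transpose is the inverse) before accumulating.
        acc += rotate_equirect(sal, rot.T)
    return acc / len(rotations)
```

A call might look like `predict_with_rotations(image, model_fn, [rotation_matrix(y, 0.0) for y in (0.0, np.pi / 2, np.pi)])`, where `model_fn` is whatever saliency predictor is available. A pure yaw rotation reduces to a horizontal circular shift of the image, which is a convenient sanity check for the resampling code.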