Distortion-aware Transformer in 360° Salient Object Detection
Main authors:
Format: Article
Language: eng
Keywords:
Online access: Order full text
Abstract: With the emergence of VR and AR, 360° data attracts increasing attention from the computer vision and multimedia communities. Typically, 360° data is projected into 2D ERP (equirectangular projection) images for feature extraction. However, existing methods cannot handle the distortions that result from the projection, hindering the development of 360°-data-based tasks. Therefore, in this paper, we propose a Transformer-based model called DATFormer to address the distortion problem. We tackle this issue from two perspectives. First, we introduce two distortion-adaptive modules: a Distortion Mapping Module, which guides the model to pre-adapt to distorted features globally, and a Distortion-Adaptive Attention Block, which reduces local distortions on multi-scale features. Second, to exploit the unique characteristics of 360° data, we present a learnable relation matrix and use it as part of the positional embedding to further improve performance. Extensive experiments on three public datasets show that our model outperforms existing 2D SOD (salient object detection) and 360° SOD methods.
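
The distortion the abstract refers to follows directly from equirectangular geometry: every ERP image row has the same pixel width, but the sphere circle it represents shrinks toward the poles. The sketch below uses only the standard ERP mapping (it is not code from the paper) to show how the per-row horizontal stretch grows with latitude.

```python
import numpy as np

def erp_pixel_to_sphere(x, y, width, height):
    """Map an ERP pixel (x, y) to (longitude, latitude).

    Standard equirectangular convention: longitude spans [-pi, pi]
    across the width, latitude spans [pi/2, -pi/2] down the height.
    """
    lon = (x + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (y + 0.5) / height * np.pi
    return lon, lat

def erp_horizontal_stretch(y, height):
    """Horizontal stretch of an ERP row relative to the equator.

    A row at latitude `lat` covers a sphere circle of circumference
    2*pi*cos(lat) but is stored at full image width, so each pixel
    is stretched by a factor of 1 / cos(lat).
    """
    _, lat = erp_pixel_to_sphere(0, y, 1, height)
    return 1.0 / np.cos(lat)

# Rows near the poles are stretched far more than rows near the
# equator; this latitude-dependent distortion is what DATFormer's
# distortion-adaptive modules are designed to counteract.
for y in [256, 128, 32]:
    print(f"row {y}: stretch ~ {erp_horizontal_stretch(y, 512):.2f}")
```

For a 512-row ERP image this prints a stretch of roughly 1.0 at the equator, 1.4 at mid-latitudes, and over 5 near the poles, which is why a model trained on undistorted 2D images degrades on ERP inputs.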
DOI: 10.48550/arxiv.2308.03359
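
The abstract does not detail how the learnable relation matrix enters the positional embedding, so the following is a hypothetical PyTorch sketch of the general idea: a trainable bias over token pairs added to the attention logits. The names and shapes here (RelationBiasedAttention, num_tokens, single-head attention) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class RelationBiasedAttention(nn.Module):
    """Single-head self-attention with a learnable relation matrix.

    One plausible reading of the abstract: a trainable bias over all
    token-pair positions is added to the attention logits, letting the
    model learn 360-specific spatial relations. Head count and the
    parameterisation of the matrix are assumptions, not the paper's.
    """

    def __init__(self, dim, num_tokens):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Learnable relation matrix over token pairs, acting as a
        # positional term in the attention logits.
        self.relation = nn.Parameter(torch.zeros(num_tokens, num_tokens))
        self.scale = dim ** -0.5

    def forward(self, x):  # x: (batch, num_tokens, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        logits = (q @ k.transpose(-2, -1)) * self.scale + self.relation
        attn = logits.softmax(dim=-1)
        return self.proj(attn @ v)

tokens = torch.randn(2, 16, 64)                # toy input
out = RelationBiasedAttention(64, 16)(tokens)  # -> (2, 16, 64)
```

Because the bias is added before the softmax, it behaves like a relative-position term that can up- or down-weight token pairs regardless of content, which is one way a model could account for the fixed latitude structure of ERP images.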