Learning Saliency Prediction From Sparse Fixation Pixel Map
Main Author: | |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Ground truth for saliency prediction datasets consists of two types of map
data: a fixation pixel map, which records human eye movements on sample
images, and a fixation blob map, generated by applying Gaussian blurring to the
corresponding fixation pixel map. Current saliency approaches perform
prediction by directly regressing the input image pixel-wise into a saliency map
with the fixation blob map as ground truth, yet learning saliency from the
fixation pixel map has not been explored. In this work, we propose a
first-of-its-kind approach for learning saliency prediction from sparse fixation
pixel maps, and a novel loss function for training from such sparse fixations.
We use clustering to extract sparse fixation pixels from the raw fixation pixel
map, and add a max-pooling transformation on the output to avoid false penalties
between sparse outputs and labels caused by nearby but non-overlapping saliency
pixels when calculating the loss. This approach provides a novel perspective on
saliency prediction. We evaluate our approach on multiple benchmark datasets
and achieve competitive performance on multiple metrics compared with
state-of-the-art saliency methods. |
---|---|
DOI: | 10.48550/arxiv.1809.00644 |
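
The summary's point about max-pooling and "nearby but non-overlapping" pixels can be illustrated with a small, dependency-free sketch. This is not the paper's loss function: the squared-error loss, the 3x3 window, pooling both the prediction and the label, and all helper names (`max_pool`, `squared_error`, `point_map`) are assumptions chosen only for illustration.

```python
def max_pool(grid, k=3):
    """Stride-1 max pooling with a k x k window, clipped at the borders."""
    h, w, r = len(grid), len(grid[0]), k // 2
    return [
        [
            max(
                grid[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def squared_error(pred, label):
    """Sum of squared differences; a stand-in loss, not the paper's."""
    return sum(
        (p - t) ** 2
        for pred_row, label_row in zip(pred, label)
        for p, t in zip(pred_row, label_row)
    )


def point_map(h, w, points):
    """Sparse binary map with 1.0 at each (row, col) in points."""
    grid = [[0.0] * w for _ in range(h)]
    for y, x in points:
        grid[y][x] = 1.0
    return grid


label = point_map(5, 5, [(2, 2)])  # one sparse fixation pixel
near = point_map(5, 5, [(2, 3)])   # prediction one pixel off
far = point_map(5, 5, [(0, 0)])    # prediction far from the fixation

# Without pooling, both predictions miss the label pixel equally.
raw_near = squared_error(near, label)
raw_far = squared_error(far, label)

# With pooling (assumed here on both maps), the near prediction overlaps more.
pooled_near = squared_error(max_pool(near), max_pool(label))
pooled_far = squared_error(max_pool(far), max_pool(label))
```

In this toy case the unpooled error is identical for the near and far predictions, while the pooled error is lower for the near one, which is the kind of false penalty the summary describes avoiding.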