Reconciling Saliency and Object Center-Bias Hypotheses in Explaining Free-Viewing Fixations

Bibliographic details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2016-06, Vol. 27 (6), p. 1214-1226
Main authors: Borji, Ali; Tanner, James
Format: Article
Language: English
Description
Summary: Predicting where people look in natural scenes has attracted considerable interest in computer vision and computational neuroscience over the past two decades. Two seemingly contrasting categories of cues have been proposed to influence where people look: 1) low-level image saliency and 2) high-level semantic information. Our first contribution is to take a detailed look at these cues to confirm the hypothesis, proposed by Henderson and by Nuthmann and Henderson, that observers tend to look at the center of objects. We analyzed free-viewing fixation data from 17 observers on 60 object-annotated images containing various types of objects. The images covered different types of scenes, such as natural scenes, line drawings, and 3-D rendered scenes. Our second contribution is to propose a simple combined model of low-level saliency and object center bias that significantly outperforms each individual component on our data, as well as on the Object and Semantic Images and Eye-tracking data set by Xu et al. The results reconcile the saliency and object center-bias hypotheses and highlight that both types of cues are important in guiding fixations. Our work opens new directions for understanding the strategies that humans use in observing scenes and objects, and demonstrates the construction of combined models of low-level saliency and high-level object-based information.
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2015.2480683