Joint pyramidal perceptual attention and hierarchical consistency constraint for gaze estimation

Bibliographic details
Published in: Computer vision and image understanding, 2024-11, Vol. 248, p. 104105, Article 104105
Main authors: Xia, Haiying; Gong, Zhuolin; Tan, Yumei; Song, Shuxiang
Format: Article
Language: English
Online access: Full text
Description
Summary: Eye gaze provides valuable cues about human intent, making gaze estimation an active research topic. Extracting multi-scale information has recently proven effective for gaze estimation in complex scenarios. However, existing methods for estimating gaze based on multi-scale features tend to focus only on information from single-level feature maps. Furthermore, information across different scales may lack relevance. To address these issues, we propose a novel joint pyramidal perceptual attention and hierarchical consistency constraint (PaCo) for gaze estimation. The proposed PaCo consists of two main components: a pyramidal perceptual attention module (PPAM) and a hierarchical consistency constraint (HCC). Specifically, PPAM first extracts multi-scale spatial features using a pyramid structure, and then aggregates information from coarse granularity to fine granularity. In this way, PPAM enables the model to focus simultaneously on the eye region and the facial region at multiple scales. HCC then imposes a consistency constraint on low-level and high-level features, aiming to enhance the gaze semantic consistency between different feature levels. With the combination of PPAM and HCC, PaCo can learn more discriminative features in complex situations. Extensive experimental results show that PaCo achieves significant performance improvements on challenging datasets such as Gaze360, MPIIFaceGaze, and RT-GENE, reducing errors to 10.27°, 3.23°, and 6.46°, respectively.
• We propose a network for gaze estimation with pyramidal attention and consistency.
• Pyramidal Perceptual Attention Module extracts multi-scale spatial features.
• Hierarchical Consistency Constraint enhances gaze semantic consistency.
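The errors quoted in the summary are given in degrees. The record does not define the metric; the sketch below assumes it is the mean angular error between predicted and ground-truth 3D gaze direction vectors, which is the usual measure in gaze estimation. The function names `angular_error_deg` and `mean_angular_error_deg` are illustrative and are not taken from the paper.

```python
import math


def angular_error_deg(pred, true):
    """Angle in degrees between two 3D gaze direction vectors.

    Assumption: gaze is represented as a non-zero 3D vector; the
    vectors do not need to be normalized beforehand.
    """
    dot = sum(p * t for p, t in zip(pred, true))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_true = math.sqrt(sum(t * t for t in true))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_pred * norm_true)))
    return math.degrees(math.acos(cosine))


def mean_angular_error_deg(preds, trues):
    """Average angular error over paired predictions and ground truths."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, trues)]
    return sum(errors) / len(errors)
```

Under this assumption, a reported value such as 3.23° would mean that predicted gaze directions deviate from the annotated ones by 3.23° on average over the test samples.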
ISSN:1077-3142
DOI:10.1016/j.cviu.2024.104105