Joint pyramidal perceptual attention and hierarchical consistency constraint for gaze estimation
Published in: | Computer vision and image understanding 2024-11, Vol.248, p.104105, Article 104105 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Eye gaze provides valuable cues about human intent, making gaze estimation a hot topic. Extracting multi-scale information has recently proven effective for gaze estimation in complex scenarios. However, existing methods for estimating gaze based on multi-scale features tend to focus only on information from single-level feature maps. Furthermore, information across different scales may also lack relevance. To address these issues, we propose a novel joint pyramidal perceptual attention and hierarchical consistency constraint (PaCo) for gaze estimation. The proposed PaCo consists of two main components: a pyramidal perceptual attention module (PPAM) and a hierarchical consistency constraint (HCC). Specifically, PPAM first extracts multi-scale spatial features using a pyramid structure, and then aggregates information from coarse granularity to fine granularity. In this way, PPAM enables the model to simultaneously focus on both the eye region and the facial region at multiple scales. Then, HCC enforces consistency between low-level and high-level features, aiming to enhance the gaze semantic consistency between different feature levels. With the combination of PPAM and HCC, PaCo can learn more discriminative features in complex situations. Extensive experimental results show that PaCo achieves significant performance improvements on challenging datasets such as Gaze360, MPIIFaceGaze, and RT-GENE, reducing errors to 10.27°, 3.23°, and 6.46°, respectively.
• We propose a network for gaze estimation with pyramidal attention and consistency. • Pyramidal Perceptual Attention Module extracts multi-scale spatial features. • Hierarchical Consistency Constraint enhances gaze semantic consistency. |
---|---|
ISSN: | 1077-3142 |
DOI: | 10.1016/j.cviu.2024.104105 |
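
Illustrative sketch (not taken from the article): the summary describes coarse-to-fine aggregation of multi-scale spatial features and a consistency constraint between feature levels, and reports errors in degrees. The record does not give the actual PPAM or HCC formulation, so the following pure-Python example only illustrates those general ideas under stated assumptions. The function names, the average-pooling pyramid, the sigmoid gating, and the use of angular disagreement as a consistency penalty are all hypothetical choices. Only the angle between two 3D gaze vectors, the usual way gaze error is expressed in degrees, is a standard definition.

```python
import math

def avg_pool(fmap, k):
    """Average-pool a square 2D list by factor k (side length divisible by k)."""
    n = len(fmap) // k
    return [[sum(fmap[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
             for j in range(n)] for i in range(n)]

def upsample(fmap, k):
    """Nearest-neighbour upsampling of a 2D list by factor k."""
    return [[fmap[i // k][j // k] for j in range(len(fmap[0]) * k)]
            for i in range(len(fmap) * k)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pyramidal_attention(fmap, scales=(4, 2, 1)):
    """Hypothetical stand-in for a pyramidal attention step.

    Pooled context is accumulated from the coarsest scale to the finest,
    then used as a sigmoid gate on the input map (an assumption, not PPAM itself).
    """
    size = len(fmap)
    context = [[0.0] * size for _ in range(size)]
    for k in scales:  # coarse -> fine
        level = upsample(avg_pool(fmap, k), k)
        context = [[c + v for c, v in zip(c_row, l_row)]
                   for c_row, l_row in zip(context, level)]
    n = len(scales)
    return [[x * sigmoid(c / n) for x, c in zip(x_row, c_row)]
            for x_row, c_row in zip(fmap, context)]

def angular_error_deg(g1, g2):
    """Angle in degrees between two non-zero 3D gaze direction vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm = math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding drift
    return math.degrees(math.acos(cos))

def consistency_penalty(gaze_low, gaze_high):
    """Hypothetical consistency term: angular disagreement between gaze
    predictions derived from low-level and high-level features."""
    return angular_error_deg(gaze_low, gaze_high)
```

For example, `angular_error_deg((1, 0, 0), (0, 1, 0))` evaluates the angle between two orthogonal directions, and `consistency_penalty` is zero when both feature levels predict the same gaze direction.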