Lightweight head pose estimation without keypoints based on multi-scale lightweight neural network


Bibliographic Details
Published in: The Visual Computer, 2023-06, Vol. 39 (6), p. 2455-2469
Main Authors: Chen, Xiaolei; Lu, Yubing; Cao, Baoning; Lin, Dongmei; Ahmad, Ishfaq
Format: Article
Language: English
Online Access: Full text
Description
Summary: Head pose estimation methods without facial key points have emerged as a promising research field. However, several challenges remain unsolved. For example, current methods incur high computational costs, require large amounts of memory, and are difficult to deploy in practical applications. To overcome these issues, we propose a lightweight, high-precision head pose estimation method based on a dual-stream convolutional neural network. The network comprises a dual-stream lightweight backbone network, an external attention module, and a soft stagewise regression (SSR) module. The dual-stream lightweight backbone network extracts features from the original image more effectively while keeping the computational overhead low. The external attention module enhances the feature maps extracted by the backbone network and improves feature attention. The SSR module calculates the probability of the head pose in each direction and predicts the head pose by regression. Extensive experiments on the Annotated Facial Landmarks in the Wild (AFLW2000) and Biwi Kinect Head Pose Database (BIWI) datasets demonstrate that the proposed model has fewer parameters and lower estimation errors than recent state-of-the-art head pose estimation methods.
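
The external attention module named in the summary follows the general external-attention idea: two small learnable linear "memory" layers attend over the flattened spatial positions of a backbone feature map. The following is a minimal PyTorch-style sketch of that idea, not the authors' exact module; the residual connection, tensor shapes, and the memory_units size are illustrative assumptions.

    import torch
    import torch.nn as nn

    class ExternalAttention(nn.Module):
        """Sketch of an external-attention block over a (B, C, H, W) feature map."""

        def __init__(self, channels: int, memory_units: int = 64):
            super().__init__()
            self.mk = nn.Linear(channels, memory_units, bias=False)  # key memory
            self.mv = nn.Linear(memory_units, channels, bias=False)  # value memory

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, channels, height, width) feature map from the backbone
            b, c, h, w = x.shape
            feats = x.flatten(2).transpose(1, 2)                  # (b, h*w, c)
            attn = self.mk(feats)                                 # (b, h*w, memory_units)
            attn = attn.softmax(dim=1)                            # normalize over spatial positions
            attn = attn / (attn.sum(dim=2, keepdim=True) + 1e-9)  # second (l1) normalization over memory units
            out = self.mv(attn)                                   # (b, h*w, c)
            return out.transpose(1, 2).reshape(b, c, h, w) + x    # residual connection (assumed)

In a dual-stream design of the kind described, a block like this would typically be applied to the backbone output before the SSR regression head, which would then reduce the attended feature map to per-bin probabilities whose expectation yields the yaw, pitch, and roll angles.
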
ISSN: 0178-2789; 1432-2315
DOI: 10.1007/s00371-023-02781-6