Learning a Robust Part-Aware Monocular 3D Human Pose Estimator via Neural Architecture Search

Bibliographic Details
Published in:	International journal of computer vision 2022, Vol.130 (1), p.56-75
Main authors:	Chen, Zerui, Huang, Yan, Yu, Hongyuan, Wang, Liang
Format:	Article
Language:	English
Subjects:	Ablation; Accuracy; Activities and Shape in 3D; Artificial Intelligence; Body parts; Computer Imaging; Computer Science; Deep learning; Estimation; Human body; Image Processing and Computer Vision; Learning; Methods; Motion; Neural networks; Pattern Recognition; Pattern Recognition and Graphics; Pose estimation; Random variables; Special Issue on Human Pose; Vision
Online access:	Full text

Description
Abstract:	Although most existing monocular 3D human pose estimation methods achieve very competitive performance, they are limited by using the same decoder architecture to estimate heterogeneous human body parts. In this work, we present an approach to build a part-aware 3D human pose estimator that deals better with these heterogeneous body parts. Our proposed method consists of two learning stages: (1) searching for suitable decoder architectures for specific parts and (2) training the part-aware 3D human pose estimator built with these optimized neural architectures. Consequently, our searched model is very efficient and compact and can automatically select a suitable decoder architecture to estimate each human body part. Compared with previous state-of-the-art models built on the ResNet-50 network, our method achieves better performance while reducing parameters by 64.4% and FLOPs (multiply-adds) by 8.5%. We validate the robustness and stability of our searched models through extensive and rigorous ablation experiments. Our method advances state-of-the-art accuracy on both single-person and multi-person 3D human pose estimation benchmarks at an affordable computational cost.
ISSN:	0920-5691, 1573-1405
DOI:	10.1007/s11263-021-01525-0
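
The two learning stages named in the abstract can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the part names, the three candidate decoders, and the use of validation error as the selection rule are hypothetical stand-ins, since this record does not describe the paper's actual search space, search algorithm, or training procedure.

```python
from typing import Callable, Dict, List, Tuple

Features = List[float]
Decoder = Callable[[Features], float]

# Hypothetical candidate decoders standing in for a real architecture search space.
CANDIDATES: Dict[str, Decoder] = {
    "mean": lambda f: sum(f) / len(f),
    "max": lambda f: max(f),
    "first": lambda f: f[0],
}


def search_decoders(
    val_data: Dict[str, List[Tuple[Features, float]]]
) -> Dict[str, str]:
    """Stage 1 (toy): per body part, keep the candidate with the lowest validation MSE."""
    chosen: Dict[str, str] = {}
    for part, samples in val_data.items():
        def mse(name: str) -> float:
            decoder = CANDIDATES[name]
            return sum((decoder(x) - y) ** 2 for x, y in samples) / len(samples)

        chosen[part] = min(CANDIDATES, key=mse)
    return chosen


def build_estimator(chosen: Dict[str, str]) -> Callable[[Features], Dict[str, float]]:
    """Stage 2 (toy): assemble one selected decoder per part on shared features.

    A real system would now train the assembled model; that step is omitted here.
    """
    def estimate(features: Features) -> Dict[str, float]:
        return {part: CANDIDATES[name](features) for part, name in chosen.items()}

    return estimate


# Made-up validation data: "torso" targets follow the mean, "arms" targets the max.
val_data = {
    "torso": [([1.0, 2.0, 3.0], 2.0), ([4.0, 0.0, 2.0], 2.0)],
    "arms": [([1.0, 2.0, 3.0], 3.0), ([4.0, 0.0, 2.0], 4.0)],
}
chosen = search_decoders(val_data)
estimator = build_estimator(chosen)
print(chosen)
print(estimator([0.0, 6.0]))
```

The point of the sketch is only the structure: different parts end up with different decoders on top of the same input features, rather than one decoder shared by all parts.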