Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

Bibliographic details
Published in: ACM Computing Surveys, 2022-11, Vol. 55 (4), p. 1-41, Article 80
Main authors: Liu, Wu; Bao, Qian; Sun, Yu; Mei, Tao
Format: Article
Language: English
Description
Summary: Estimating human pose from a monocular camera has been an emerging research topic in the computer vision community, with many applications. Recently, benefiting from deep learning, a significant body of research has advanced monocular human pose estimation in both 2D and 3D. Although some works summarize the different approaches, it remains challenging for researchers to gain an in-depth view of how these approaches work from 2D to 3D. In this article, we provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem. First, we comprehensively summarize the 2D and 3D representations of the human body. Then, we summarize the mainstream and milestone approaches for these human body representations since 2014 under unified frameworks. In particular, we provide insightful analyses of the intrinsic connections and the evolution of methods from 2D to 3D pose estimation. Furthermore, we analyze the solutions for challenging cases, such as the lack of data, the inherent ambiguity between 2D and 3D, and complex multi-person scenarios. Next, we summarize the benchmarks, evaluation metrics, and quantitative performance of popular approaches. Finally, we discuss the challenges and reflect on promising directions for future research. We believe this survey will provide readers (researchers, engineers, developers, etc.) with a deep and insightful understanding of monocular human pose estimation.
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3524497