Towards High Performance Human Keypoint Detection

Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination, and scale variance. In this paper, we address this problem from three aspects by devising an efficient network structure, proposing three effective training strategies, and exploiting four useful p...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	International journal of computer vision 2021-09, Vol.129 (9), p.2639-2662
Hauptverfasser:	Zhang, Jing, Chen, Zhe, Tao, Dacheng
Format:	Artikel
Sprache:	eng
Schlagworte:	Accuracy Annotations Artificial Intelligence Benchmarks Computer Imaging Computer Science Computer Science, Artificial Intelligence Context Datasets Human performance Image Processing and Computer Vision Occlusion Pattern Recognition Pattern Recognition and Graphics Science & Technology Source code Technology Training Vision
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination, and scale variance. In this paper, we address this problem from three aspects by devising an efficient network structure, proposing three effective training strategies, and exploiting four useful postprocessing techniques. First, we find that context information plays an important role in reasoning human body configuration and invisible keypoints. Inspired by this, we propose a cascaded context mixer (CCM), which efficiently integrates spatial and channel context information and progressively refines them. Then, to maximize CCM’s representation capability, we develop a hard-negative person detection mining strategy and a joint-training strategy by exploiting abundant unlabeled data. It enables CCM to learn discriminative features from massive diverse poses. Third, we present several sub-pixel refinement techniques for postprocessing keypoint predictions to improve detection accuracy. Extensive experiments on the MS COCO keypoint detection benchmark demonstrate the superiority of the proposed method over representative state-of-the-art (SOTA) methods. Our single model achieves comparable performance with the winner of the 2018 COCO Keypoint Detection Challenge. The final ensemble model sets a new SOTA on this benchmark. The source code will be released at https://github.com/chaimi2013/CCM .
ISSN:	0920-5691 1573-1405
DOI:	10.1007/s11263-021-01482-8