Synergetic reconstruction from 2D pose and 3D motion for wide-space multi-person video motion capture in the wild
Published in: | Image and Vision Computing 2020-12, Vol.104, p.104028, Article 104028 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Although many studies have investigated markerless motion capture, the technology has not yet been applied to real sports or concerts. In this paper, we propose a markerless motion capture method that achieves spatiotemporal accuracy and smoothness from multiple cameras in wide-space, multi-person environments. The proposed method predicts each person's 3D pose and determines a sufficiently small bounding box for that person in each camera image. This prediction, together with spatiotemporal filtering based on a human skeletal model, enables highly accurate 3D reconstruction of each person. The accurate 3D reconstruction is then used to predict the bounding box in each camera image for the next frame. This feedback from 3D motion to 2D pose has a synergetic effect on the overall performance of video motion capture. We evaluated the proposed method on various datasets and on a real sports field. The experimental results show a mean per joint position error (MPJPE) of 31.5 mm and a percentage of correct parts (PCP) of 99.5% for five people moving dynamically, while satisfying the range of motion (RoM). A video demonstration, datasets, and additional materials are posted on our project page.
• A method to realize multi-person motion capture was proposed.
• The proposed method works even in a wide field using cameras with different fields of view placed at a single viewpoint.
• The proposed method achieved 31.5 mm in MPJPE and 99.5% in PCP in an environment where 5 people move dynamically while satisfying RoM.
• With the proposed method, all players' detailed motions in a futsal game were acquired only from a few cameras. |
---|---|
ISSN: | 0262-8856 1872-8138 |
DOI: | 10.1016/j.imavis.2020.104028 |
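
The summary reports accuracy as MPJPE and describes feeding the 3D pose back into a per-camera bounding box for the next frame. The following is a minimal sketch of those two ideas, not the authors' implementation. It assumes a pinhole camera given as a 3x4 projection matrix `P` and a 3D pose that has already been predicted for the next frame. The function names and the `margin` parameter are hypothetical.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per joint position error: average Euclidean distance
    between predicted and ground-truth joints (same units as input)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def project_points(P, joints_3d):
    """Project (J, 3) world points to (J, 2) pixel coordinates using an
    assumed 3x4 pinhole projection matrix P."""
    joints_3d = np.asarray(joints_3d, dtype=float)
    ones = np.ones((joints_3d.shape[0], 1))
    uvw = np.hstack([joints_3d, ones]) @ np.asarray(P, dtype=float).T
    return uvw[:, :2] / uvw[:, 2:3]


def bbox_from_pose(P, joints_3d, margin=0.2):
    """Bounding box [x_min, y_min, x_max, y_max] around the projected
    joints, enlarged on each side by `margin` times the box size
    (margin is an illustrative choice, not a value from the paper)."""
    uv = project_points(P, joints_3d)
    lo, hi = uv.min(axis=0), uv.max(axis=0)
    pad = margin * (hi - lo)
    return np.concatenate([lo - pad, hi + pad])
```

In a multi-camera setup, `bbox_from_pose` would be called once per camera with that camera's `P`, so that 2D pose estimation in the next frame only runs inside a small crop around each person.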