KD-Former: Kinematic and dynamic coupled transformer network for 3D human motion prediction


Bibliographic details
Published in: Pattern Recognition, 2023-11, Vol. 143, p. 109806, Article 109806
Main authors: Dai, Ju; Li, Hao; Zeng, Rui; Bai, Junxuan; Zhou, Feng; Pan, Junjun
Format: Article
Language: English
Online access: Full text
Description
Summary:
• We propose a novel non-autoregressive Kinematic and Dynamic coupled transFormer (KD-Former) network for 3D human motion prediction.
• Our KD-Former leverages the complementary characteristics of motion kinematics and dynamics to improve performance.
• We formulate a simplified reduced-order dynamic algorithm that substantially improves computational efficiency and reduces prediction error.
• Extensive experiments on the Human3.6M and CMU MoCap datasets demonstrate the superior performance of our method.

Recent studies have made remarkable progress on 3D human motion prediction by describing motion with kinematic knowledge. However, kinematics considers only the 3D positions or rotations of human skeletons and fails to reveal the physical characteristics of human motion. Motion dynamics reflects the forces between joints and explicitly encodes the skeleton topology, yet it is rarely exploited in motion prediction. In this paper, we propose the Kinematic and Dynamic coupled transFormer (KD-Former), which combines dynamics with kinematics to learn powerful features for high-fidelity motion prediction. Specifically, we first formulate a reduced-order dynamic model of the human body to calculate the forces of all joints. We then construct a non-autoregressive encoder-decoder framework based on the transformer structure. The encoder consists of a kinematic encoder and a dynamic encoder, which extract the kinematic and dynamic features, respectively, of the given history sequences via a spatial transformer and a temporal transformer. The decoder decodes the future query sequences in parallel, using the encoded kinematic and dynamic information of the history sequences. Experiments on the Human3.6M and CMU MoCap benchmarks verify the effectiveness and superiority of our method. Code will be available at: https://github.com/wslh852/KD-Former.git.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2023.109806
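
Illustration: the summary distinguishes kinematic input (3D joint positions) from dynamic input (forces at the joints). The sketch below shows one simple way a force signal could be derived from a position sequence. It is not the reduced-order dynamic model of the paper, whose details this record does not give. It assumes each joint is an independent point mass, ignores skeleton topology and gravity, and estimates acceleration by a central finite difference. The function name `joint_forces` and the parameters `masses` and `dt` are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def joint_forces(
    positions: List[List[Vec3]],  # positions[t][j]: 3D position of joint j at frame t
    masses: List[float],          # assumed point mass per joint (hypothetical values)
    dt: float,                    # time between consecutive frames, in seconds
) -> List[List[Vec3]]:
    """Estimate the net force on each joint for frames 1 .. T-2.

    Assumption: every joint is treated as an independent point mass, so
    force = mass * acceleration, with acceleration taken from the central
    second difference (p[t+1] - 2*p[t] + p[t-1]) / dt**2.
    """
    if len(positions) < 3:
        raise ValueError("need at least three frames for a second difference")
    if dt <= 0:
        raise ValueError("dt must be positive")

    forces: List[List[Vec3]] = []
    for t in range(1, len(positions) - 1):
        frame: List[Vec3] = []
        for j, mass in enumerate(masses):
            prev, cur, nxt = positions[t - 1][j], positions[t][j], positions[t + 1][j]
            # Per-axis force: mass times the finite-difference acceleration.
            force = tuple(
                mass * (n - 2.0 * c + p) / (dt * dt)
                for p, c, n in zip(prev, cur, nxt)
            )
            frame.append(force)
        forces.append(frame)
    return forces
```

A model in the spirit of the summary would feed the position sequence to a kinematic encoder and a force sequence of this kind to a dynamic encoder. How the two are combined in KD-Former is described in the full text, not here.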