KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation
Abstract: This paper presents a novel Kinematics and Trajectory Prior Knowledge-Enhanced Transformer (KTPFormer), which overcomes a weakness of existing transformer-based methods for 3D human pose estimation: the derivation of the Q, K, V vectors in their self-attention mechanisms is based on simple linear mapping. We propose two prior attention modules, namely Kinematics Prior Attention (KPA) and Trajectory Prior Attention (TPA), which take advantage of the known anatomical structure of the human body and of motion-trajectory information to facilitate effective learning of global dependencies and features in multi-head self-attention. KPA models kinematic relationships in the human body by constructing a kinematic topology, while TPA builds a trajectory topology to learn joint motion trajectories across frames. By yielding Q, K, V vectors enriched with prior knowledge, the two modules enable KTPFormer to model spatial and temporal correlations simultaneously. Extensive experiments on three benchmarks (Human3.6M, MPI-INF-3DHP and HumanEva) show that KTPFormer achieves superior performance compared to state-of-the-art methods. More importantly, our KPA and TPA modules have lightweight plug-and-play designs and can be integrated into various transformer-based networks (e.g., diffusion-based ones) to improve performance with only a very small increase in computational overhead. The code is available at: https://github.com/JihuaPeng/KTPFormer.
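To make the abstract's core idea concrete, below is a minimal PyTorch sketch of a prior-enhanced attention block: input tokens are first propagated over a fixed prior topology (a skeleton graph for the kinematics case, or per-joint temporal links for the trajectory case) before the Q, K, V projection, instead of relying on a plain linear mapping alone. This is a hedged illustration, not the authors' implementation; the class name `PriorAttention`, the graph-convolution-style prior step, and the toy adjacency matrix are all assumptions inferred from the abstract.

```python
import torch
import torch.nn as nn

class PriorAttention(nn.Module):
    """Hypothetical sketch of a prior-knowledge-enhanced self-attention block.

    Tokens are propagated over a fixed prior topology `adjacency`
    (e.g., the human skeleton, or temporal links across frames) via a
    graph-convolution-style step before the Q, K, V projection, so the
    derived Q, K, V carry structural prior knowledge.
    """

    def __init__(self, dim: int, num_heads: int, adjacency: torch.Tensor):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        # Row-normalize the fixed topology so propagation averages neighbors.
        adjacency = adjacency / adjacency.sum(dim=-1, keepdim=True)
        self.register_buffer("adjacency", adjacency)
        self.prior_proj = nn.Linear(dim, dim)  # graph-conv weight
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); tokens = joints (spatial) or frames (temporal)
        B, N, C = x.shape
        # Inject the prior topology before deriving Q, K, V.
        x = x + self.prior_proj(self.adjacency @ x)
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (B, heads, N, head_dim)
        attn = ((q @ k.transpose(-2, -1)) * self.scale).softmax(dim=-1)
        y = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.out(y)

# Toy usage: 17 joints with a chain adjacency as a stand-in skeleton topology.
A = torch.eye(17) + torch.diag(torch.ones(16), 1) + torch.diag(torch.ones(16), -1)
block = PriorAttention(dim=64, num_heads=4, adjacency=A)
y = block(torch.randn(2, 17, 64))  # -> (2, 17, 64)
```

In this sketch the only change from a standard transformer block is the prior-propagation step before the Q, K, V projection, which matches the abstract's point that the modules are lightweight and plug-and-play.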
DOI: 10.48550/arxiv.2404.00658