mmHPE: Robust Multiscale 3-D Human Pose Estimation Using a Single mmWave Radar
Published in: IEEE Internet of Things Journal, 2025-01, Vol. 12 (1), p. 1032
Main authors: , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Human pose estimation (HPE) is now widely used in many application areas. The current mainstream vision-based methods suffer from privacy leakage and depend on lighting conditions. To achieve a more privacy-preserving and pervasive approach to HPE, recent studies have implemented 3-D HPE using commodity radio frequency (RF) signals. However, RF-based HPE faces issues such as limited resolution and complex data processing, which make it challenging to extract and utilize multiscale human activity features. In this article, we propose mmHPE, a novel approach to detect and reconstruct 3-D human posture in multiscale scenarios using a single millimeter-wave radar. mmHPE consists of three main parts. First, we develop a 3-D target detection network (TDN) with an optimized loss function to strengthen its 3-D target bounding box (BBox) detection in radar 3-D space. Next, we propose an enhanced point cloud generator (EPCG) algorithm that uses the 3-D target BBox to generate a stable and accurate point cloud of the target. Finally, we design a multiscale coarse-to-fine HPE network (CFN) that proceeds from approximate to precise estimation to reconstruct a 3-D skeleton from the point cloud data. Extensive experiments demonstrate that our method surpasses other methods for 3-D human pose reconstruction in multiscale scenes, with an average error of 4.50 cm. Our method is robust enough to accurately estimate the target pose even in occluded or low-light scenes.
ISSN: 2327-4662
DOI: 10.1109/JIOT.2024.3476350
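
The abstract describes a three-stage pipeline: a 3-D target detection network (TDN) that localizes the person as a 3-D bounding box, an enhanced point cloud generator (EPCG) that builds a stable point cloud conditioned on that box, and a coarse-to-fine network (CFN) that regresses the 3-D skeleton from the point cloud. Below is a minimal PyTorch sketch of how such a pipeline could be wired together; all layer sizes, the box-filtering rule in `epcg`, and the joint count are illustrative assumptions, not the paper's actual TDN/EPCG/CFN designs, loss function, or training procedure.

```python
# Minimal sketch of a three-stage mmHPE-style pipeline (TDN -> EPCG -> CFN).
# All shapes, layer sizes, and the box-cropping rule are illustrative assumptions;
# the paper's actual architectures and optimized loss are not reproduced here.
import torch
import torch.nn as nn


class TDN(nn.Module):
    """3-D target detection network: radar voxel grid -> one 3-D bounding box."""
    def __init__(self, in_channels: int = 1):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
        # Predict box centre (x, y, z) and size (w, h, d).
        self.head = nn.Linear(32, 6)

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(voxels))


def epcg(points: torch.Tensor, bbox: torch.Tensor, n_out: int = 256) -> torch.Tensor:
    """Enhanced point cloud generator (simplified): keep radar points inside the
    detected box and resample to a fixed size. The real EPCG algorithm is more
    involved; this only illustrates box-conditioned point cloud generation."""
    centre, size = bbox[:3], bbox[3:]
    inside = (points - centre).abs() <= size / 2        # (N, 3) per-axis mask
    kept = points[inside.all(dim=-1)]
    if kept.shape[0] == 0:                              # fall back to raw points
        kept = points
    idx = torch.randint(0, kept.shape[0], (n_out,))     # resample to n_out points
    return kept[idx]


class CFN(nn.Module):
    """Coarse-to-fine HPE network: point cloud -> refined 3-D skeleton."""
    def __init__(self, n_joints: int = 17):
        super().__init__()
        self.n_joints = n_joints
        self.point_feat = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 128))
        self.coarse = nn.Linear(128, n_joints * 3)                 # approximate skeleton
        self.fine = nn.Linear(128 + n_joints * 3, n_joints * 3)    # residual refinement

    def forward(self, cloud: torch.Tensor) -> torch.Tensor:
        feat = self.point_feat(cloud).max(dim=0).values            # order-invariant pooling
        coarse = self.coarse(feat)
        fine = coarse + self.fine(torch.cat([feat, coarse]))
        return fine.view(self.n_joints, 3)


if __name__ == "__main__":
    voxels = torch.randn(1, 1, 32, 32, 32)   # toy radar voxel grid
    points = torch.randn(1024, 3)            # toy raw radar point cloud
    bbox = TDN()(voxels)[0]
    cloud = epcg(points, bbox)
    skeleton = CFN()(cloud)
    print(skeleton.shape)                    # torch.Size([17, 3])
```

In this sketch the TDN output only gates which raw points the EPCG keeps, and the CFN's fine head predicts a residual on top of the coarse skeleton, mirroring the coarse-to-fine idea at a very high level.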