Recovering Human Pose and Shape From Through-the-Wall Radar Images

Bibliographic details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2022, Vol. 60, pp. 1-15
Authors: Zheng, Zhijie; Pan, Jun; Ni, Zhikang; Shi, Cheng; Zhang, Diankun; Liu, Xiaojun; Fang, Guangyou
Format: Article
Language: English
Description
Abstract: Although a through-the-wall radar imaging (TWRI) system working in an appropriate frequency band can penetrate nonmetallic obstacles and sense the targets behind them, its low spatial imaging resolution hinders the acquisition of more detailed information, such as human pose and shape. This article discusses a deep-learning-based method for recovering human pose and shape from TWRI images. Inspired by cross-modal learning, the method follows a teacher-student pipeline that avoids the heavy cost of manual labeling. Specifically, a camera is attached to the self-developed radar system to simultaneously capture paired red-green-blue (RGB) images and TWRI images in a scenario without wall occlusion. A pose estimation framework (Hourglass) and a semantic segmentation framework (UNet) serve as the teacher networks, converting the RGB images into pose keypoints and shape masks. Taking inspiration from the topological architecture of these frameworks, a student network, the radar pose shape network (RPSNet), is designed to extract information from the corresponding radar images and predict keypoints and masks close to the teacher outputs. Instead of learning the two single-task objectives independently, multitask learning is introduced to adaptively learn common features. When the method is applied to wall-occluded scenarios, only the radar images are collected and fed into the student network for pose and shape recovery. The advantages of this method over computer-vision-based methods for human recovery are demonstrated in scenarios both without and with wall occlusion.
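The adaptive multitask objective described in the abstract can be illustrated with a minimal sketch. The uncertainty-based weighting below, the plain MSE surrogates for the keypoint and mask losses, and all function names are assumptions for illustration only; they are not taken from the paper's actual implementation.

```python
import numpy as np

def mse(pred, target):
    # Mean-squared error between a prediction and its teacher pseudo-label.
    return float(np.mean((pred - target) ** 2))

def multitask_loss(pose_pred, pose_label, mask_pred, mask_label,
                   log_var_pose=0.0, log_var_mask=0.0):
    """Combine the pose (keypoint) and shape (mask) task losses into one
    objective via learnable log-variance weights, so the relative task
    weighting is adapted during training rather than fixed by hand.
    This is a hypothetical stand-in for RPSNet's multitask objective."""
    l_pose = mse(pose_pred, pose_label)   # student keypoints vs. Hourglass labels
    l_mask = mse(mask_pred, mask_label)   # student masks vs. UNet labels
    # Each task is down-weighted by exp(-log_var) and regularized by +log_var,
    # so neither task can be silenced for free by inflating its variance.
    return (np.exp(-log_var_pose) * l_pose + log_var_pose
            + np.exp(-log_var_mask) * l_mask + log_var_mask)

# Illustrative shapes: 17 2-D keypoints and a 64x64 body mask.
keypoints = np.zeros((17, 2))
mask = np.zeros((64, 64))
loss = multitask_loss(keypoints, keypoints, mask, mask)
```

In training, the two `log_var` terms would be optimized jointly with the network weights, which is what lets the shared features serve both tasks without manual loss balancing.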
ISSN:0196-2892
1558-0644
DOI:10.1109/TGRS.2022.3162333