TransPPG: two-stream transformer for remote heart rate estimate

Bibliographic details
Published in: CCF transactions on pervasive computing and interaction (Online), 2024-09, Vol. 6 (3), p. 271-280
Main authors: Kang, Jiaqi; Yang, Su; Zhang, Weishan
Format: Article
Language: English
Online access: Full text
Description
Summary: Non-contact heart rate estimation using remote photoplethysmography (rPPG) has shown great potential in many applications and has achieved creditable results in constrained scenarios. However, practical applications require accurate results even in complex environments. In this paper, we propose a novel video embedding method that embeds each facial video into a feature map referred to as the Multi-scale Adaptive Spatial and Temporal Map with Overlap (MAST_Mop). The map contains not only vital information but also information from the surrounding region, which acts as a mirror for identifying homogeneous perturbations imposed on foreground and background simultaneously, such as illumination instability. Correspondingly, we propose a two-stream Transformer that maps the MAST_Mop to heart rate (HR): one stream follows the pulse signal in the facial area, while the other extracts the perturbation signal from the surrounding region, enabling feature-level subtraction between the two streams. Owing to this context-aware overall feature embedding, the proposed approach outperforms all current state-of-the-art methods on two public datasets, MAHNOB-HCI and VIPL-HR.
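
The subtraction idea in the summary can be illustrated with a toy, signal-level sketch. This is not the paper's method, which subtracts learned Transformer features rather than raw traces. It only assumes that a perturbation such as illumination flicker appears identically in a face trace and a background trace. All names (`estimate_hr_bpm`, `face`, `background`) and all signal parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def estimate_hr_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the strongest frequency within [lo_hz, hi_hz], in beats per minute."""
    centered = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(centered.size, d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]


# Synthetic 20-second traces at 30 frames per second (assumed values).
fps = 30.0
t = np.arange(600) / fps

pulse = 0.05 * np.sin(2 * np.pi * 1.2 * t)  # weak pulse component at 1.2 Hz
illum = 0.50 * np.sin(2 * np.pi * 0.9 * t)  # strong shared perturbation at 0.9 Hz

face = pulse + illum   # facial region: pulse plus perturbation
background = illum     # surrounding region: perturbation only

# Subtract the background trace to cancel the shared perturbation.
cleaned = face - background

hr_raw = estimate_hr_bpm(face, fps)
hr_clean = estimate_hr_bpm(cleaned, fps)
print(f"raw estimate: {hr_raw:.1f} bpm, after subtraction: {hr_clean:.1f} bpm")
```

In this synthetic setup the raw face trace is dominated by the perturbation, so its spectral peak sits at the perturbation frequency, while the subtracted trace recovers the pulse frequency. Real video rarely has a perturbation that matches this exactly across regions, which is one motivation for performing the subtraction at the feature level instead.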
ISSN: 2524-521X, 2524-5228
DOI: 10.1007/s42486-024-00158-9