Video-Based Multiphysiological Disentanglement and Remote Robust Estimation for Respiration
Published in: | IEEE Transactions on Neural Networks and Learning Systems, 2024-07, Vol. PP, p. 1-12 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Abstract: | Remote, noncontact estimation of respiratory rate from facial video is of considerable research interest, providing valuable priors for health monitoring, clinical diagnosis, and anti-fraud applications. However, existing methods are disturbed by epidermal specular reflections induced by head movements and facial expressions. Furthermore, diffuse reflections of light in the skin-colored subcutaneous tissue, caused by multiple time-varying physiological signals that are independent of breathing, are entangled with the respiratory process itself, which confounds current approaches. To address these issues, this article proposes a novel network for natural-light, video-based remote respiration estimation. Specifically, the model uses a two-stage architecture that performs the vital-sign measurement progressively. The first stage adopts an encoder-decoder structure to re-characterize the facial-motion frame differences of the input video based on the gradient binary state of the respiratory signal during inspiration and expiration. Then, the obtained generative mapping, which is disentangled from various time-varying interferences and is only linearly related to the respiratory state, is combined with the facial appearance in the second stage. To further improve the robustness of the algorithm, we design a targeted long-term temporal attention module and embed it between the two stages to enhance the network's ability to model a breathing cycle that spans a very large number of frames and to mine hidden temporal cues. We train and validate the proposed network on a series of publicly available respiration estimation datasets, and the experimental results demonstrate its competitiveness against state-of-the-art breathing and physiological prediction frameworks. |
---|---|
ISSN: | 2162-237X, 2162-2388 |
DOI: | 10.1109/TNNLS.2024.3424772 |
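
The abstract refers to the "gradient binary state" of the respiratory signal during inspiration and expiration. As a rough illustration only, and not the authors' network, the sketch below derives such a binary label from the sign of the signal's gradient and reads a respiratory rate off the dominant spectral peak. The function names, the 30 fps sampling rate, the 0.1 to 0.5 Hz search band, the mapping of a rising signal to inspiration, and the synthetic sine wave standing in for a real respiration trace are all assumptions made for this example.

```python
import numpy as np


def gradient_binary_state(resp):
    """Return 1 where the signal is rising and 0 elsewhere.

    Treating "rising" as inspiration is an assumption of this sketch.
    """
    grad = np.gradient(np.asarray(resp, dtype=float))
    return (grad > 0).astype(np.int8)


def respiratory_rate_bpm(resp, fs, band=(0.1, 0.5)):
    """Estimate breaths per minute from the strongest FFT peak inside `band` (Hz).

    The default band is an assumed plausible breathing range, not a value
    taken from the article.
    """
    x = np.asarray(resp, dtype=float)
    x = x - x.mean()  # remove the DC offset before the FFT
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    peak_hz = freqs[mask][np.argmax(spectrum[mask])]
    return 60.0 * peak_hz


# Synthetic stand-in for a respiration trace: 0.25 Hz sampled at 30 Hz for 60 s.
fs = 30.0
t = np.arange(0, 60, 1.0 / fs)
resp = np.sin(2 * np.pi * 0.25 * t)

state = gradient_binary_state(resp)
rate = respiratory_rate_bpm(resp, fs)
print(f"estimated rate: {rate:.1f} breaths/min")
```

On real video-derived signals, motion and other physiological components would contaminate both the gradient sign and the spectrum, which is the interference the article's disentanglement stage is described as addressing.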