Contrast-Phys+: Unsupervised and Weakly-Supervised Video-Based Remote Physiological Measurement via Spatiotemporal Contrast

Video-based remote physiological measurement utilizes facial videos to measure the blood volume change signal, which is also called remote photoplethysmography (rPPG). Supervised methods for rPPG measurements have been shown to achieve good performance. However, the drawback of these methods is that they require facial videos with ground truth (GT) physiological signals, which are often costly and difficult to obtain. In this paper, we propose Contrast-Phys+, a method that can be trained in both unsupervised and weakly-supervised settings. We employ a 3DCNN model to generate multiple spatiotemporal rPPG signals and incorporate prior knowledge of rPPG into a contrastive loss function. We further incorporate the GT signals into contrastive learning to adapt to partial or misaligned labels. The contrastive loss encourages rPPG/GT signals from the same video to be grouped together, while pushing those from different videos apart. We evaluate our methods on five publicly available datasets that include both RGB and Near-infrared videos. Contrast-Phys+ outperforms the state-of-the-art supervised methods, even when using partially available or misaligned GT signals, or no labels at all. Additionally, we highlight the advantages of our methods in terms of computational efficiency, noise robustness, and generalization.
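The contrastive idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: Contrast-Phys+ trains a 3DCNN end-to-end, which is omitted here, and the function names and the MSE-based spectral distance below are illustrative assumptions. The sketch only shows the core principle — power spectra of rPPG samples from the same video form positive pairs (pulled together), while samples from different videos form negative pairs (pushed apart).

```python
import numpy as np

def normalized_psd(signal):
    """Normalized power spectrum of a 1-D signal (phase-invariant)."""
    spec = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    return spec / (spec.sum() + 1e-8)

def spatiotemporal_contrast_loss(signals_a, signals_b):
    """Toy contrastive loss: mean squared PSD distance between same-video
    pairs (to be minimized) minus the distance between cross-video pairs
    (to be maximized). Lower values mean better contrast."""
    psds_a = [normalized_psd(s) for s in signals_a]
    psds_b = [normalized_psd(s) for s in signals_b]
    pos, neg = [], []
    for group in (psds_a, psds_b):      # positive pairs: same video
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                pos.append(np.mean((group[i] - group[j]) ** 2))
    for pa in psds_a:                   # negative pairs: different videos
        for pb in psds_b:
            neg.append(np.mean((pa - pb) ** 2))
    return float(np.mean(pos) - np.mean(neg))

# Synthetic "rPPG" samples: video A pulses at 1.2 Hz, video B at 2.0 Hz
fs = 30.0
t = np.arange(300) / fs
video_a = [np.sin(2 * np.pi * 1.2 * t), np.sin(2 * np.pi * 1.2 * t + 0.4)]
video_b = [np.sin(2 * np.pi * 2.0 * t), np.sin(2 * np.pi * 2.0 * t + 0.4)]
loss = spatiotemporal_contrast_loss(video_a, video_b)
```

Because the two synthetic videos pulse at different rates, the cross-video spectral distance dominates and the loss is negative, mirroring how the real method separates videos by heart rate without labels.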

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024-08, Vol. 46 (8), pp. 5835-5851
Authors: Sun, Zhaodong; Li, Xiaobai
Format: Article
Language: English
Online access: Full text
DOI: 10.1109/TPAMI.2024.3367910
ISSN: 0162-8828, 1939-3539
EISSN: 1939-3539, 2160-9292
PMID: 38376970
Source: IEEE Electronic Library (IEL)
Subjects: Biomedical measurement; Blood volume; contrastive learning; face video; Faces; Infrared imaging; Labels; Photoplethysmography; Physiology; Remote photoplethysmography; Self-supervised learning; semi-supervised learning; Training; unsupervised learning; Video; Videos; weakly-supervised learning