TODO-Net: Temporally Observed Domain Contrastive Network for 3-D Early Action Prediction
Early action prediction, which aims to recognize an action's class before the action has been fully performed, is a very challenging task, owing to the insufficient discriminative information caused by the domain gaps among different temporally observed domains. Most of the existing approaches focus...
Published in: | IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol.PP, p.1-12 |
---|---|
Main authors: | Wang, Wenqian; Chang, Faliang; Liu, Chunsheng; Wang, Bin; Liu, Zehao |
Format: | Article |
Language: | eng |
Subjects: | 3-D early action prediction; action recognition; Predictive models; Research and development; Skeleton; Solid modeling; supervised contrastive learning; Task analysis; temporally observed domain; Transformers; Vectors |
Online access: | Order full text |
container_end_page | 12 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE Transactions on Neural Networks and Learning Systems |
container_volume | PP |
creator | Wang, Wenqian; Chang, Faliang; Liu, Chunsheng; Wang, Bin; Liu, Zehao |
description | Early action prediction, which aims to recognize an action's class before the action has been fully performed, is a very challenging task, owing to the insufficient discriminative information caused by the domain gaps among different temporally observed domains. Most of the existing approaches focus on using the fully observed temporal domains to "guide" the partially observed domains, while ignoring the discrepancies between the harder low-observed temporal domains and the easier highly observed temporal domains. Recognition models tend to learn from the easier samples of the highly observed temporal domains, which can lead to significant performance drops on the low-observed temporal domains. Therefore, in this article, we propose a novel temporally observed domain contrastive network, namely TODO-Net, which explicitly mines discriminative information from the hard action samples of the low-observed temporal domains by mitigating the domain gaps among the various temporally observed domains for 3-D early action prediction. More specifically, the proposed TODO-Net mines the relationship between each low-observed sequence and all the highly observed sequences belonging to the same action category to boost recognition performance on hard samples with fewer observed frames. We also introduce a temporal domain conditioned supervised contrastive (TD-conditioned SupCon) learning scheme that empowers TODO-Net to minimize the gaps between temporal domains within the same action category while pushing apart temporal domains belonging to different action classes (a minimal illustrative sketch of such a contrastive objective is given after the record fields below). We conduct extensive experiments on two public 3-D skeleton-based activity datasets, and the results show the efficacy of the proposed TODO-Net. |
doi_str_mv | 10.1109/TNNLS.2024.3394254 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2162-237X; EISSN: 2162-2388; DOI: 10.1109/TNNLS.2024.3394254; PMID: 38743544; CODEN: ITNNAL |
ispartof | IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol.PP, p.1-12 |
issn | 2162-237X; 2162-2388 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TNNLS_2024_3394254 |
source | IEEE Electronic Library (IEL) |
subjects | 3-D early action prediction; action recognition; Predictive models; Research and development; Skeleton; Solid modeling; supervised contrastive learning; Task analysis; temporally observed domain; Transformers; Vectors |
title | TODO-Net: Temporally Observed Domain Contrastive Network for 3-D Early Action Prediction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T05%3A55%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TODO-Net:%20Temporally%20Observed%20Domain%20Contrastive%20Network%20for%203-D%20Early%20Action%20Prediction&rft.jtitle=IEEE%20transaction%20on%20neural%20networks%20and%20learning%20systems&rft.au=Wang,%20Wenqian&rft.date=2024-05-14&rft.volume=PP&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.issn=2162-237X&rft.eissn=2162-2388&rft.coden=ITNNAL&rft_id=info:doi/10.1109/TNNLS.2024.3394254&rft_dat=%3Cproquest_RIE%3E3055452533%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3055452533&rft_id=info:pmid/38743544&rft_ieee_id=10530416&rfr_iscdi=true |
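
The TD-conditioned SupCon scheme described in the abstract is, at its core, a supervised contrastive objective computed over sequences drawn from different temporally observed domains (observation ratios). The sketch below is a minimal, hypothetical illustration of such an objective, assuming a batch mixes embeddings of the same action class observed at different ratios; the function name `supcon_across_ratios`, the temperature `tau`, and the batch layout are assumptions for illustration, and the paper's actual loss additionally conditions on the temporal domain, which is not reproduced here.

```python
# Minimal, hypothetical sketch of a supervised-contrastive-style objective across
# temporal observation ratios, in the spirit of the TD-conditioned SupCon scheme
# described in the abstract. Names, temperature, and batch layout are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F


def supcon_across_ratios(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """features: (N, D) embeddings of skeleton sequences observed at various ratios.
    labels: (N,) action-class labels. Positives are all other samples with the same
    label, regardless of observation ratio; negatives are samples of other classes."""
    z = F.normalize(features, dim=1)                      # compare in cosine-similarity space
    sim = z @ z.t() / tau                                 # (N, N) temperature-scaled similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    # Drop self-similarity from the softmax denominator via a large negative fill.
    sim = sim.masked_fill(self_mask, -1e9)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Average log-probability of the positives for each anchor that has at least one positive.
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    mean_log_prob_pos = (log_prob * pos_mask.float()).sum(dim=1) / pos_counts
    return -mean_log_prob_pos[pos_mask.any(dim=1)].mean()


if __name__ == "__main__":
    # Toy usage: embeddings of the same action observed at, say, 20%, 50%, and 100%
    # share a label, so the loss pulls them together and pushes other classes apart.
    feats = torch.randn(6, 128, requires_grad=True)
    labels = torch.tensor([0, 0, 0, 1, 1, 1])
    loss = supcon_across_ratios(feats, labels)
    loss.backward()
    print(float(loss))
```

Treating low-observed and highly observed versions of the same action class as mutual positives is what pulls the hard, few-frame samples toward the better-separated highly observed embeddings, which is the intuition the abstract attributes to mitigating gaps among temporally observed domains.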