TODO-Net: Temporally Observed Domain Contrastive Network for 3-D Early Action Prediction
Early action prediction, which aims to recognize an action's class before the action has been fully performed, is a very challenging task, owing to the insufficient discriminative information caused by the domain gaps among different temporally observed domains. Most of the existing approaches focus...
Published in: | IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol.PP, p.1-12 |
---|---|
Main authors: | Wang, Wenqian; Chang, Faliang; Liu, Chunsheng; Wang, Bin; Liu, Zehao |
Format: | Article |
Language: | eng |
Subjects: | 3-D early action prediction; action recognition; Predictive models; Research and development; Skeleton; Solid modeling; supervised contrastive learning; Task analysis; temporally observed domain; Transformers; Vectors |
Online access: | Order full text |
container_end_page | 12 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE Transactions on Neural Networks and Learning Systems |
container_volume | PP |
creator | Wang, Wenqian; Chang, Faliang; Liu, Chunsheng; Wang, Bin; Liu, Zehao |
description | Early action prediction, which aims to recognize an action's class before the action has been fully performed, is a very challenging task, owing to the insufficient discriminative information caused by the domain gaps among different temporally observed domains. Most of the existing approaches focus on using the fully observed temporal domains to "guide" the partially observed domains, while ignoring the discrepancies between the harder low-observed temporal domains and the easier highly observed temporal domains. Recognition models tend to learn from the easier samples of the highly observed temporal domains, which can lead to significant performance drops on the low-observed temporal domains. Therefore, in this article, we propose a novel temporally observed domain contrastive network, namely TODO-Net, which explicitly mines discriminative information from the hard action samples of the low-observed temporal domains by mitigating the domain gaps among the various temporally observed domains for 3-D early action prediction. More specifically, the proposed TODO-Net mines the relationship between each low-observed sequence and all the highly observed sequences belonging to the same action category to boost recognition performance on hard samples with fewer observed frames. We also introduce a temporal domain conditioned supervised contrastive (TD-conditioned SupCon) learning scheme that empowers TODO-Net to minimize the gaps between temporal domains within the same action category while pushing apart temporal domains belonging to different action classes (a minimal illustrative sketch of such a contrastive objective is given after the record fields below). We conduct extensive experiments on two public 3-D skeleton-based activity datasets, and the results show the efficacy of the proposed TODO-Net. |
doi_str_mv | 10.1109/TNNLS.2024.3394254 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2162-237X; EISSN: 2162-2388; DOI: 10.1109/TNNLS.2024.3394254; PMID: 38743544; CODEN: ITNNAL |
ispartof | IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol.PP, p.1-12 |
issn | 2162-237X; 2162-2388 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TNNLS_2024_3394254 |
source | IEEE Electronic Library (IEL) |
subjects | 3-D early action prediction; action recognition; Predictive models; Research and development; Skeleton; Solid modeling; supervised contrastive learning; Task analysis; temporally observed domain; Transformers; Vectors |
title | TODO-Net: Temporally Observed Domain Contrastive Network for 3-D Early Action Prediction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T05%3A55%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TODO-Net:%20Temporally%20Observed%20Domain%20Contrastive%20Network%20for%203-D%20Early%20Action%20Prediction&rft.jtitle=IEEE%20transaction%20on%20neural%20networks%20and%20learning%20systems&rft.au=Wang,%20Wenqian&rft.date=2024-05-14&rft.volume=PP&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.issn=2162-237X&rft.eissn=2162-2388&rft.coden=ITNNAL&rft_id=info:doi/10.1109/TNNLS.2024.3394254&rft_dat=%3Cproquest_RIE%3E3055452533%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3055452533&rft_id=info:pmid/38743544&rft_ieee_id=10530416&rfr_iscdi=true |
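
The TD-conditioned SupCon scheme described in the abstract is, at its core, a supervised contrastive objective computed over sequences drawn from different temporally observed domains (observation ratios). The sketch below is a minimal, hypothetical illustration of such an objective, assuming a batch mixes embeddings of the same action class observed at different ratios; the function name `supcon_across_ratios`, the temperature `tau`, and the batch layout are assumptions for illustration, and the paper's actual loss additionally conditions on the temporal domain, which is not reproduced here.

```python
# Minimal, hypothetical sketch of a supervised-contrastive-style objective across
# temporal observation ratios, in the spirit of the TD-conditioned SupCon scheme
# described in the abstract. Names, temperature, and batch layout are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F


def supcon_across_ratios(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """features: (N, D) embeddings of skeleton sequences observed at various ratios.
    labels: (N,) action-class labels. Positives are all other samples with the same
    label, regardless of observation ratio; negatives are samples of other classes."""
    z = F.normalize(features, dim=1)                      # compare in cosine-similarity space
    sim = z @ z.t() / tau                                 # (N, N) temperature-scaled similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    # Drop self-similarity from the softmax denominator via a large negative fill.
    sim = sim.masked_fill(self_mask, -1e9)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Average log-probability of the positives for each anchor that has at least one positive.
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    mean_log_prob_pos = (log_prob * pos_mask.float()).sum(dim=1) / pos_counts
    return -mean_log_prob_pos[pos_mask.any(dim=1)].mean()


if __name__ == "__main__":
    # Toy usage: embeddings of the same action observed at, say, 20%, 50%, and 100%
    # share a label, so the loss pulls them together and pushes other classes apart.
    feats = torch.randn(6, 128, requires_grad=True)
    labels = torch.tensor([0, 0, 0, 1, 1, 1])
    loss = supcon_across_ratios(feats, labels)
    loss.backward()
    print(float(loss))
```

Treating low-observed and highly observed versions of the same action class as mutual positives is what pulls the hard, few-frame samples toward the better-separated highly observed embeddings, which is the intuition the abstract attributes to mitigating gaps among temporally observed domains.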