Multi-Label Multi-Class Action Recognition With Deep Spatio-Temporal Layers Based on Temporal Gaussian Mixtures
Current action recognition studies enjoy the benefits of two neural network branches, spatial and temporal. This work aims to extend the previous work by introducing a fusion of spatial and temporal branches to provide superior action recognition capability toward multi-label multi-class classification problems. In this paper, we propose three fusion models with different fusion strategies. We first build several efficient temporal Gaussian mixture (TGM) layers to form spatial and temporal branches to learn a set of features. In addition to these branches, we introduce a new deep spatio-temporal branch consisting of a series of TGM layers to learn the features that emerged from the existing branches. Each branch produces a temporal-aware feature that assists the model in understanding the underlying action in a video. To verify the performance of our proposed models, we performed extensive experiments using the well-known MultiTHUMOS benchmarking dataset. The results demonstrate the importance of our proposed deep fusion mechanism, contributing to the overall score while keeping the number of parameters small.
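The abstract centers on temporal Gaussian mixture (TGM) layers stacked into spatial, temporal, and deep spatio-temporal branches. The sketch below is a minimal, illustrative PyTorch take on the general TGM idea (temporal convolution kernels expressed as softmax-weighted mixtures of learnable Gaussians), not the authors' implementation; the class name `TGMLayer` and the parameters `n_out`, `n_gaussians`, and `kernel_length` are assumptions chosen for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TGMLayer(nn.Module):
    """Illustrative temporal Gaussian mixture layer (not the authors' code).

    Each of the `n_out` temporal kernels of length `kernel_length` is a
    softmax-weighted mixture of `n_gaussians` Gaussians with learnable
    centers and widths, so far fewer parameters are trained than for a
    free-form temporal convolution kernel.
    """

    def __init__(self, in_channels, n_out=8, n_gaussians=4, kernel_length=15):
        super().__init__()
        self.in_channels, self.n_out, self.L = in_channels, n_out, kernel_length
        self.centers = nn.Parameter(torch.randn(n_gaussians))     # kernel centers
        self.widths = nn.Parameter(torch.zeros(n_gaussians))      # pre-softplus widths
        self.mix = nn.Parameter(torch.randn(n_out, n_gaussians))  # mixture logits

    def forward(self, x):                                  # x: (batch, in_channels, time)
        t = torch.linspace(-1.0, 1.0, self.L, device=x.device)
        centers = torch.tanh(self.centers)                 # keep centers in [-1, 1]
        widths = F.softplus(self.widths) + 1e-3            # strictly positive widths
        # (n_gaussians, L) Gaussian basis, each row normalized over the window.
        basis = torch.exp(-0.5 * ((t[None, :] - centers[:, None]) / widths[:, None]) ** 2)
        basis = basis / basis.sum(dim=1, keepdim=True)
        # (n_out, L) temporal kernels as convex mixtures of the Gaussian basis.
        kernels = F.softmax(self.mix, dim=1) @ basis
        # Share each kernel across input channels and convolve over time.
        weight = kernels[:, None, :].expand(self.n_out, self.in_channels, self.L) / self.in_channels
        return F.conv1d(x, weight, padding=self.L // 2)    # (batch, n_out, time)


# Hypothetical usage: per-frame CNN features (e.g. 1024-dim) over 64 frames.
x = torch.randn(2, 1024, 64)
y = TGMLayer(1024, n_out=8)(x)   # -> (2, 8, 64)
```

A spatial or temporal branch would stack several such layers over per-frame features; how the three branches are fused and classified per frame is specified in the article itself.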
Saved in:
Published in: | IEEE Access 2020, Vol.8, p.173566-173575 |
---|---|
Main Authors: | Joefrie, Yuri Yudhaswana; Aono, Masaki |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 173575 |
---|---|
container_issue | |
container_start_page | 173566 |
container_title | IEEE access |
container_volume | 8 |
creator | Joefrie, Yuri Yudhaswana; Aono, Masaki |
description | Current action recognition studies enjoy the benefits of two neural network branches, spatial and temporal. This work aims to extend the previous work by introducing a fusion of spatial and temporal branches to provide superior action recognition capability toward multi-label multi-class classification problems. In this paper, we propose three fusion models with different fusion strategies. We first build several efficient temporal Gaussian mixture (TGM) layers to form spatial and temporal branches to learn a set of features. In addition to these branches, we introduce a new deep spatio-temporal branch consisting of a series of TGM layers to learn the features that emerged from the existing branches. Each branch produces a temporal-aware feature that assists the model in understanding the underlying action in a video. To verify the performance of our proposed models, we performed extensive experiments using the well-known MultiTHUMOS benchmarking dataset. The results demonstrate the importance of our proposed deep fusion mechanism, contributing to the overall score while keeping the number of parameters small. |
doi_str_mv | 10.1109/ACCESS.2020.3025931 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2020, Vol.8, p.173566-173575 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2454679143 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Action recognition; Computer architecture; Convolution; Kernel; motion detection; multi-branch network; multi-layer neural network; Neural networks; Optical imaging; Recognition; spatio-temporal branch; Three-dimensional displays; Two dimensional displays; videos |
title | Multi-Label Multi-Class Action Recognition With Deep Spatio-Temporal Layers Based on Temporal Gaussian Mixtures |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T19%3A05%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multi-Label%20Multi-Class%20Action%20Recognition%20With%20Deep%20Spatio-Temporal%20Layers%20Based%20on%20Temporal%20Gaussian%20Mixtures&rft.jtitle=IEEE%20access&rft.au=Joefrie,%20Yuri%20Yudhaswana&rft.date=2020&rft.volume=8&rft.spage=173566&rft.epage=173575&rft.pages=173566-173575&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2020.3025931&rft_dat=%3Cproquest_ieee_%3E2454679143%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2454679143&rft_id=info:pmid/&rft_ieee_id=9203837&rft_doaj_id=oai_doaj_org_article_6cef1e3072d84e7a803095cdc00b79b8&rfr_iscdi=true |