3DFCNN: real-time action recognition using 3D deep neural networks with raw depth information

This work describes an end-to-end approach for real-time human action recognition from raw depth image sequences. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from raw depth sequences. The described 3D-CNN classifies actions from the spatially and temporally encoded information in depth sequences. The use of depth data ensures that action recognition preserves people's privacy, since their identities cannot be recognized from these data. The proposed 3DFCNN has been optimized to reach good accuracy while working in real time. It has then been evaluated against other state-of-the-art systems on three widely used public datasets with different characteristics, showing that 3DFCNN outperforms all the non-DNN-based state-of-the-art methods, with a maximum accuracy of 83.6%, and obtains results comparable to the DNN-based approaches while maintaining a much lower computational cost of 1.09 seconds, which significantly increases its applicability in real-world environments.
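To make the abstract's core idea concrete, below is a minimal sketch (in PyTorch) of a 3D fully convolutional network over depth clips. This is not the authors' published 3DFCNN architecture: the layer count, channel widths, pooling schedule, clip size, and class count are all illustrative assumptions; only the general idea (3D convolutions jointly encoding space and time over single-channel depth input, with a fully convolutional classification head) follows the abstract.

# Sketch of a 3D fully convolutional classifier for raw depth clips.
# NOT the published 3DFCNN; layer sizes and num_classes are assumptions.
import torch
import torch.nn as nn

class Depth3DFCNNSketch(nn.Module):
    """Input:  (batch, 1, frames, height, width) raw depth, e.g. (N, 1, 16, 112, 112).
    Output: (batch, num_classes) action logits."""

    def __init__(self, num_classes: int = 10):  # num_classes is an assumption
        super().__init__()
        self.features = nn.Sequential(
            # Each Conv3d kernel spans time (frames) and space (H, W) at once,
            # so spatio-temporal patterns are encoded jointly.
            nn.Conv3d(1, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),   # pool space only at first
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),           # now pool time and space
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),
        )
        # Fully convolutional head: a 1x1x1 convolution plus global average
        # pooling replaces flatten + linear layers, so the network is not
        # tied to a single fixed input resolution.
        self.classifier = nn.Sequential(
            nn.Conv3d(128, num_classes, kernel_size=1),
            nn.AdaptiveAvgPool3d(1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.classifier(self.features(x))
        return x.flatten(1)  # (N, num_classes)

if __name__ == "__main__":
    clip = torch.randn(2, 1, 16, 112, 112)  # two synthetic 16-frame depth clips
    logits = Depth3DFCNNSketch(num_classes=10)(clip)
    print(logits.shape)  # torch.Size([2, 10])

The fully convolutional design matters for the paper's privacy and real-time claims only indirectly, but it does match the "FCNN" in the name: with no fixed-size linear layer, the same weights can process clips of varying spatial and temporal extent.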

Bibliographic details
Published in: Multimedia Tools and Applications, 2022-07, Vol. 81 (17), p. 24119-24143
Main authors: Sánchez-Caballero, Adrián; de López-Diz, Sergio; Fuentes-Jimenez, David; Losada-Gutiérrez, Cristina; Marrón-Romera, Marta; Casillas-Pérez, David; Sarker, Mohammad Ibrahim
Format: Article
Language: English
Subjects: Artificial neural networks; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Human activity recognition; Human motion; Multimedia Information Systems; Neural networks; Object recognition; Real time; Special Purpose and Application-Based Systems
Online access: Full text
DOI: 10.1007/s11042-022-12091-z
ISSN: 1380-7501
EISSN: 1573-7721
Publisher: Springer US, New York
Rights: The Author(s) 2022; published under the Creative Commons Attribution 4.0 license (http://creativecommons.org/licenses/by/4.0/)